blog.humaneguitarist.org
pixelation: custom XSLT functions with Python and lxml
[Fri, 02 Nov 2012 21:28:57 +0000]
I'll be brief. Because the Python "lxml" module doesn't support XSLT 2.0 functions, I was looking at its support for EXSLT [http://www.exslt.org/] ... but then stumbled on how to write my own functions in Python and call them from stylesheets. Freakin' cool. I like calling it "pxslt" for "Python XSLT" and pronouncing it like "pixelate". :P

Below are the "module" I made, the script that calls it, and the results. Told you I'd be brief.

Module:
#pxslt.py
# NOTE: this is Python 2 code (urllib.urlopen, func_name).

def underscore(context, word):
    '''Replace whitespace with underscore.'''
    out = word[0].replace(' ', '_')
    return out

def multiply(context, int_val, int2_val):
    '''Multiply two integers.'''
    int_val, int2_val = int(int_val[0]), int(int2_val[0])
    return int_val * int2_val

def libraryThing(context, isbn):
    '''Get language for a work based on ISBN using LibraryThing API.'''
    isbn = isbn[0]
    import urllib
    res = urllib.urlopen('http://www.librarything.com/api/thingLang.php?isbn=' + isbn)
    res_r = res.read()
    return res_r

##### DO NOT EDIT
##### makes it possible to call the above functions with XSLT
def pxslt():
    # collect every function defined in this module, except pxslt() itself
    myFunctions = []
    gbs = globals()
    from inspect import isfunction
    for gb in gbs:
        if isfunction(gbs[gb]) and gb != 'pxslt':
            #print gb
            myFunctions.append(gbs[gb])

    from lxml import etree
    #see: http://lxml.de/extensions.html
    ns = etree.FunctionNamespace('file://libs/pxslt.py')
    ns.prefix = 'pxslt'
    for myFunction in myFunctions:
        name = str(myFunction.func_name)  # Python 2 only; Python 3 uses __name__
        ns[name] = myFunction
    return ns
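The pxslt() helper above registers whatever functions it finds in the module's globals. Here is a minimal sketch of why that can surprise you: as soon as a module imports a function at top level, a globals scan picks it up too. The helper name collect_all and the basename import are only for illustration and are not part of the module.

```python
from inspect import isfunction
from os.path import basename  # imported helper, not meant to be an XSLT function


def underscore(context, word):
    """Replace spaces with underscores in the first item of the argument."""
    return word[0].replace(" ", "_")


def collect_all(namespace):
    """Mimic the globals scan: return the name of every plain function found."""
    return sorted(name for name, obj in namespace.items() if isfunction(obj))


found = collect_all(globals())
print(found)  # includes 'underscore', but also 'basename' and the other helpers
```

Passing an explicit list of functions avoids registering anything by accident.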
Usage example:
from lxml import etree

#####
myXML = etree.XML('''\
<a>
  <b>Hello. This will appear with whitespaces replaced by underscores.</b>
  <c>3</c>
</a>''')

myXSL = etree.XSLT(etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pxslt="file://libs/pxslt.py">
  <xsl:output method="text" version="1.0" />
  <xsl:template match="a">
    <xsl:variable name="isbn">9955081260</xsl:variable>
    <xsl:value-of select="pxslt:libraryThing($isbn)" />
    <xsl:text>\n</xsl:text> <!-- Python will line break here -->
    <xsl:value-of select="pxslt:underscore(b/text())" />
    <xsl:text>\n</xsl:text> <!-- Python will line break here -->
    <xsl:call-template name="mathFunc">
    </xsl:call-template>
  </xsl:template>
  <xsl:template name="mathFunc">
    <xsl:variable name="myNum">10</xsl:variable>
    <xsl:value-of select="pxslt:multiply(c/text(), $myNum)" />
  </xsl:template>
</xsl:stylesheet>'''))

import pxslt
pxslt.pxslt() #get all set up with namespaces and function stuff
print(myXSL(myXML))

#myXSL_file = etree.XSLT(etree.parse('foo.xsl')) #for testing with a real XSL file
#print(myXSL_file(myXML))
Output:
>>
lit
Hello._This_will_appear_with_whitespaces_replaced_by_underscores.
30
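The last two lines of that output can be sanity-checked without any stylesheet, since the custom functions are ordinary Python callables. This is a rough sketch that assumes the calling convention the module's functions rely on (a context object first, then arguments indexed with [0]). Passing None as the context is only a stand-in for direct testing, because neither function uses it.

```python
def underscore(context, word):
    """Replace spaces with underscores in the first item of the argument."""
    return word[0].replace(" ", "_")


def multiply(context, x, y):
    """Multiply the first items of two arguments as integers."""
    return int(x[0]) * int(y[0])


# Hypothetical direct calls: None stands in for the XPath context, and each
# argument is wrapped in a list so that the [0] indexing works.
text = "Hello. This will appear with whitespaces replaced by underscores."
underscored = underscore(None, [text])
product = multiply(None, ["3"], ["10"])

print(underscored)
print(product)
```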
___________________________________________________________________________
Updated example for Python 3.6, per Sathish's comment of November 2019.

Module:
#!/usr/bin/python3
# pxslt.py

import os
from lxml import etree

NS_URI = "file://" + os.path.basename(__file__)

# For info on custom functions in lxml, see: http://lxml.de/extensions.html
def pxslt(funks):
    """ @funks is a list of functions. """
    namespace = None
    for funk in funks:
        namespace = etree.FunctionNamespace(NS_URI)
        namespace.prefix = "pxslt"
        # register each function under its own name
        namespace[funk.__name__] = funk
    return namespace
Usage Example:
#!/usr/bin/python3

import pxslt
from lxml import etree

# create some custom functions to use in XSLT.
def underscore(context, word):
    return word[0].replace(" ", "_")

def multiply(context, x, y):
    return int(x[0]) * int(y[0])

# create some XML to alter via XSLT.
myXML = """\
<a>
  <b>Hello. This will appear with whitespaces replaced by underscores.</b>
  <c>3</c>
</a>"""
XML = etree.XML(myXML)

# create some XSLT that calls the custom functions.
myXSL = """\
<xsl:stylesheet version="1.0" xmlns:pxslt="{NS_URI}" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" version="1.0" />
  <xsl:template match="a">
    <xsl:value-of select="pxslt:underscore(b/text())" />
    <xsl:text>\n</xsl:text> <!-- Python will line break here -->
    <xsl:call-template name="mathFunc">
    </xsl:call-template>
  </xsl:template>
  <xsl:template name="mathFunc">
    <xsl:variable name="myNum">10</xsl:variable>
    <xsl:value-of select="pxslt:multiply(c/text(), $myNum)" />
  </xsl:template>
</xsl:stylesheet>""".format(NS_URI=pxslt.NS_URI)
XSL = etree.XSLT(etree.XML(myXSL))

# update etree so custom functions are available.
pxslt.pxslt([underscore, multiply])
print(XSL(XML))
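The lxml extensions page linked in the module also describes registering functions by using the FunctionNamespace object as a decorator. The following is a minimal sketch of that style, not a tested replacement for the module. It assumes a reasonably recent lxml release that supports the decorator form, and the namespace URI is a made-up value for the demo.

```python
from lxml import etree

# Hypothetical namespace URI for this sketch; any URI works as long as the
# stylesheet declares the same one.
NS_URI = "urn:example:pxslt-demo"

ns = etree.FunctionNamespace(NS_URI)


@ns  # registers the function under its own name, "underscore"
def underscore(context, word):
    return word[0].replace(" ", "_")


stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pxslt="{ns_uri}">
  <xsl:output method="text"/>
  <xsl:template match="/a">
    <xsl:value-of select="pxslt:underscore(b/text())"/>
  </xsl:template>
</xsl:stylesheet>""".format(ns_uri=NS_URI))

transform = etree.XSLT(stylesheet)
result = str(transform(etree.XML("<a><b>Hello world</b></a>")))
print(result)
```

On older lxml versions without decorator support, the dictionary-style assignment used in the module (namespace[funk.__name__] = funk) does the same job.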
COMMENTS
Thanks nitin for prompt response :-)
Hi Sathish, this was for Python 2; "func_name" is deprecated. I've updated the code and example for Python 3.6. The old module code was gobbling up functions from globals, and that's a terrible idea. The new demo code now makes you pass in a list of functions. Regardless, you really shouldn't use this code in production. It's better to refer to the usage in https://lxml.de/extensions.html - note the use of decorators.
Hi Nitin, this is exactly what I am looking for... I am new to Python... will this code work with Python 3.6 as well? I tried your code exactly as is, and it gave me the following error:
name = str(myFunction.func_name)
AttributeError: 'function' object has no attribute 'func_name'
Then I removed the func_name and tried again, but didn't get any result. It gives this message:
print(myXSL(myXML))
File "src/lxml/xslt.pxi", line 600, in lxml.etree.XSLT.__call__
lxml.etree.XSLTApplyError: XPath evaluation returned no result.
Please advise what's going wrong with copy + paste + execute of your code.
Hi Rick, I don't use XSLT that much anymore, but if 2.0 is really needed, there's always calling something like Saxon via Java/command line if that's acceptable for your workflow.
This is fascinating stuff. I love XSLT and XPath and am relatively new to Python. I have gotten spoiled by XSLT/XPath 2.0 and am disappointed that lxml only supports 1.0. But I like the fact that I can "augment" 1.0 by calling Python functions. I am looking forward to playing with this and seeing if I can make it work in my scripts. I hope XSLT/XPath 2.0 support gets added to lxml or some other Python library. I need to do some fancy grouping and I sure would like to use 2.0's grouping features.