I wrote a stylesheet to do this in XSLT 2.0 with heavy use of extensions (to perform the transforms and execute the XPaths) which was nice from an academic standpoint, but it soon became clear that this would be more useful as a Java app runnable from Ant.
This little project grew into a way of checking any XML file (to check a transform it runs the transform first and then checks the result). I'm provisionally calling this "CheckXML" and it's still early days but I think it's got the potential to be something really good.
CheckXML will allow you perform various checks on an XML file - XML Schema, XPath 2.0, XSLT 2.0, XQuery and Relax NG. This allows users to augment schema checks with XPath/XSLT checks to fully check the correctness of an XML file. A sample CheckXML configuration file would be:
<checkXML>
<xml src="SampleXML.xml">
<check>SampleXSD.xsd</check>
<check>count(distinct-values(//@id)) = count(//id)</check>
<check>SampleXSLT.xslt</check>
</xml>
</checkXML>
Here the XML file "SampleXML.xml" first has the XML Schema SampleXSD.xsd applied to it, the then the XPath in the second check and finally the XSLT check. The CheckXML app will run each check and look for a result of "true" - any value that isn't true will be reported as a fail. This is the crucial point - it offloads the reponsbility of a creating a useful error to the check writer, and because the check write has the full power of XSLT/XPath/XQuery the error message can be as detailed as necessary (XSD/RNG check will return the error message if the XML isn't valid).
For example, modifying the XPath above to return a helpful message could be like this:<check>if (count(distinct-values(//@id)) = count(//@id)) then 'true' else concat('The following id is not unique: ', distinct-values(for $x in //@id return $x[count(//@id[. = $x]) > 1]))</check>
This would output "The following id is not unique: 123" if you had two @id's with the value "123". The XPath's can get pretty complicated pretty quickly, which is why most of them should be moved into XSLT files, but it still might be more convenient to just use the XPath in the check config file itself. As the check writer has full control over the return message, it can be as simple or complicated as needed.
The CheckConfig files can have multiple <xml> elements (<transform> elements for checking transforms) with each one having as many <check>'s as required. A CheckSuite will point to multiple CheckConfig files. CheckXML will be callable from Ant, with any fails causing the Ant build to fail (with all details of the fail in the logs). I'm also planning a GUI with a nice green/red progress bar :) and a CheckConfig editor, but that's down the line.
1 comment:
Hi Andrew
Interesting thoughts. I like the idea to check a whole XSLT transformation.
You may be interested by my own project of unit testing for XSLT 2.0. It is intended to check instead little pieces of code (actually, any sequence constructors, so you could test also whole transformations). This is a work in progress, but I still use it in my work.
Regards,
--drkm
Post a Comment