Tuesday, November 17, 2009

Mark Logic adds XSLT support

Mark Logic currently lacks any support for XSLT... but that's about to change: Norm Walsh has announced upcoming support for it in version 5.

Monday, September 14, 2009

XML Schema 1.1 tutorials

Roger Costello has created a couple of good PowerPoint presentations on XSD 1.1.
One tutorial is for developers; the other is for managers and contains more general, wordy descriptions of the benefits of 1.1.

Friday, August 22, 2008

Some sample templates for use with LexEv

If your XML has been parsed using LexEv, here are some sample templates for handling the LexEv markup.

To output an entity reference:


<xsl:template match="lexev:entity">
<xsl:value-of disable-output-escaping="yes" select="concat('&amp;', @name, ';')"/>
</xsl:template>


To process a CDATA section as markup:


<xsl:template match="lexev:cdata">
<xsl:apply-templates/>
</xsl:template>


To output a DOCTYPE from the processing instructions:

In XSLT 1.0 the doctype-public and doctype-system attributes on xsl:output are static and need to be known at compile time, which means I'm afraid you have to do this:


<xsl:template match="/">
<xsl:value-of disable-output-escaping="yes"
select="concat('&lt;!DOCTYPE ', name(/*), '&#xa; PUBLIC &quot;',
processing-instruction('doctype-public'), '&quot; &quot;',
processing-instruction('doctype-system'), '&quot;&gt;')"/>
<xsl:apply-templates/>
</xsl:template>


In XSLT 2.0 you can use xsl:result-document, where doctype-public and doctype-system are AVTs, which means their values can be determined at runtime:


<xsl:template match="/">
<xsl:result-document
doctype-public="{processing-instruction('doctype-public')}"
doctype-system="{processing-instruction('doctype-system')}">
<xsl:apply-templates/>
</xsl:result-document>
</xsl:template>
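
If the transform is driven from Java, another option worth considering (this is not from the templates above, just a minimal sketch using the standard JAXP API) is to set the DOCTYPE identifiers as output properties at runtime. The class name RuntimeDoctype and the method withDoctype are made up for illustration, and the sketch uses an identity transform rather than a real stylesheet; how the public and system identifiers are obtained is left to the caller.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named for this example only.
class RuntimeDoctype {

    /** Serializes the XML with a DOCTYPE built from identifiers supplied at runtime. */
    static String withDoctype(String xml, String publicId, String systemId) throws Exception {
        // newTransformer() with no stylesheet gives an identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        // These properties take the place of the static attributes on xsl:output.
        transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, publicId);
        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, systemId);

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(withDoctype(
                "<html><body/></html>",
                "-//W3C//DTD XHTML 1.0 Transitional//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"));
    }
}
```

The trade-off is that the identifiers have to be known to the calling code before serialization starts, whereas the templates above read them from the document itself.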

Thursday, August 21, 2008

LexEv XMLReader - converts lexical events into markup

It's often a requirement to preserve entity references (which are usually lost during parsing) through to the output, or to process the contents of CDATA sections as markup. The Lexical Event XMLReader wraps the standard XMLReader to convert lexical events into markup so that they can be processed. Typical uses are:

  • Converting CDATA sections into markup:


    <![CDATA[ &lt;p&gt; a para &lt;/p&gt; ]]>

    to:

    <lexev:cdata> <p> a para </p> </lexev:cdata>



  • Preserving entity references:


    hello&mdash;world

    is converted to:

    hello<lexev:entity name="mdash">—</lexev:entity>world


  • Preserving the doctype declaration:


    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

    is converted to processing instructions:

    <?doctype-public -//W3C//DTD XHTML 1.0 Transitional//EN?>
    <?doctype-system http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd?>


  • Marking up comments:


    <!-- a comment -->

    is converted to:

    <lexev:comment> a comment </lexev:comment>

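For background, the lexical events listed above are the ones a SAX parser reports through the standard org.xml.sax.ext.LexicalHandler interface rather than through the ContentHandler. The following is only a sketch of how those events can be observed with the JDK's own parser; it is not LexEv's implementation, and the class name LexicalEventsSketch and the event label strings are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records lexical events instead of converting them to markup.
class LexicalEventsSketch {

    /** Parses the XML and returns the lexical events in the order the parser reported them. */
    static List<String> lexicalEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();

        LexicalHandler handler = new LexicalHandler() {
            public void startDTD(String name, String publicId, String systemId) {
                events.add("startDTD " + name);
            }
            public void endDTD() { events.add("endDTD"); }
            public void startEntity(String name) { events.add("startEntity " + name); }
            public void endEntity(String name) { events.add("endEntity " + name); }
            public void startCDATA() { events.add("startCDATA"); }
            public void endCDATA() { events.add("endCDATA"); }
            public void comment(char[] ch, int start, int length) {
                events.add("comment[" + new String(ch, start, length) + "]");
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        // Lexical events are only delivered if a handler is registered under this property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!ENTITY mdash \"&#8212;\">]>"
                + "<doc>hello&mdash;world<![CDATA[ <p> a para </p> ]]><!-- a comment --></doc>";
        lexicalEvents(xml).forEach(System.out::println);
    }
}
```
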

To use LexEvXMLReader with Saxon:


java -cp saxon9.jar;LexEvXMLReader.jar net.sf.saxon.Transform -x:com.andrewjwelch.lexev.LexEvXMLReader input.xml stylesheet.xslt


Make sure LexEvXMLReader.jar is on the classpath, and then tell Saxon to use it with the -x switch (copy and paste this line -x:com.andrewjwelch.lexev.LexEvXMLReader)


To use LexEvXMLReader from Java:

XMLReader xmlReader = new LexEvXMLReader();
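
Once you have an XMLReader, the usual way to feed it into a JAXP transformation is to wrap it in a SAXSource. The sketch below assumes that pattern and uses the JDK's default reader so that it runs without LexEvXMLReader.jar; in a real application you would pass new LexEvXMLReader() instead. The class name CustomReaderTransform and the method transform are hypothetical, and the identity transform stands in for your own stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper class, named for this example only.
class CustomReaderTransform {

    /** Runs an identity transform over the XML, parsing it with the supplied XMLReader. */
    static String transform(XMLReader reader, String xml) throws Exception {
        // SAXSource pairs the reader with the input it should parse.
        SAXSource source = new SAXSource(reader, new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }

    /** Stand-in reader from the JDK; swap in new LexEvXMLReader() when the jar is available. */
    static XMLReader defaultReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(defaultReader(), "<doc>hello</doc>"));
    }
}
```
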


You can control the following features of LexEv:


  • enable/disable the marking up of entity references

  • enable/disable the marking up of CDATA sections

  • set the default namespace for the CDATA section markup

  • enable/disable the reporting of the DOCTYPE

  • enable/disable the marking up of comments


You can set these through the API (if you are including LexEv in an application), or from the command line using the following system properties:


  • com.andrewjwelch.lexev.inline-entities

  • com.andrewjwelch.lexev.cdata

  • com.andrewjwelch.doctype.cdataNamespace

  • com.andrewjwelch.lexev.doctype

  • com.andrewjwelch.lexev.comments


For example, to set a system property from the command line you would use: -Dcom.andrewjwelch.lexev.comments=false


For support, suggestions and licensing, email lexev@andrewjwelch.com

Friday, July 18, 2008

Kernow 1.6.1

Kernow 1.6.1 (beta) is now available both as a download and via web start.

Notable things in this release:

- Line numbers on the editor panes in the sandboxes (thanks to a new version of Bounce). You might not think so, but getting line numbers down the side of the editor pane is really involved. It's like block indenting (pressing tab or shift-tab when a block of text is selected) in that it's very low level and requires a lot of coding. Why it's not an integral part of the editor pane I don't know...

- Improved the syntax-checking-as-you-type and highlighting, and added the ability to disable it.

- The output area is now also a JEditorPane using Bounce so it supports tag highlighting. This might slow things down because now it's an HTML document where every addition is inserted at the end of the document, instead of just appending to a JTextArea... if this proves to be A Bad Thing I'll revert it back to a plain old text area with plain text.

- You can now select which tabs are visible (in options -> tabs) so if you never use certain tabs (like Batch or Schematron) you can remove them.

- If you have Saxon SA you can use XML Schema 1.1 (options -> validation)

- Improved the parameters dialog to make it less fiddly to enter params

- Slight graphical tweaks and likely other things that I've forgotten...

Thursday, July 17, 2008

The Nimbus Look and Feel

This is "Nimbus" - the new look and feel that comes with Java 6 Update 10. This is a cross platform l&f which means it should look the same on all platforms. Kernow currently uses the "platform default" look and feel so it should look like a native app on the platform it's run on, but it's hard to make sure it looks right - often what looks ok on Windows will have obscured buttons on Linux... something I should've fixed but never did.

Anyway, what do you think?

Friday, July 11, 2008

Validating co-constraints in XML Schema 1.1 using xs:alternative

Rather than mess around with loads of assertions to check your co-constraints, XML Schema 1.1 introduces the xs:alternative element, which allows you to change the type used to validate an element based on some condition. Instead of defining one type and then adding assertions to check the variations, just define one type per variation, then assign that type based on the condition.

To do this you first have to define a default type, then define types for each variation by restricting that type. To choose between them, use xs:alternative as a child of xs:element. Here's an example of a co-constraint - this and that are allowed based on the value of the type attribute of node - and how to validate it:
<root>
<node type="A">
<this/>
</node>
<node type="B">
<that/>
</node>
</root>

Here's the schema:
<xs:schema 
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">

<xs:element name="root" type="root"/>
<xs:element name="node" type="node">
<xs:alternative type="node-type-A" test="@type = 'A'"/>
<xs:alternative type="node-type-B" test="@type = 'B'"/>
</xs:element>

<xs:element name="this"/>
<xs:element name="that"/>

<xs:complexType name="root">
<xs:sequence>
<xs:element ref="node" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>

<!-- Base type -->
<xs:complexType name="node">
<xs:sequence>
<xs:any/>
</xs:sequence>
<xs:attribute name="type" type="allowed-node-types"/>
</xs:complexType>

<xs:simpleType name="allowed-node-types">
<xs:restriction base="xs:string">
<xs:enumeration value="A"/>
<xs:enumeration value="B"/>
</xs:restriction>
</xs:simpleType>

<!-- Type A -->
<xs:complexType name="node-type-A">
<xs:complexContent>
<xs:restriction base="node">
<xs:sequence>
<xs:element ref="this"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>

<!-- Type B -->
<xs:complexType name="node-type-B">
<xs:complexContent>
<xs:restriction base="node">
<xs:sequence>
<xs:element ref="that"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:schema>

I really like this... schema 1.1 will be a joy to use.