Thursday, January 31, 2008

Parsing XML into Java 5 Enums

Often when parsing XML into POJOs I just resort to writing my own SAX-based parser. It can be long-winded, but I think it gives you the greatest flexibility and control over how you get from the XML to objects you can process.

One example is with Java 5's enums, which are great. Take an XML fragment where currencies are represented using internal codes:

<currency refid="001"/> <!-- 001 is Sterling -->
<currency refid="002"/> <!-- 002 is Euros -->
<currency refid="003"/> <!-- 003 is United States Dollars -->


You can represent each currency element with an enum constant, which carries extra fields for the additional information:

public enum Currency {

    GBP ("001", "GBP", "Sterling"),
    EUR ("002", "EUR", "Euros"),
    USD ("003", "USD", "United States Dollar");

    private final String refId;
    private final String code;
    private final String desc;

    Currency(String refId, String code, String desc) {
        this.refId = refId;
        this.code = code;
        this.desc = desc;
    }

    public String refId() {
        return refId;
    }

    public String code() {
        return code;
    }

    public String desc() {
        return desc;
    }

    // Returns the enum constant based on its refId property rather than its name.
    // (This loop could possibly be replaced with a static map, but be aware
    // that static member variables are initialized *after* the enum constants and
    // therefore aren't available to the constructor, so you'd need a static block.)
    public static Currency getTypeByRefId(String refId) {
        for (Currency type : Currency.values()) {
            if (type.refId().equals(refId)) {
                return type;
            }
        }

        throw new IllegalArgumentException("Don't have enum for: " + refId);
    }
}


Notice how each enum constant calls its own constructor with the three parameters: the refId, the code, and the description.

You parse the XML into the enum by calling Currency.getTypeByRefId(String refId), passing in the @refid from the XML. The benefit of using the enum is that you can then do things like:

if (currency.equals(Currency.GBP))

which is nice and clear, while at the same time being able to call currency.refId() and currency.desc() to get to the other values.
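To show where the getTypeByRefId call might sit in a SAX-based parser, here is a minimal sketch. The class names CurrencyParserSketch and CurrencyHandler, the parse helper, and the wrapping <currencies> root element are all made up for illustration, and the enum is a trimmed copy of the one above so that the example compiles on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CurrencyParserSketch {

    // Trimmed copy of the Currency enum (refId only) so the sketch is self-contained.
    public enum Currency {
        GBP("001"), EUR("002"), USD("003");

        private final String refId;

        Currency(String refId) {
            this.refId = refId;
        }

        public String refId() {
            return refId;
        }

        public static Currency getTypeByRefId(String refId) {
            for (Currency type : values()) {
                if (type.refId().equals(refId)) {
                    return type;
                }
            }
            throw new IllegalArgumentException("Don't have enum for: " + refId);
        }
    }

    // Hypothetical handler: collects one Currency per <currency> element it sees.
    public static class CurrencyHandler extends DefaultHandler {
        public final List<Currency> currencies = new ArrayList<Currency>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("currency".equals(qName)) {
                // Map the internal code in @refid onto the enum constant.
                currencies.add(Currency.getTypeByRefId(attrs.getValue("refid")));
            }
        }
    }

    // Hypothetical helper: parses an XML string and returns the currencies found.
    public static List<Currency> parse(String xml) throws Exception {
        CurrencyHandler handler = new CurrencyHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.currencies;
    }

    public static void main(String[] args) throws Exception {
        // The <currencies> root element is an assumption; the post only shows the fragment.
        String xml = "<currencies><currency refid=\"001\"/><currency refid=\"003\"/></currencies>";
        System.out.println(parse(xml)); // prints [GBP, USD]
    }
}
```

An unknown or missing refid ends up in the IllegalArgumentException thrown by getTypeByRefId, so bad input fails loudly rather than producing a null currency.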

The drawback is that because static member variables are initialized after the enum constants, the constructor can't add each constant to a static HashMap for a faster lookup later; you would have to fill the map in a static block instead. Without one, you have to loop through all known values() for the enum to find a given refId. Although it feels wrong to loop, the worst case is only the size of the enum, so I don't think it's too bad.
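For comparison, the static block variant could look roughly like this. It is only a sketch: the wrapper class CurrencyLookupSketch and the map name BY_REF_ID are invented for the example, and the enum is otherwise the same shape as the one above.

```java
import java.util.HashMap;
import java.util.Map;

public class CurrencyLookupSketch {

    public enum Currency {
        GBP("001", "GBP", "Sterling"),
        EUR("002", "EUR", "Euros"),
        USD("003", "USD", "United States Dollar");

        // Static fields are initialized after the constants above, so the map
        // cannot be filled from the constructor. BY_REF_ID is a made-up name.
        private static final Map<String, Currency> BY_REF_ID = new HashMap<String, Currency>();

        // Runs once, after all constants exist, and indexes them by refId.
        static {
            for (Currency c : values()) {
                BY_REF_ID.put(c.refId, c);
            }
        }

        private final String refId;
        private final String code;
        private final String desc;

        Currency(String refId, String code, String desc) {
            this.refId = refId;
            this.code = code;
            this.desc = desc;
        }

        public String refId() {
            return refId;
        }

        public String code() {
            return code;
        }

        public String desc() {
            return desc;
        }

        // Same contract as the looping version, but a single map lookup.
        public static Currency getTypeByRefId(String refId) {
            Currency c = BY_REF_ID.get(refId);
            if (c == null) {
                throw new IllegalArgumentException("Don't have enum for: " + refId);
            }
            return c;
        }
    }

    public static void main(String[] args) {
        System.out.println(Currency.getTypeByRefId("002").desc()); // prints Euros
    }
}
```

For three currencies the difference is negligible, so this is mostly worth it when the enum is large or the lookup sits on a hot path.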
