At work, as part of our XML data feed product, we run XSLT transformations on XML files ranging from a few MB all the way up to several hundred MB. On the 1.4.2 JDK the XSL transformation of the largest file took over 4 days, so over lunch one day we started brainstorming how to troubleshoot it. When we tried it on the 1.5 JDK, the same transformation took a couple of minutes. Talk about an improvement!
However, I was led astray at first. We include a specific version of Xalan in the classpath, so I assumed Xalan was the same in both runs and that the JDK version itself was the differentiator. When I ran it by Naresh, he reminded me that Xalan is also included in the JDK.
If you’re using JAXP on a 1.4 JDK (where the bundled Xalan is under org.apache) or a 1.5 JDK (where it is now under com.sun.org.apache), then the way we found to override the bundled version of Xalan is to put your own copy before rt.jar in the bootclasspath. Sure enough, there appears to be some issue with the Xalan bundled in the 1.4.2 JDK that makes large XSL transformations extremely slow!
Here’s what it looks like to get Java to use your own version of Xalan:
java -Xbootclasspath:xalan-2.X.X.jar:$JAVA_HOME/jre/lib/rt.jar …
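Since the whole confusion was about which Xalan was actually running, a quick check can help. The following is only a minimal sketch, not something from our build: the class name WhichXalan and the describe helper are made up for illustration. It asks JAXP for a TransformerFactory and prints the implementation class and, where the JVM reports one, the location it was loaded from.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper class (not part of the original setup): reports which
// TransformerFactory implementation JAXP hands back at runtime.
public class WhichXalan {

    // Returns "<class name> loaded from <location>". Classes loaded by the
    // bootstrap loader (rt.jar, or anything placed on -Xbootclasspath)
    // usually have no CodeSource, so only a generic label can be printed.
    static String describe(Class c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        String location = (src == null || src.getLocation() == null)
                ? "bootstrap class path"
                : src.getLocation().toString();
        return c.getName() + " loaded from " + location;
    }

    public static void main(String[] args) {
        // JAXP lookup: the same call the application code would make.
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println(describe(factory.getClass()));
    }
}
```

Run it with the same -Xbootclasspath setting as the real job. A class name beginning with com.sun.org.apache means the copy bundled with the 1.5 JDK is in use, while one beginning with org.apache.xalan means either a standalone Xalan or the copy bundled with 1.4. The package name alone cannot tell those last two apart, and anything on the bootclasspath reports the same generic location, so treat this as a first sanity check rather than proof.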
We used Xalan with JDK 1.4 as well, and I never had any problems with performance. I have to admit, though, that my files were only 50 to 80 MB. Have you tried working with Xerces?
Come to think of it, I don’t think it’s the Java XML module. It’s probably the new JVM bundled with JDK 1.5. It’s supposed to be faster.