Well formed validation of XHTML pages

Validating XHTML pages is an interesting subject because no high-traffic site I’ve ever tried actually validates against the W3C validator. Most folks I know take it for granted that it’s unrealistic to write large sites with 100% W3C-valid XHTML. That presents a real problem on the software engineering side, though, where we want to rely on our development environment or continuous integration to tell us whether everything is truly working and valid! For example, with continuous integration we can run unit tests, run functional tests, run Checkstyle on the code, compile the JSPs, etc… but we can’t validate our XHTML because it’s unrealistic. So for the time being at work we’ve settled for validating well-formedness by using JSPX and TAGX for templating.

JSPX and TAGX serve an important purpose in that they will fail to compile if the XHTML is not well-formed XML. If you forget to close a tag or close tags out of order, you know right away! Realistically, I think well-formedness of HTML templates is the best we can do right now, given that 100% XHTML validation seems unrealistic. If it weren’t, it would be great to build XHTML validation of pages into the continuous integration process.
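To make the idea concrete, here is a minimal sketch of the same kind of well-formedness check done by hand with the XML parser that ships with the JDK. This is not how the JSP compiler does it; the class name `WellFormedCheck` and the method `isWellFormed` are made up for illustration, and the sketch assumes the markup is already available as a string with no DOCTYPE or HTML named entities such as `&nbsp;`.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any JSP container or library.
final class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Only well-formedness is checked here, not validity against a DTD.
            factory.setValidating(false);
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing its own messages to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Unclosed tags, tags closed out of order, and similar errors end up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not run the XML parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<div><p>ok</p></div>"));
        System.out.println(isWellFormed("<div><p>forgot to close</div>"));
        System.out.println(isWellFormed("<b><i>out of order</b></i>"));
    }
}
```

A check like this could run from a unit test in continuous integration, but it only sees markup you hand it as a string. The nice thing about JSPX and TAGX is that the template itself is the XML document, so the check comes for free at compile time.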

I’m always bummed that other templating languages such as Tapestry’s templates, Ruby’s RHTML, Perl’s Mason, Python’s Zope, etc… don’t offer validation of well-formedness. JSP, in its JSPX and TAGX form, is the only templating language I’m aware of that will tell you if the HTML you’ve written is well-formed XML. That’s too bad, since I really prefer the purity of the Tapestry or Enhydra XMLC templating approach over the JSP, RHTML, etc… approaches. Oh well, you can’t have it all!


One Response to Well formed validation of XHTML pages

  1. It’s an interesting point you’ve raised here and one that I’ve also wondered about. I used to work for a CRM firm where I developed and maintained a fair few reporting extranets. There it was considered vital that all of our pages were W3C compliant, as we were creating sites on behalf of a third party. It looked good for the company if the marketing team could promote our products by stating that all of our sites were compliant.

    It meant that I picked up a lot of good XHTML habits, but apart from being able to slap a W3C logo on every site, it didn’t bring any additional benefit to the company. All of our pages were viewed from PCs or laptops, and since a lot of the data was very graphical, it wasn’t suitable for screenreaders, PDAs or mobile phones.

    Since I left the company and changed sector, I’ve found that W3C-compliant code is considered a ‘nice to have’. You’re very right, it should be easier to produce templates and validate code for well-formedness, but I think for many large companies the benefits don’t really outweigh the additional development time.

    That said, it probably won’t be long before sites that aren’t compliant are considered less valuable. Google already indexes on this basis: when I made an older version of my web site XHTML and CSS compliant, my rating shot up!
