There's an interesting debate going on in the W3C HTML working group about whether well-formed HTML is important in the specification process for HTML5. Intuitively, well-formedness feels like a valuable goal, but when it comes down to explaining why it matters, I'm finding it hard.
Which of the following is "better":
The first is shorter (and works in all the popular web browsers), while the second is well-formed. Well-formedness isn't about being smaller. It's also not about performance: it turns out that the parsers in browsers often process certain non-well-formed mark-up faster than the well-formed equivalent.
Since browsers have to parse both alternatives, and the HTML5 process is about ensuring that they do so in a predictable and interoperable way, should well-formed documents carry any extra weight? After all, the spec doesn't prevent you from choosing to be well-formed if you want to.
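To make the distinction concrete, here is a minimal Python sketch. It assumes a hypothetical pair of list snippets (chosen for illustration, not necessarily the pair under discussion) and uses two standard-library parsers as rough stand-ins: `xml.etree.ElementTree` for a strict, well-formedness-checking parser, and `html.parser.HTMLParser` for a lenient one. The latter is only a tokenizer, not a browser's HTML5 tree builder, so treat this as an analogy rather than a model of what browsers do.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical snippets, assumed for illustration only.
SHORT = "<ul><li>One<li>Two</ul>"                 # omits the </li> end tags
WELL_FORMED = "<ul><li>One</li><li>Two</li></ul>"  # every element is closed


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient tokenizer that records the start tags it encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.start_tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def start_tags(markup: str) -> list[str]:
    """Return the start tags a lenient HTML tokenizer sees, in order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    for name, markup in (("short", SHORT), ("well-formed", WELL_FORMED)):
        print(name, is_well_formed(markup), start_tags(markup))
```

The strict parser rejects the shorter snippet outright, while the lenient one reports the same sequence of start tags for both. That is roughly the situation described above: both forms can be handled predictably, and only one of them also satisfies an XML-style parser.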
The analogy I've been considering is indentation in C++ source code: few people would write C++ without a sensible indentation strategy to help make the code readable. Yet the C++ spec doesn't need to say anything about indentation. It's a best practice, but not a formal part of the language definition. Could writing well-formed HTML be a best practice that's not a formal part of the language definition?