November 2008 Blog Posts

W3C XHTML 1.0 Appendix C: This appendix summarizes design guidelines for authors who wish their XHTML documents to render on existing HTML user agents.

W3C XHTML Media Types: The use of 'text/html' for XHTML SHOULD be limited for the purpose of rendering on existing HTML user agents, and SHOULD be limited to XHTML1 documents which follow the HTML Compatibility Guidelines. In particular, 'text/html' is NOT suitable for XHTML Family document types that adds elements and attributes from foreign namespaces, such as XHTML+MathML.

Mark Pilgrim: Despite Chris Wilson's assertion that "we cannot definitively say why XHTML has not been successful on the Web," I think it's pretty clear that Internet Explorer's complete lack of support for the application/xhtml+xml MIME type has something to do with it.

Sam Ruby: what is wrong with using XML for this?  Come on.  I can answer that with two words: IE, and Postel.

Chris Wilson (back in 2005): I made the decision to not try to support the MIME type in IE7 simply because I personally want XHTML to be successful in the long run.  I love XHTML (go look, my name is in the credits for XML 1.0); it’s capable of being truly interoperable if done right.

Internet Explorer doesn’t support XHTML served with the application/xhtml+xml MIME type and IE8 won’t add that support either. I wasn’t around when the feature decisions were taken for IE8 but I do know that the wish list is long and it only makes sense to implement a reasonable number of features in each release of any software product. It also makes sense to focus your resources on things that will have the most impact for the most customers.
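To make the practical consequence concrete, here is a minimal sketch of how a site might decide which MIME type to send, assuming it negotiates on the request's Accept header. The function name `choose_content_type` and the sample header strings are my own illustrations, not taken from any particular server or browser.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def choose_content_type(accept_header: str) -> str:
    """Return the XHTML MIME type only if the client lists it explicitly.

    Hypothetical helper: it ignores wildcards such as */* on purpose,
    because a wildcard does not prove the browser has an XHTML parser.
    """
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != XHTML_TYPE:
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return XHTML_TYPE
    return HTML_TYPE


# Illustrative header from a browser that advertises XHTML support.
print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
# Illustrative header from a browser that does not mention it.
print(choose_content_type("image/gif, image/jpeg, */*"))
```

A page served this way still has to follow the Appendix C compatibility guidelines quoted above, because the same bytes go to both kinds of browser.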

Internet Explorer is an interesting product because it has many different audiences, each with different priorities. People browsing the web want it to be fast, easy-to-use, and safe. Web developers want good tools and a consistent experience across browsers. Enterprise IT managers want reliability, ease-of-deployment, and most of all compatibility with the applications that worked with previous versions and that run their business. Application developers want compatibility and also extensibility/customisation either of the browser itself or of the browser control they include in their applications. It’s hard to please all of the people all of the time, especially when you have hundreds of millions of users.

At the PDC in October, Alex Mogilevsky presented on the new rendering engine in IE8 that provides interoperable support for CSS 2.1. The IE team decided that CSS 2.1 support was a “must have” feature for IE8, and Alex describes the huge amount of work involved in making that happen (he also covers some of the history of how IE worked in previous versions). With everything else that was a priority for IE8, XHTML wasn’t high enough on the list and it didn’t make it in. Will it make it into the next version of IE? Who knows. Those decisions certainly haven’t been taken yet.

With all the recent discussion about well-formedness I was wondering about what the web would be like if there _were_ broad support for XHTML. What percentage of new content would be served with application/xhtml+xml? Would all the Classic ASP and PHP developers try to use XHTML but then get frustrated at the fragility of pages that have to be perfectly marked-up? Would there be a gradual move of applications towards XHTML or would it be reserved for the elite minority? Is showing nothing for a slightly incorrect page the right answer for most web developers?

Jonas Sicking: When Netscape decided to rewrite their browser engine and use what has become gecko (the engine used by firefox), one of the biggest problem with taking marketshare was compatibility with existing pages, even though the new engine was perfectly able to parse HTML 4 by spec.

In fact, we can still see this today. While firefox now has a worldwide marketshare of about 20%, our marketshare in many countries in Asia is tiny. Our market research data has shown that the main reason for that is website compatibility. Even though Firefox parses valid HTML4 very well.

Compatibility is extremely important and the W3C HTML 5 working group is putting a huge amount of effort into describing the interoperable way to process HTML documents that are not well-formed. Meanwhile, XHTML guarantees that pages will only be displayed if they are well-formed – there is no error recovery. I ask the question again: what valuable properties does a well-formed document have?

Of course, XHTML isn’t just about well-formedness. The documents are truly XML and have the ability to embed mark-up from other namespaces such as SVG and MathML and to be processed with standard XML tools. There is also work underway in the HTML 5 WG to define the mechanisms for these languages to be embedded in HTML. What value is there in XHTML if it only differs from HTML in which type of parser I instantiate to read a document?
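As a small illustration of the “standard XML tools” point, the sketch below parses an XHTML document with an embedded SVG fragment using Python’s built-in ElementTree. The sample document and the `count_by_namespace` helper are made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample page mixing XHTML and SVG in one XML document.
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed namespaces</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">
      <circle cx="20" cy="20" r="15"/>
    </svg>
  </body>
</html>"""


def count_by_namespace(source: str) -> dict:
    """Count the elements in each namespace of an XML document."""
    counts = {}
    for element in ET.fromstring(source).iter():
        # ElementTree stores qualified names as "{namespace}localname".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}")[0]
        else:
            namespace = ""
        counts[namespace] = counts.get(namespace, 0) + 1
    return counts


root = ET.fromstring(DOCUMENT)
# A generic XML tool can address the SVG content by namespace alone.
circle = root.find(f".//{{{SVG_NS}}}circle")
print(count_by_namespace(DOCUMENT))
print(circle.get("r"))
```

Nothing here knows anything about XHTML or SVG specifically; the parser only relies on the document being well-formed XML.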

Sam Ruby: Whenever I find myself updating a script I wrote months or even years ago, these days my first step is to do a git init.

Over the last few months, I’ve seen quite a few people commenting about how git is a “better” source control system than Subversion. Today I watched this video as Linus Torvalds talked at Google about the ideas behind the creation of git. It’s both entertaining and informative and I think now I understand what they meant.

There’s an interesting debate going on in the W3C HTML working group about whether well-formed HTML is important in the specification process for HTML5. Intuitively, well-formedness feels like a valuable goal, but when it comes to explaining why it matters I’m finding it hard.

Which of the following is “better”:

normal<b>bold<i>bolditalic</b>italic</i>normal

or

normal<b>bold<i>bolditalic</i></b><i>italic</i>normal

The first is shorter (and works in all the popular web browsers) while the second is well-formed. Well-formedness isn’t about being smaller. It’s also not about performance: it turns out that the parsers in browsers often process certain non-well-formed mark-up faster than if it had been well-formed.
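The difference between the two snippets can be checked mechanically. This is a rough sketch using only Python’s standard library: an XML parser stands in for an XHTML user agent, and `html.parser` stands in for a forgiving HTML reader. The helper names are my own, and `html.parser` is only a tokenizer, so it does not reproduce the error recovery that browsers actually perform.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MISNESTED = "normal<b>bold<i>bolditalic</b>italic</i>normal"
WELL_FORMED = "normal<b>bold<i>bolditalic</i></b><i>italic</i>normal"


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a wrapper element."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Collect character data and ignore how the tags are nested."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(fragment: str) -> str:
    """Extract the text with a parser that tolerates mis-nested tags."""
    collector = TextCollector()
    collector.feed(fragment)
    collector.close()
    return "".join(collector.chunks)


for name, fragment in (("first", MISNESTED), ("second", WELL_FORMED)):
    print(name, is_well_formed(fragment), lenient_text(fragment))
```

The XML parser rejects the first snippet outright, while the lenient parser reads the same text out of both.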

Browsers have to parse both alternatives, and the HTML5 process is about ensuring that they do so in a predictable and interoperable way. So should the spec give any extra weight to well-formed documents? After all, the spec doesn’t prevent you from choosing to be well-formed if you want to.

The analogy I’ve been considering is about indentation in C++ source code: few people would probably write C++ without a sensible indentation strategy to help make the code readable. Yet the C++ spec doesn’t need to say anything about indentation – it’s a best practice but not a formal part of the language definition. Could writing well-formed HTML be a best practice that’s not a formal part of the language definition?

I’ve been spending some time on a side project to create a Vista boot image that installs unattended and configures a development machine with all the tools I want, configured the way I want them. The goal is to install everything, including Visual Studio and all the little utilities that I use frequently, and to set up common things like the mouse sensitivity and colour scheme the way I like them.

The simple starting point for this project is to get Vista to install unattended using an answer file created with the Windows Automated Installation Kit (Windows AIK). I found an excellent guide to creating the answer file: FireGeier’s Unattended Vista Guide. This provides a simple walkthrough of how to configure the AIK and how to build increasingly complex unattended installs.

I finally got around to buying a proper chair to sit on to use my PC at home.

Air Mesh Fabric Executive Chair

My hope is that now that it is more comfortable to sit here, I might actually get around to reading RSS feeds and even posting to my blog. :)