So there’s been a lot of discussion lately around the new W3C HTML Working Group, its new charter, its chair, and so on. There have also been posts about some of the new attributes. The W3C says it will work with the WHATWG (which is really a group of browser vendors, minus Microsoft) to move forward on XHTML and HTML 5, commonly referred to as (X)HTML5 or Web Applications 1.0. What a lot of people don’t realize is that the specs moving forward are intended to enhance both languages, especially with XHTML 2 just about considered dead on the vine. They reference both HTML 5 and XHTML5 (never mind 2, 3, and 4, not that there will be any).
Appendix C of XHTML
Several things are worth discussing in the current working drafts, but one thing I noticed while looking over the WHATWG’s current spec is that, moving forward, XHTML that uses the new features cannot be served as text/html:
“XHTML documents (XML documents using elements from the HTML namespace) that use the new features described in this specification and that are served over the wire (e.g. by HTTP) must be sent using an XML MIME type such as application/xml or application/xhtml+xml and must not be served as text/html.”
Traditionally, following Appendix C of the XHTML spec, you could serve XHTML Strict as text/html to HTML “User Agents”. Now a lot of people disagree about that, of course, and that’s not where I’m going.
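For anyone who hasn’t seen it, the Appendix C-era approach usually comes down to content negotiation: send an XML MIME type to clients that ask for it, and text/html to everything else. Here is a minimal sketch in Python. The function names are my own, and a real server would also weigh the q-values against each other rather than just checking for presence. Note that the text/html fallback is exactly the part the draft quoted above rules out for documents using the new features.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def accepted_types(accept_header):
    """Return the media types in an Accept header whose q-value is above 0."""
    accepted = set()
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # per HTTP, a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        if quality > 0:
            accepted.add(media_type)
    return accepted


def choose_content_type(accept_header):
    """Pick the XML MIME type only when the client explicitly lists it."""
    if XHTML_TYPE in accepted_types(accept_header):
        return XHTML_TYPE
    return HTML_TYPE  # the Appendix C fallback for HTML user agents
```

A client that sends `Accept: text/html,application/xhtml+xml` would get application/xhtml+xml here, while one that only sends `text/html, */*` would get text/html.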
XHTML Compatibility is Invalid HTML
Truth be told, if you’re serving XHTML within the guidelines of Appendix C, what you’re actually serving up is invalid HTML 4.01.
The space added before the slash in self-closing tags is there for Web browsers; it isn’t something XML requires. An HTML parser doesn’t see a self-closing tag at all: it reads the “/” as a misinterpreted, unknown attribute. It might as well say “blah”, which is tossed out.
Now, this has become a religious war online between so-called purists on both sides, and it’s something I try to stay away from – honestly I think both sides have a point.
Still, somewhere inside I’ve believed in Appendix C, based on an interpretation I don’t hear very often. If I’m wrong, please let me know:
- You’ve created a document.
- It’s labeled as XHTML.
- It’s served as HTML. So what, that’s what browsers support.
- It is rendered and understood in an “HTML User Agent” (browser) as HTML in memory at run time at that moment, exploiting HTML’s generous error handling.
- It’s valid XHTML (XML) based on its structure, its label, and content.
- That document may even be dynamic, coming out of a CMS, so you want the fragments pushed into it to be valid XHTML.
- That document and the fragments from the CMS may not only be used in a Web browser.
- Parts of the content may need to be parsed by other systems from the File System, or with other tools, which may demand XML compliance and rules.
- This may happen today, tomorrow, next week, or even next year.
So there you go. It’s not only XHTML and XML for the browser, it’s XML for other applications, uses, and purposes. The fact that people focus on the way it’s served to the browser seems short-sighted to me. It’s not the only place this document winds up, or may end up over time. Let’s hear it for a flexible specification.
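To make that concrete, here is a small sketch using only Python’s standard library. The markup strings are made-up examples, not output from any real CMS. It feeds the same fragment to a forgiving HTML parser and to a strict XML parser, then does the same with a tag-soup version of the content.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical CMS fragment written to XHTML rules: lowercase tags,
# quoted attributes, every element closed.
FRAGMENT = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<p>First line<br />second line</p>'
    '<img src="photo.png" alt="A photo" />'
    '</div>'
)

# The same content as tag soup, which HTML tolerates and XML does not.
TAG_SOUP = '<div><p>First line<br>second line<img src=photo.png alt="A photo"></div>'


class TagCollector(HTMLParser):
    """Records the start tags a forgiving HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(markup):
    """Return the element names seen by the HTML parser, in order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tags(markup):
    """Return local element names, or None if the markup is not well-formed XML."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    # Strip the "{namespace}" prefix ElementTree puts on qualified names.
    return [element.tag.split("}")[-1] for element in root.iter()]
```

Both parsers can read the XHTML-style fragment, so `html_tags(FRAGMENT)` and `xml_tags(FRAGMENT)` each yield the same four elements. The tag-soup version only survives the HTML parser; `xml_tags(TAG_SOUP)` returns None. That is the whole argument in miniature: the well-formed version keeps the door open for XML tooling that the browser never sees.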
Browser Manufacturers and XHTML
However, browser manufacturers got together and formed the WHATWG, and those very same browser manufacturers don’t believe in Appendix C. They don’t want you to serve your documents that way; they say it’s broken and stupid. Maybe it is, but guys… browsers aren’t the only tools using these documents. Maybe there are other approaches, and maybe there are other tools and software which can change XHTML to HTML and back again, but there’s extra overhead there, and the way I see it, the marketplace and the industry are in a phase of transition. It takes time.
Backwards Compatibility and WHATWG
The WHATWG was started because they claimed the W3C wasn’t in tune with the people. And honestly, the W3C was not, and the WHATWG was right. But one of the mantras repeated over and over in WHATWG discussions is to retain backwards compatibility.
I say they’re already breaking that notion with concepts such as predefined classes to drive various types of functionality within the browser – which says to me you’re going to break a lot of existing apps.
Also, what about Microformats? Bottom line, there are people setting up ways of doing things online without these standards bodies, and sometimes I think both the W3C and the WHATWG need to listen a little bit more. Not everyone out here with an opinion has tons of R&D time and can join those mailing lists. I think they need better ways of communicating with the masses and soliciting feedback.
Fortunately the specs aren’t set yet, and now’s the time to start to talk about it.
Links of interest:
- WHATWG wiki on differences between HTML and XHTML
- WHATWG Spec on Web Applications 1.0
- Surfin' Safari on XHTML, HTML, XML
- XHTML Considered Harmful to Feelings
- Anne van Kesteren on Invalid HTML from XHTML
- Appendix C of XHTML
Update 2007-01-25 AM: Apologies to Tantek Celik for initially having his name in here as "Celik Tantek" -- I have *no* idea what happened there, that was some sort of editor snafu as I was linking things up...