XHTML Considered Harmful

One of the current religions floating around the web is XHTML Compliance. If you post a technique or article that is not XHTML compliant, a horde of XHTMLites will descend on you. Many a blogger has trembled in fear before pressing the submit button on a post that might offend XHTML sensibilities. Evil are those who do not XHTML validate before the W3C!

This has to end, primarily because XHTML is a useless technology that is retarding progress, especially when it comes to Dynamic HTML (DHTML) and AJAX applications.

Here are my top eight reasons why XHTML is Considered Harmful:

1. document.write is not supported

document.write is an extremely important tool in the professional DHTML programmer's tool chest. If you create an AJAX framework that requires the page to contain certain elements to work, such as hidden iframes, doing a document.write on page load can make it much easier for third-party developers to use your framework without having to know about the specific elements you need to get things working.

Further, document.write is sometimes needed for obscure, but important, hacks to get things like DHTML history going.

Would it be nice to not have to need document.write? Of course. Does the real state of the web require it? Currently it does.
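To make this concrete, here is roughly what such a framework bootstrap looks like (a sketch; the function name, frame id, and blank.html URL are illustrative, not from any real library):

```javascript
// A framework's bootstrap script, included while the page is still
// loading, uses document.write to inject the hidden iframe the framework
// depends on, so page authors never have to add it by hand.
function writeHiddenFrame(doc) {
  doc.write('<iframe id="framework-hidden-frame" src="blank.html"'
          + ' style="width: 0; height: 0; border: 0; visibility: hidden">'
          + '</iframe>');
}

// In a browser you would call writeHiddenFrame(document) at parse time,
// before the onload event fires.
```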

2. iframes are not supported

The W3C purists don't believe in web-based applications or compound documents. This means they don't think about important things like iframes when it comes to building web applications, as opposed to simple web pages.

XHTML drops iframes, which are again important in the AJAX toolkit.

Technically, iframes are supported in XHTML Transitional but not XHTML Strict. However, future-proofed support for iframes in XHTML does not exist beyond the XHTML Transitional DTD. There is also something called XFrames that supposedly provides iframe-like functionality for XHTML, but XFrames is vaporware.

3. Custom attributes are sinful

In DHTML programming, using custom attributes that aren't in HTML can be a very effective way to create readable and reusable code. For example, imagine if we model a Wiki page with the following markup:

<div id="wiki-page"
     lastEdited="Monday 3rd, 2005"
     lastEditedBy="Brad Neuberg"
     tags="technical ajax important">
  <h1 class="title">My Wiki Page</h1>
  <span class="author" role="manager">Brad Neuberg</span>
  <span class="content" editableBy="managers">This is some content</span>
</div>

Later, I can grab on to these custom attributes in my JavaScript code just like any other attributes:

var wikiPage = document.getElementById("wiki-page");
var lastEdited = wikiPage.getAttribute("lastEdited");
var lastEditedBy = wikiPage.getAttribute("lastEditedBy");
Yes, I know I can technically model this using the microformat HTML data definition list, but using custom attributes is a much quicker and much more readable way of achieving the same thing.
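For comparison, the definition-list version would look something like this (a sketch of the general microformat pattern, not any published specification):

```html
<div id="wiki-page">
  <dl class="wiki-metadata">
    <dt>Last edited</dt>    <dd class="lastEdited">Monday 3rd, 2005</dd>
    <dt>Last edited by</dt> <dd class="lastEditedBy">Brad Neuberg</dd>
    <dt>Tags</dt>           <dd class="tags">technical ajax important</dd>
  </dl>
  <h1 class="title">My Wiki Page</h1>
</div>
```

Reading those values back out means walking the list and matching class names, which is where the extra work comes from.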

XHTML frowns on these, since they don't validate. The XHTML cult has a very complicated way to achieve validation while keeping custom attributes, but why is it worth doing so much work when XHTML is not needed?

4. The XHTML layout engines are slower and buggier

The XHTML layout engines in some browsers are much newer, which means they have more bugs and are generally slower. For example, the XHTML renderers in Safari and Mozilla do not support incremental rendering of the page as it loads, creating slower perceived performance for end users.

5. Internet Explorer does not support XHTML

The major browser does not support XHTML, requiring you to do some funky stuff in your XHTML to get it to display in IE. For example, you have to put a space before the slash in an empty BR tag or else IE breaks on it:
<br />

6. XHTML has to be sent using incorrect MIME types

The correct way of sending XHTML over the wire is with the application/xhtml+xml MIME type, which specifies that we are using a specific XML dialect. Unfortunately, because of IE's non-support of XHTML you have to "lie" to it and other browsers and specify the MIME type as text/html, which technically means that the browser should not treat the document as an XML application. This is yet another example of how adopting the XHTML ideology creates problems down the road.
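The usual workaround is HTTP content negotiation: inspect the browser's Accept header on the server and pick the MIME type accordingly. A minimal sketch of that decision, written as a plain function (adapt it to whatever server environment you have):

```javascript
// Pick a MIME type for an XHTML page based on the browser's Accept header.
// Browsers that genuinely understand XHTML advertise application/xhtml+xml;
// IE does not, so it gets the text/html "lie" described above.
function pickXhtmlMimeType(acceptHeader) {
  if (acceptHeader && acceptHeader.indexOf("application/xhtml+xml") !== -1) {
    return "application/xhtml+xml";
  }
  return "text/html";
}
```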

7. The W3C's MITization of the web is not a good thing

There is a rumour on the web that the W3C has finally, after centuries of discussion, discovered how many angels can fit on the head of a pin.

Tim Berners-Lee's creation of the original web was genius; it was a good-enough, worse-is-better solution to a pressing problem. Cascading Style Sheets, the Document Object Model, and more were also well done, with a wee bit of cruftiness. XML was an excellent addition to the pantheon of web standards.

In the last few years, however, what has the W3C given us? RDF is a terrible solution to an extremely important problem, namely metadata. RDF is almost as much fun as DSSSL and HyTime (What are those? Exactly), and has crowded better solutions to the metadata problem out of the standards marketplace. XML Schema is equally terrible, a brittle and complicated solution to an important problem, namely clear data types when describing XML languages. The W3C has become as good as the Java Community Process at creating high-quality specs, which is not a compliment.

In the meantime, notice what hasn't come from the W3C: RSS; real-world web services built with REST rather than XML-RPC and the SOAP superstructure; XMLHttpRequest, created by the pragmatists at Microsoft; and more.

8. XHTML provides no benefits

At the end of the day, the real indictment of XHTML is that it simply provides no real advantages. Yes, in a perfect world, it would be great to have HTML as a full XML application, with support for arbitrarily nested namespaces so that you could do cool stuff like embed SVG and MathML right into your app. However, nested namespaces in XHTML are simply not well supported, and may never be in time for XHTML to work out.

What exactly does XHTML do for you, other than give you "future-proofness"? I predict that using XHTML in a deep way will actually not future-proof your application; instead, as XHTML joins the list of other failed W3C standards, such as RDF and XML Schema, XHTML in your application will become a legacy burden that future programmers complain about around the water cooler: how the last guy, who left several years ago, drank the XHTML Kool-Aid and stuck them with the task of working around its burdens.


Well-designed, semantic HTML is enough. Perhaps when the WHAT Working Group comes out with its superior, embraced-and-extended HTML and XHTML standards, XHTML will finally be useful. Currently, the cult of XHTML provides no benefits and is in fact retarding progress. It's time for those on the other side to take a stand.


porneL said…
To answer your post: No, buzzwordy technology #1 is not in buzzwordy relation with buzzwordy technology #2.

I absolutely disagree that document.write is needed for DHTML. I have experience in writing DHTML and W3C DOM run from script at the end of <body> works very well for me. Transformations of existing DOM tree allow much better backwards compatibility.
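The DOM-based alternative porneL describes would look roughly like this (a sketch; names are illustrative): because the script runs at the end of <body>, the elements it needs already exist, and no document.write is required.

```javascript
// Run from a script at the end of <body>: create the hidden iframe with
// W3C DOM calls and append it, instead of document.write-ing its markup.
function appendHiddenFrame(doc) {
  var frame = doc.createElement("iframe");
  frame.id = "framework-hidden-frame";
  frame.src = "blank.html";
  frame.style.visibility = "hidden";
  doc.body.appendChild(frame);
  return frame;
}
```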

Custom HTML attributes aren't sinful in XHTML any more than in HTML, but in XHTML you don't have to create custom *HTML* attributes. You can use your own namespace to keep XHTML valid (the DTD-based W3C validator sucks; ignore it).

IE is harmful to XHTML. That's it.

Mime Type is nothing special. There is HTTP content negotiation for that.

MITization of anti-XHTML arguments. W3C issues are not part of this technology.

XHTML provides very little benefits, I agree. Still far from being harmful.
Simon Pieters said…
#1: What's wrong with the DOM methods?

#2: IFRAME is supported in XHTML (just like you mentioned). XHTML did not drop IFRAME, HTML4 did! IFRAME is illegal in HTML4 Strict. XHTML is an exact reformulation of HTML4 in XML, it does not drop anything and it does not add anything.

If you want to use the Strict flavor, use OBJECT instead of IFRAME.
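For anyone trying Simon's suggestion, the OBJECT stand-in looks roughly like this (a sketch; the URL and dimensions are illustrative):

```html
<!-- A Strict-valid replacement for <iframe src="page.html">: -->
<object data="page.html" type="text/html" width="400" height="300">
  Fallback content for browsers that cannot embed the document.
</object>
```

In practice, browser support for OBJECT as an HTML container has historically been spottier than for IFRAME, which is part of the original complaint.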

#3: Non-standard attributes are equally invalid in HTML as in XHTML. If you want custom attributes to validate as HTML, you still need to create your own custom DTD, as in XHTML, although that is in fact forbidden by the spec.

#4: They don't support incremental rendering, but otherwise I don't see why they are slower or more buggy.

#5: IE doesn't break if you don't have a space before the slash. Netscape 4 does (given it's sent as text/html).

#6: HTML also has to be sent with its correct MIME type, text/html. If you want to support IE, then obviously XHTML is not an option because IE doesn't recognise the namespace. Don't let the MIME type distract you, application/xml has been supported for years.

#7: How is this related to XHTML?

#8: XHTML has the benefit of being XML. If that doesn't bring any benefits to you, then you don't need to use it.

I don't really see how your points make XHTML harmful. Alpha transparency in PNG is also not supported by IE. Is PNG harmful?
Kevin Marks said…
Regarding custom attributes vs a microformat using definition lists, a good test is to consider if the attributes could add useful visible information - if so, using a definition list is a way for that information to do double duty as human readable and machine parseable.
Robert Sayre said…
XHTML can be pretty handy if your app is hooked up to an XML parser for other reasons.
plaus said…
Web pages with compliant XHTML or HTML would be great, but for one thing: humans. Humans aren't perfect, and are known to make errors from time to time.

Plus, web clients are written to play nicely with non-compliant HTML/XHTML. Can you imagine web browsers dealing with XHTML the way XML parsers do? hehehe....

at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
at org.apache.xerces.dom.ParentNode.item(Unknown Source)
at ...

XHTML seems to be a means to an end - getting HTML ready for XML parsers, in order to extract data from a document. Not the same purpose as any jane or joe wanting to put a web page up without having to learn how to dot every 'i' in the specification.

For the rest of us, there's street HTML, and that ain't going away any time soon.
Argatxa said…
XHTML is great if you do dynamic-content static pages that can be read by humans and processed by a third system, but for average Web development it is too much of a constraint.

Besides.. if I need to connect systems and the only way is through HTTP, I put up a web service or set up a dynamic page that serves pure XML. Easy.

La la la land on one for all is too much work.
Einhverfr said…
Ok, I am not an XHTML fanatic. In fact I generally think that XML is a poor choice for most environments. I prefer LaTeX to Docbook, etc.

However, XHTML does have one major advantage which you don't mention. Because it is XML, you can perform transformations on it which would be cumbersome or impossible on HTML. Hence you could take XHTML and convert it to LaTeX by extending the Jade tools if you want.

This being said, the general rule I expound is: Without the need for transformation, XML is the wrong tool for the job!