Wednesday, October 8th, 2008

The myths and reality of XHTML

Category: W3C

Lack of support for XHTML is a fact of life on the web in 2008. Prior to the 3.0 series of Firefox the XHTML processor in Gecko was so poor that Mozilla’s own engineers recommended against it[27]; no version of Internet Explorer up to, and including, IE 8 support XHTML at all, and a number of other browsers such as Lynx were never written to handle XML in the first place.

The above quote comes from "XHTML - myths and reality" by Tina Holmboe, a member of the XHTML Working Group.

It is a nice piece that delves into:

  1. Introduction
  2. The Purpose of XHTML
  3. XHTML and the Content Type
  4. Strictly XHTML
  5. Lack of support
  6. Content–Negotiation
  7. Recommendations

Posted by Dion Almaer at 6:07 am


Although it gives a nice historical summary, that article is overly simplistic and doesn’t cover the full extent of the HTML and XHTML problem. People are better off reading: and

Comment by marcosc — October 8, 2008

What is this, somebody’s college dissertation? She writes this as if it’s an amazing new discovery, and yet all of it has been known for years to all but the most clueless web developers (of which there are many).

Comment by jbot — October 8, 2008

I’m not sure such an article is really useful. Granted, all these problems exist and should be known, but such articles tend to lead to a panic. XHTML is the more versatile standard and also much easier to implement, while HTML places a major burden on the client to interpret the data (ever tried to parse an HTML news thread?). Also, it’s very easy to trigger unintended behavior in HTML (just look at the number of HTML pages containing &nbsp without the ; because this is sometimes valid).

Comment by Hans Schmucker — October 8, 2008
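
The &nbsp point above can be seen in a small sketch. It assumes Python's standard library as a stand-in for the two parsing models: html.unescape follows HTML's character reference rules, while xml.etree.ElementTree enforces XML well-formedness. The sample string is made up for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample text with the semicolon missing after the entity name.
snippet = "fish&nbspchips"

# HTML rules: the legacy named reference is recognized even without ";".
decoded = html.unescape(snippet)

# XML rules: a reference without ";" is a well-formedness error.
try:
    ET.fromstring(f"<p>{snippet}</p>")
    xml_accepted = True
except ET.ParseError:
    xml_accepted = False

print(repr(decoded), xml_accepted)
```

Under these assumptions the HTML decoder quietly turns the reference into a non-breaking space, while the XML parser refuses the whole fragment.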

I’ve been fighting with IE for 11 years and it wasn’t until we started developing all our sites as xhtml 1.0 strict almost a year ago that my life got better. Entire classes of IE bugs evaporated overnight for us. It’s probably just a doctype behavior switch but as far as I’m concerned, that’s all the xhtml support I need.

Comment by tack — October 8, 2008

I literally cry when I see the mess HTML is compared to strict XML documents, and when I see the current movement away from XHTML back to HTML. If browsers were invented now, they would all parse XML and not some semi-SGML/XML tag soup.

It’s a shame that XHTML never really caught on. I don’t know why: maybe lazy web developers who didn’t understand why the world would be a better place if all documents were strict XML, or poor marketing of the concept by the W3C.

There would be no need then to build specific HTML parser libraries like html5lib and so forth; parsing would take just two lines of code in most languages that have an XML parser in their API.

Comment by Spocke — October 8, 2008
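
As a rough illustration of the "two lines of code" claim, here is a minimal sketch using Python's standard xml.etree.ElementTree. The two sample documents are invented for the example, and the same parser is also fed a tag-soup page to show the other side of the trade-off.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical well-formed XHTML document.
well_formed = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head>"
    "<body><p>Hello<br/>world</p></body></html>"
)

# The "two lines": parse the document, then query it.
root = ET.fromstring(well_formed)
title = root.find(f"{XHTML_NS}head/{XHTML_NS}title").text

# The same parser rejects tag soup outright (unclosed <br> and <p>).
tag_soup = "<html><body><p>Hello<br>world</body></html>"
try:
    ET.fromstring(tag_soup)
    soup_parsed = True
except ET.ParseError:
    soup_parsed = False

print(title, soup_parsed)
```

The generic XML parser is enough for the well-formed page, but it gives up entirely on the second one, which is the situation libraries such as html5lib exist to handle.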

It never caught on because it’s not supported. Building xml or html parsers and rendering the data into a document isn’t a walk in the park; if it were, xhtml would be supported. Build your own browser if you disagree (the code’s open source, so go on, do it and prove me wrong!)

xhtml is a failed technology. A hoorah from when everyone thought xml was the silver-bullet data format of the future and got shaken off the wagon when JSON ran into it. If XHTML weren’t a failed technology there wouldn’t be an HTML5 spec.

Comment by someguynameddylan — October 8, 2008

Actually, HTML5 defines a set of features that can be used in HTML or XHTML… the name HTML5 is really misleading.

Comment by Hans Schmucker — October 8, 2008

XHTML and the mashup model just don’t rhyme. When you’re pulling in content from all over the place, as most large sites do, there’s no way to guarantee there won’t be some error in there somewhere that causes your site to crash and burn other than doing tag soup parsing and correction on the server at the moment you serve each page, which is completely pointless, because then you might as well do it on the client.

XHTML’s absolute requirement for strictness is also the reason why it has no real world viability.

I guarantee that if by magic you could force all browsers to only interpret strict XHTML, content authors would not improve their ways, but server tools would merely adapt by doing the tag soup correction that the browser currently does.

Comment by Joeri — October 9, 2008

Yes, even if there isn’t true XML parsing going on unless you set the correct HTTP headers, I think it’s still a good idea to move toward the stricter standard of XML. And if everyone did that, we wouldn’t need the parentheses in (X)HTML5. The mashup problem could be avoided if there were a more flexible iframe-like thing, some widget element that could contain most element types but still be affected by the page CSS. But I guess the race is lost for XHTML for general public use. It’s a shame, and the web will continue to be a messy place, with attributes without quotes, single elements without short endings, single attributes without values, and so forth. Horrible. :(

Comment by Spocke — October 9, 2008
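
The remark about HTTP headers ties in with the article's Content-Negotiation section: a page is only handled by an XML parser when it is served as application/xhtml+xml. The sketch below shows one simple way a server might choose the Content-Type from the request's Accept header. The function names parse_accept and choose_content_type are hypothetical, and the q-value handling is deliberately simplified (no wildcard matching).

```python
def parse_accept(header):
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header):
    """Pick XHTML-as-XML only if the client lists it explicitly with q > 0
    and does not rank text/html above it; otherwise fall back to text/html."""
    prefs = parse_accept(accept_header)
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


# A client that lists application/xhtml+xml explicitly.
print(choose_content_type(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# A client that never mentions it.
print(choose_content_type("image/gif, image/jpeg, */*"))
```

Falling back to text/html whenever the XML type is not listed explicitly is a conservative design choice: a client that only sends */* gets the tag-soup path rather than a page it may not be able to render.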

@tack – I agree. Complete browser support or not, tag soup or not, XHTML in my toolchain has proven invaluable for browser consistency, elimination of browser weirdness due to differences in parsing invalid HTML, etc. Furthermore it has encouraged me to think about correctness and producing verifiably correct XHTML under all circumstances, and as a result I have been ahead of the curve with respect to things like XSS and CSRF.

I believe that a problem equal to the lack and inconsistency of browser support for XHTML is the inadequacy of the tools to produce XHTML in the first place. While there are plenty of Java XML binding tools, parsers, etc., I have found nothing that matches the code readability and convenience of producing (X)HTML from e.g. a .jsp. What is needed is a strongly typed, declarative, streamable XML binding language.

Comment by JonathanLeech — October 9, 2008

While I agree that it would be preferable to have xhtml support, I don’t see what the real problem is with using html instead. Everybody who is screaming about how xhtml is better can create perfectly valid html documents. The fact is: if you are capable of creating perfect xhtml documents, then you can also create perfect html documents. Who cares if the browser is going to parse it as tag soup?

I think developers are just plain lazy. Just because the browser will let you send it broken html doesn’t make it the browser’s fault that you’ve decided to send broken html “because it lets me”. So long as you send valid html, there is no real difference between using that and xhtml.

Basically, use html, but always output your documents in strict html format as though you WERE using xhtml. It’s actually quite easy if you actually care enough to do it. Please stop blaming the browsers for your lack of motivation to do things the Right Way.

Comment by nate — October 9, 2008

If there were more of a push to make HTML as strictly coded as XHTML, this wouldn’t be an issue. I use 4.01 strict for all my sites, yet if my code had trailing slashes where necessary, it would all validate as XHTML. The argument here isn’t over technological standards, it’s over poor coding practices. If browsers required more strictly typed HTML, everyone’s argument for XHTML would fall through the floor.

Comment by tj111 — October 10, 2008

It’s not just about writing cleaner code, it’s interoperability. It’s the ability to customize, extend, abstract, mash up, exchange and reference content between/across domains and organizations. Until HTML can support that in a commonly accepted manner, I don’t see XML going anywhere.

Also, you can’t consider JSON a replacement for XML, since it has neither a de jure nor a de facto standard for schema/semantic definition. Oftentimes the order of elements matters, and presently JSON has no practical way of representing structures such as these when mixed content is involved.

Comment by TNO — October 10, 2008
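
To make the mixed-content point concrete, here is a small sketch. It parses a made-up XML fragment in which text and child elements are interleaved, then encodes it as JSON using a nested-array layout. That layout is an ad hoc convention chosen for this example, not a standard, and to_nested_list is a hypothetical helper.

```python
import json
import xml.etree.ElementTree as ET


def to_nested_list(elem):
    """Encode an element as [tag, child, child, ...], keeping text and
    child elements in document order (attributes are ignored here)."""
    node = [elem.tag]
    if elem.text:
        node.append(elem.text)
    for child in elem:
        node.append(to_nested_list(child))
        if child.tail:
            node.append(child.tail)  # text that follows the child element
    return node


# Mixed content: text, an element, then more text, where order matters.
p = ET.fromstring("<p>Hello <b>bold</b> world</p>")
encoded = json.dumps(to_nested_list(p))
print(encoded)  # ["p", "Hello ", ["b", "bold"], " world"]
```

The order survives only because arrays are used throughout; a plain JSON object keyed by tag name could not interleave the text with the child element, and both sides of an exchange would have to agree on a convention like this one in advance.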

Forgot to mention, ES 3.1 may be the answer to many JSON issues, assuming the standard is updated.

Comment by TNO — October 10, 2008
