HTML5 Defined! It's Not Just a Marketing Term

There's been a fair bit of discussion recently about what some folks mean when they say HTML5. I use HTML5 in the wider sense, so it's only right that I take a stab at defining what I mean when I say something is part of HTML5.

From a very high level, when I say HTML5 I mean:

"Everything that is in the formal W3C HTML5 spec; everything that used to be in there but was broken out for various reasons; sibling and related technologies and developments like CSS3, SVG, EcmaScript 5, etc.; and experimental explorations that are pushing the boundaries."

I won't go into the reasons why I use HTML5 in these more expansive terms, as I blogged about that recently.

Going deeper, I've broken these down into separate areas:

  • "HTML5 Strict" - Things that are strictly inside the W3C's HTML5 spec.
  • "Referenced by HTML5" - Things that are referenced by the HTML5 spec and which can optionally be parsed into the DOM and displayed.
  • "Broken out of HTML5" - Things that used to be part of HTML5 or its older iterations, called Web Applications and Web Forms.
  • "HTML5 Family of Technologies" - Extended set of technologies that are neither strictly part of the HTML5 spec nor referenced by it, but are likely to be used in conjunction with HTML5.
  • "HTML5++" - More experimental technologies pushing the web forward that are not part of the HTML5 spec at all; may or may not see broader adoption.

One small note: there are actually two HTML5 specs, one maintained by the W3C and the other maintained by the WhatWG.

You need to understand that HTML5 began as a revolution against the established order, initiated by the WhatWG. A peace of sorts developed over the years, with the upstart "Web Applications" and "Web Forms" specs brought in-house to the W3C under the moniker HTML5. I'm assuming that, over time, the W3C spec will become the canonical one once it reaches Last Call.

To simplify things below, I'm only referencing the W3C HTML5 spec for now. Here's how I would break things down based on the categories above. If you think something belongs somewhere else, or if things get moved around, email me and I'll update this (Last Updated: June 14th, 2010). If you want to know the state of where these technologies are implemented see; if you want your code to detect what is available see Mark Pilgrim's book for details.
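As a rough illustration of what "detecting what is available" can look like, here is a minimal sketch. The function name detectFeatures and the env parameter are my own inventions for this example (env stands in for the browser's window object), and the handful of checks shown is nowhere near a complete survey of the technologies listed below.

```javascript
// Feature-detection sketch. `env` stands in for the browser's global object
// (window); passing it in keeps the function usable outside a browser too.
function detectFeatures(env) {
  var doc = env.document;
  var canCreate = !!(doc && typeof doc.createElement === "function");

  // True when a freshly created element exposes the given property or method.
  function elementHas(tagName, member) {
    if (!canCreate) return false;
    var el = doc.createElement(tagName);
    return !!el && member in el;
  }

  // Reading localStorage can throw when storage is disabled, so guard it.
  function hasLocalStorage() {
    try {
      return !!env.localStorage;
    } catch (e) {
      return false;
    }
  }

  return {
    canvas: elementHas("canvas", "getContext"),
    video: elementHas("video", "canPlayType"),
    audio: elementHas("audio", "canPlayType"),
    placeholder: elementHas("input", "placeholder"),
    localStorage: hasLocalStorage(),
    webWorkers: !!env.Worker,
    geolocation: !!(env.navigator && "geolocation" in env.navigator)
  };
}

// In a browser this inspects the real window; elsewhere every check is false.
var features = detectFeatures(typeof window !== "undefined" ? window : {});
```

The idea is to test for the capability itself rather than sniffing the browser name, so the same check keeps working as browsers add support.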

"HTML5 Strict": Strictly Inside the W3C's HTML5 Spec
  • HTML5 Doctype: <!doctype html>
  • HTML5 parsing
  • XHTML5 serialization
  • Cleaning up edge cases of existing web content for greater compatibility
  • New semantic, behavior, and application tags: section, nav, article, aside, hgroup, header, footer, address, figure, figcaption, time, code, var, samp, kbd, output, progress, meter, details, summary, command, menu, keygen
  • Being able to nest H1, H2, etc. arbitrarily
  • Sandbox attribute on iframes
  • Video tag, API and events
  • Audio tag, API, and events
  • New form input types: tel, search, url, email, date, time, month, week, number, range, color
  • New form abilities: multiple file upload; placeholder text; directing focus on initial page load; constraint validation by input type and properties
  • New link rel types: alternate, archives, author, bookmark, external, help, icon, license, nofollow, noreferrer, pingback, prefetch, search, sidebar, tag, index, up, first, last, next, prev
  • data-* attributes on elements to be used by JavaScript
  • Offline Web applications
  • contenteditable for editing
  • Drag and Drop
  • UndoManager for consistent undos
  • Parsing empty and unknown tags into the DOM: <foobar />
  • async attribute on SCRIPT tags
  • PUT and DELETE methods for form submission
  • Deprecated elements (the spec's own term is "obsolete"): acronym, applet, basefont, big, center, dir, font, frame, frameset, isindex, noframes, s, strike, tt, u
  • getElementsByClassName
  • innerHTML, outerHTML, insertAdjacentHTML
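To make the data-* item above a little more concrete, here is a sketch of how a data-* attribute name maps to the camelCase key a script sees through an element's dataset property. The helper names dataAttributeToKey and readDataAttributes are hypothetical, and the sketch assumes attribute names are already lowercase (as they are after HTML parsing); it works on a plain object of attributes rather than a real DOM element.

```javascript
// Convert a data-* attribute name to its dataset-style key:
// drop the "data-" prefix, then turn each "-x" into "X".
function dataAttributeToKey(attrName) {
  var prefix = "data-";
  if (attrName.indexOf(prefix) !== 0) return null; // not a data-* attribute
  return attrName.slice(prefix.length).replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

// Collect only the data-* entries from a plain { name: value } attribute map.
function readDataAttributes(attributes) {
  var result = {};
  Object.keys(attributes).forEach(function (name) {
    var key = dataAttributeToKey(name);
    if (key !== null) result[key] = attributes[name];
  });
  return result;
}

// Stand-in for <div class="card" data-user-id="42" data-role="admin">
var data = readDataAttributes({
  "class": "card",
  "data-user-id": "42",
  "data-role": "admin"
});
// data.userId is "42" and data.role is "admin"; "class" is ignored.
```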

"Referenced by HTML5": Referenced from W3C HTML5 spec, including how to parse into an HTML5 DOM; HTML5 parsing engines can optionally include these in DOM and display them
  • MathML
  • SVG

"Broken Out of HTML5": Used to be inside of HTML5, Web Applications, or Web Forms specifications
  • Web Sockets
  • Local Persistent Storage (localStorage and sessionStorage)
  • SQL Storage (in contention versus IndexedDB)
  • DataGrid
  • Specific HTML5 Video codec: H.264, Ogg/Theora, WebM (contention between video codecs)
  • Specific HTML5 Audio codec
  • Device element
  • Ping attribute
  • Timed track model for media elements
  • Canvas
  • Microdata and Microdata Vocabularies (some level of contention versus RDFa and Microformats)
  • Cross-document messaging
  • Channel messaging
  • W3C XMLHttpRequest specification
  • Server-Sent Events
  • Ajax Session History
  • MIME type and Protocol handler registration
  • P2P connections
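For a feel of the Local Persistent Storage item in this list, here is a small sketch of storing structured data through the getItem/setItem interface that localStorage and sessionStorage share. Storage values are strings, so the sketch serializes with JSON. The helpers saveJSON, loadJSON, and createMemoryStorage are names I made up; the in-memory object is only a stand-in so the example runs outside a browser.

```javascript
// Tiny in-memory stand-in exposing the same getItem/setItem/removeItem
// methods as localStorage and sessionStorage (illustration only).
function createMemoryStorage() {
  var items = {};
  return {
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
    },
    setItem: function (key, value) {
      items[key] = String(value); // real storage also coerces values to strings
    },
    removeItem: function (key) {
      delete items[key];
    }
  };
}

// Serialize a value to JSON before storing it.
function saveJSON(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Read a value back, returning `fallback` when the key is missing or corrupt.
function loadJSON(storage, key, fallback) {
  var raw = storage.getItem(key);
  if (raw === null) return fallback;
  try {
    return JSON.parse(raw);
  } catch (e) {
    return fallback;
  }
}

var store = createMemoryStorage(); // in a browser, pass window.localStorage instead
saveJSON(store, "prefs", { theme: "dark", fontSize: 14 });
var prefs = loadJSON(store, "prefs", {});
```

Passing the storage object in as a parameter means the same two helpers work with localStorage (persists across sessions) or sessionStorage (cleared when the browsing session ends).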

"HTML5 Family of Technologies": Extended set of technologies not strictly part of HTML5 spec or referenced but likely to be used in conjunction with HTML5
  • CSS3
    • Flex Box Layout
    • Multi-Column Layout
    • Animations
    • Transforms (2D and 3D)
    • Transitions
    • Masking and Effects (rounded corners, shadows, etc.)
    • Gradients
    • CSS3 Selectors
    • Media Queries
  • Web Fonts - CSS 2.1 @font-face + OpenType/WOFF (slight contention for OpenType vs. WOFF)
  • W3C Geolocation
  • Metadata - RDFa, Microformats (Some level of contention vs. Microdata)
  • Web workers
  • ARIA
  • EcmaScript 5
  • Faster JavaScript
  • CSS styling of new HTML5 input types (color, range, etc.)
  • IndexedDB (in contention versus SQL Storage)
  • querySelector/querySelectorAll
  • GPU acceleration of HTML, Canvas, SVG, and CSS3 Animations/Transitions/Transforms
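Since EcmaScript 5 appears in this list, here is a short sketch of a few of its additions in use: strict mode, the array methods forEach and map, Object.keys, and Object.freeze. The function groupByCategory and the sample entries are invented for the example; the entries simply reuse a few items from the lists in this post.

```javascript
// Group { name, category } entries into a frozen { category: [names] } map.
function groupByCategory(entries) {
  "use strict"; // ES5 strict mode, scoped to this function
  var groups = {};
  entries.forEach(function (entry) {
    if (!Object.prototype.hasOwnProperty.call(groups, entry.category)) {
      groups[entry.category] = [];
    }
    groups[entry.category].push(entry.name);
  });
  // Freeze each list and the map itself so callers cannot modify them.
  Object.keys(groups).forEach(function (category) {
    Object.freeze(groups[category]);
  });
  return Object.freeze(groups);
}

var entries = [
  { name: "Canvas", category: "Broken out of HTML5" },
  { name: "Web Sockets", category: "Broken out of HTML5" },
  { name: "CSS3", category: "HTML5 Family of Technologies" },
  { name: "WebGL", category: "HTML5++" }
];

var grouped = groupByCategory(entries);

// One "category: item, item" line per group, in insertion order.
var summary = Object.keys(grouped).map(function (category) {
  return category + ": " + grouped[category].join(", ");
});
```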

"HTML5++": More experimental technologies pushing the web forward; may or may not see broader adoption
  • WebGL
  • O3D
  • Firefox Audio APIs
  • XBL 2.0


pkeane said…
Very nicely done & appreciated. Clarity is good for everyone & I honestly think this is exactly what has been needed (and I sure would love to see this stay updated somewhere, maybe on a Google blog?). Anyway, I'll certainly be bookmarking it.
Sventovit said…
Oh oh... what's wrong with the links?
Each one seems to drive to a blogger account login.
Anyway this is a great article.

I'm gonna make an echo of it on my html5 blog :
Anonymous said…
I've been struggling with this in conversations lately too. This is a great reference to point to, so I can say "When I say HTML5, I'm referring to the HTML5 family of technologies."
A few nits.

getElementsByClassName is still part of the official HTML5 spec at W3C:

data-* attributes are supposed to be used by scripts, but they are not normally put on the script tags. Indeed, they can be put on any (start) tag.
Anonymous said…
Note that getElementsByClassName actually is part of "HTML5 strict".
flowney said…
There is also the ePub standard. A distant relative to be sure but there are those in the International Digital Publishing Forum (IDPF) who see it moving toward HTML 5 as a way to include richer media and greater interactivity.

What do you think of that?