HTML5 Tutorial – Changes in HTML4 to HTML5

HTML5 defines “the HTML syntax”, which is mostly compatible with HTML4 and XHTML1 documents published on the Web.

HTML5 brings you 30 new elements to mark up documents and applications, and makes some older tags obsolete.

There was a concerted effort to separate what an element means from how it is presented. Some elements were removed, some had their meaning redefined, and some are new.

This post shows you what tags you can use and best practices in their use.

Doctype

The first time you look at an HTML5 document, you will notice a new doctype. The HTML5 specification removed the older, lengthier (and, to me anyway, confusing) doctypes.

The doctype declaration for the HTML syntax is <!DOCTYPE html> and is case-insensitive.

In the XML syntax, any doctype declaration may be used, or it may be omitted altogether. Documents with an XML media type are always handled in standards mode.

When using the HTML doctype, you do not need to close certain elements. For example, void elements such as br and img take no end tag, and the end tags of p and li can be omitted.

Documents using the HTML syntax are served with the text/html media type.
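
As a rough illustration, here is a minimal sketch of a document in the HTML syntax. The title and text are placeholders. The p and li end tags are left out on purpose to show that the parser does not need them.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Minimal HTML5 page</title>
</head>
<body>
  <!-- Void elements such as br never take an end tag -->
  <p>First line<br>second line
  <!-- The end tags of p and li may be omitted -->
  <ul>
    <li>One
    <li>Two
  </ul>
</body>
</html>
```

Many authors still write the optional end tags for readability. The point is only that the HTML syntax does not require them.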

XML Is Still OK

The other syntax that can be used for HTML is XML. This syntax is compatible with XHTML1 documents and implementations. Documents using this syntax need to be served with an XML media type (such as application/xhtml+xml or application/xml), and elements need to be put in the http://www.w3.org/1999/xhtml namespace.
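
A small sketch of a document in the XML syntax might look like this. The content is a placeholder, and the example assumes the file is served as application/xhtml+xml, so every element is explicitly closed and the namespace is declared on the root element.

```html
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example document</title>
  </head>
  <body>
    <!-- In the XML syntax, empty elements use the self-closing form -->
    <p>First line<br/>second line</p>
  </body>
</html>
```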

Elements That Are Gone

Many elements are no longer part of HTML5.

Purely presentational elements such as center, font and big are now obsolete. You can use Cascading Style Sheets to provide that presentation. Gone are basefont, big, center, font, strike, and tt.

Presentational attributes have been removed from current elements; for example, align on img and table, background on body, and bgcolor on table.

The frame element is absent in HTML5. Frames caused usability and accessibility issues. So gone are frame, frameset, and noframes.

Some are gone because they were not used, created confusion, or their function can be handled by other elements:

  • acronym is not included because it has created a lot of confusion. Web developers are to use abbr for abbreviations.
  • applet has been obsoleted in favor of object.
  • isindex usage can be replaced by usage of form controls.
  • dir has been obsoleted in favor of ul.
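
To make the replacements concrete, here is a short sketch of how markup that used obsolete elements could be rewritten. The class name notice and the sample text are made up for illustration.

```html
<style>
  /* CSS takes over the job that center and font used to do */
  .notice { text-align: center; color: red; }
</style>

<!-- Instead of: <center><font color="red">Sale ends Friday</font></center> -->
<p class="notice">Sale ends Friday</p>

<!-- Instead of: <acronym title="HyperText Markup Language">HTML</acronym> -->
<p>Written in <abbr title="HyperText Markup Language">HTML</abbr>.</p>

<!-- Instead of: <dir><li>index.html</li></dir> -->
<ul>
  <li>index.html</li>
  <li>about.html</li>
</ul>
```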

New Elements

This section lists the elements that are added to HTML5. These are supported in all modern browsers; check the semantic elements entry on CanIUse.com.

In HTML5 Tutorial – Getting Started With Semantic Tags, you learned about several of the new elements.  In particular:

  • section represents a generic document or application section. It can be used together with the h1, h2, h3, h4, h5, and h6 elements to indicate the document structure.
  • article represents an independent piece of content of a document, such as a blog entry or newspaper article.
  • main can be used as a container for the dominant contents of another element, such as the main content of the page. In W3C HTML5 and W3C HTML 5.1, only one such element is allowed in a document.  (Not supported in IE)
  • aside represents a piece of content that is only slightly related to the rest of the page.
  • hgroup represents the header of a section, and allows you to group together a set of headers.

  • header represents a group of introductory or navigational aids.
  • footer represents a footer for a section and can contain information about the author, copyright information, etc.
  • nav represents a section of the document intended for navigation.
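
One way these structural elements might fit together on a simple blog page is sketched below. The headings, links, and text are placeholders, and other arrangements are equally valid.

```html
<body>
  <header>
    <h1>My Blog</h1>
    <nav>
      <a href="/">Home</a>
      <a href="/archive">Archive</a>
    </nav>
  </header>

  <!-- Only one main element per document -->
  <main>
    <article>
      <h2>Post title</h2>
      <section>
        <h3>Background</h3>
        <p>First part of the post.</p>
      </section>
    </article>
    <aside>
      <p>Loosely related links go here.</p>
    </aside>
  </main>

  <footer>
    <p>Author and copyright information.</p>
  </footer>
</body>
```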

The following elements have been introduced for better structure:

  • figure represents a piece of self-contained flow content, typically referenced as a single unit from the main flow of the document. figcaption can be used as caption (it is optional).
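
A small sketch of figure with an optional figcaption follows. The image file name and caption text are hypothetical.

```html
<figure>
  <img src="traffic-chart.png" alt="Bar chart of monthly site visits">
  <!-- figcaption is optional and can come first or last inside figure -->
  <figcaption>Figure 1. Monthly site visits.</figcaption>
</figure>
```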

Then there are several other new elements:

  • video and audio for multimedia content. You can learn more about these elements in later posts.
  • track provides text tracks for the video element.
  • embed is used for plugin content.
  • mark represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.
  • progress represents a completion of a task, such as downloading or when performing a series of expensive operations.
  • meter represents a measurement, such as disk usage.
  • time represents a date and/or time.
  • data (in WHATWG HTML and W3C HTML 5.1) allows content to be annotated with a machine-readable value.
  • dialog for showing a dialog.
  • ruby, rt, and rp allow for marking up ruby annotations.
  • bdi represents a span of text that is to be isolated from its surroundings for the purposes of bidirectional text formatting.
  • wbr represents a line break opportunity.
  • canvas is used for rendering dynamic bitmap graphics on the fly, such as graphs or games.
  • menuitem represents a command the user can invoke from a popup menu.
  • details represents additional information or controls which the user can obtain on demand. The summary element provides its summary, legend, or caption.
  • datalist together with the new list attribute for input can be used to make combo boxes (not yet supported in IE or Safari):
    <input list="browsers">
    <datalist id="browsers">
     <option value="Safari">
     <option value="Internet Explorer">
     <option value="Opera">
     <option value="Firefox">
    </datalist>
  • keygen represents a control for key pair generation.
  • output represents some type of output, such as from a calculation done through scripting.
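
A few of the elements in this list can be sketched in use as follows. The values, dates, and field names (a, b, result) are invented for the example, and browser support for individual elements varies.

```html
<p>Download: <progress value="30" max="100">30%</progress></p>
<p>Disk usage: <meter value="0.6" min="0" max="1">60%</meter></p>
<p>Published <time datetime="2015-06-01">June 1, 2015</time>.</p>
<p>Search results for <mark>doctype</mark>.</p>

<details>
  <summary>Shipping details</summary>
  <p>Orders ship within two business days.</p>
</details>

<!-- output shows the result of a calculation done through scripting -->
<form oninput="result.value = Number(a.value) + Number(b.value)">
  <input type="number" id="a" name="a" value="2"> +
  <input type="number" id="b" name="b" value="3"> =
  <output name="result" for="a b">5</output>
</form>
```

The text inside progress and meter is fallback content for browsers that do not render the elements.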

New Input Types

The input element’s type attribute now has new values, including tel, search, url, email, date, month, week, time, datetime-local, number, range, and color.

The idea of these new types is that the user agent can provide the user interface, such as a calendar date picker or integration with the user’s address book, and submit a defined format to the server. This gives users a better experience because their input is checked before it is sent to the server, so there is less time to wait for feedback.
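
Here is a sketch of a form that uses several of the new input types. The action URL and field names are hypothetical. A browser that does not recognize a type falls back to a plain text field.

```html
<form action="/signup" method="post">
  <label>Email <input type="email" name="email" required></label>
  <label>Website <input type="url" name="site"></label>
  <label>Phone <input type="tel" name="phone"></label>
  <label>Birthday <input type="date" name="birthday"></label>
  <label>Tickets <input type="number" name="tickets" min="1" max="10" value="1"></label>
  <label>Volume <input type="range" name="volume" min="0" max="100"></label>
  <label>Favorite color <input type="color" name="color"></label>
  <button type="submit">Sign up</button>
</form>
```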

SVG and MathML

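In later posts we’ll talk a lot about SVG and MathML. But in this post, we should explain that MathML and SVG elements can be used directly inside an HTML document. A math or svg start tag causes the HTML parser to switch to a special mode that puts the elements and attributes inside it in the appropriate namespace.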
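
As a quick sketch, SVG and MathML can be dropped straight into the HTML syntax without any namespace declarations. The drawing and formula below are arbitrary examples, and MathML rendering depends on the browser.

```html
<!DOCTYPE html>
<title>Inline SVG and MathML</title>

<p>A red circle:
  <svg width="100" height="100" viewBox="0 0 100 100">
    <circle cx="50" cy="50" r="40" fill="red"/>
  </svg>
</p>

<p>The Pythagorean theorem:
  <math>
    <msup><mi>a</mi><mn>2</mn></msup>
    <mo>+</mo>
    <msup><mi>b</mi><mn>2</mn></msup>
    <mo>=</mo>
    <msup><mi>c</mi><mn>2</mn></msup>
  </math>
</p>
```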

Little Changes

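You can finally use an ampersand without escaping it in most cases. In HTML4 you had to write &amp;. In HTML5 you can write a bare &, as long as it is not followed by text that looks like a character reference.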

Attributes have to be separated by at least one whitespace character.

You are no longer required to put quote marks around as many attribute values.
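
These small syntax points can be sketched together as follows. The file names and class name are placeholders.

```html
<!-- A bare ampersand followed by a space is fine -->
<p>Fish & chips</p>

<!-- Simple values with no spaces or quote characters can go unquoted;
     the attributes are still separated by whitespace -->
<a href=menu.html class=nav-link>Menu</a>

<!-- A value that contains a space still needs quotes -->
<img src=logo.png alt="Company logo">
```

Quoting every value remains a safe habit, since it avoids having to remember which characters are allowed in an unquoted value.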

References

There are many more changes.

Differences from HTML4