XHTML/Media Types

Media Types

XHTML as XHTML

The recommended media type for XHTML is application/xhtml+xml; ideally, a document served this way is processed as XHTML. As of 31 December 2006, only the Gecko and Presto layout engines (used by Firefox and Opera respectively) support this media type. To have your XHTML read as XHTML, save your markup with an .xht file extension.

Best Practices for serving XHTML as XHTML

In addition to conforming to all of the rules of XML, we strongly recommend that you adhere to the following practices when serving XHTML as XHTML:

  • An XML stylesheet PI (processing instruction) is recommended for associating external stylesheets:
Rather than this:
<link rel="stylesheet" type="text/css" href="/mystyle.css">
XHTML as XHTML should include this:
<?xml-stylesheet type="text/css" href="/mystyle.css"?>
<?xml-stylesheet type="text/xsl" href="/mystyle.xsl"?>
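One way to confirm that a document follows this practice is to look for xml-stylesheet processing instructions ahead of the root element. The following is a minimal sketch using Python's standard xml.dom.minidom module; the function name stylesheet_instructions and the SAMPLE document are hypothetical and exist only for illustration.

```python
from xml.dom import Node, minidom

# Hypothetical sample document, not taken from any real site.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="/mystyle.css"?>
<html xmlns="http://www.w3.org/1999/xhtml">
 <head><title>Example</title></head>
 <body><p>Styled through a processing instruction.</p></body>
</html>"""


def stylesheet_instructions(xhtml_source):
    """Return the data of every top-level xml-stylesheet PI in the document."""
    document = minidom.parseString(xhtml_source)
    return [
        node.data
        for node in document.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


if __name__ == "__main__":
    for data in stylesheet_instructions(SAMPLE):
        print(data)
```

An empty list suggests the document still relies on a link element, or has no external stylesheet at all.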

XHTML as XML

A secondary media type for XHTML is the generic XML media type, application/xml. This media type is supported by most layout engines, including Gecko, Presto, WebCore, KHTML, and even Trident, though only two, Gecko and Presto, treat the document as XHTML; the rest treat it as generic XML, which can result in several quirks. That said, this is the media type of choice for cross-browser compatibility. Serving XHTML as XML has the benefit of XML parsing: unlike text/html, under which XHTML is parsed as HTML, there are no conflicts between XML syntax and XHTML syntax, since they are one and the same. To have your XHTML read as XML, save your markup with an .xml file extension.

Best Practices for serving XHTML as XML

In addition to conforming to all of the rules of XML, we strongly recommend that you adhere to the following practices when serving XHTML as XML:

  • We encourage all authors to set a charset parameter through a higher-level protocol such as HTTP (this can be done with a scripting language like PHP). Whether or not you are able to go that route, you should also declare your character encoding in the XML declaration:
Rather than this:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
XHTML as XML should include this:
<?xml version="1.0" encoding="utf-8"?>
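To illustrate setting the charset at both levels, here is a minimal Python sketch that builds a Content-Type header and prepends a matching XML declaration to the markup. The function name encode_xhtml and its return shape are assumptions made for this example; a real server framework would have its own way of attaching headers.

```python
def encode_xhtml(markup, charset="utf-8"):
    """Return (headers, payload) with the same charset declared in both places.

    `markup` is the document text without an XML declaration.
    """
    declaration = f'<?xml version="1.0" encoding="{charset}"?>\n'
    payload = (declaration + markup).encode(charset)
    # The charset parameter on the header is the "higher-level protocol" route.
    headers = {"Content-Type": f"application/xml; charset={charset}"}
    return headers, payload


if __name__ == "__main__":
    headers, payload = encode_xhtml(
        '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
        "<body><p>caf\u00e9</p></body></html>"
    )
    print(headers["Content-Type"])
    print(payload.decode("utf-8").splitlines()[0])
```

Keeping both values in one function makes it harder for the header and the declaration to drift apart.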

XHTML as HTML

A third option is serving XHTML as HTML via text/html. This essentially defeats most purposes of authoring in XHTML, since HTML is neither parsed as XML nor extensible. Further, the only way to write valid XHTML as HTML is to adhere to the rules of both languages, even where those rules conflict with each other (such as those pertaining to empty elements). Finally, HTML rules apply for DOM scripting and stylesheets.

Before serving XHTML as HTML, we encourage you to examine the purposes of doing so. Some questions worth asking might include the following: Do I want to take advantage of the extensibility of XML? Do I want my markup to be parsed as XML? Am I going to take advantage of XSL stylesheets? If the answer to all of those questions is no, then HTML 4.01 may be the way to go. However, if you have your heart set on serving XHTML as HTML, XHTML 1.0 Transitional does technically allow for it. To have your XHTML read as HTML, save your markup with an .htm file extension and be sure your DTD is XHTML 1.0 Transitional.

Content Negotiation

For the purposes of XHTML, content negotiation is the practice of serving one media type to some layout engines and another media type to the rest, usually by means of a scripting language (JavaScript, PHP). The difference between content negotiation and simply serving XHTML as text/html is that content negotiation often (though not necessarily) serves HTML as HTML rather than XHTML as HTML. In other words, depending on the UA (user agent), a peek at the source code would reveal either an HTML DTD or an XHTML DTD, an XML declaration or a Content-Type meta tag, an XML stylesheet PI or a link element. This means that, with judicious scripting, one could author valid XHTML without regard to the rules of HTML, taking advantage of the extensibility and efficient processing of XML in layout engines that support it, while delivering valid HTML (without that extensibility and processing) to other layout engines from much of the same markup.
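A common way to implement this decision on the server is to inspect the HTTP Accept request header. The sketch below is written in Python rather than PHP and is only one possible policy: it picks application/xhtml+xml when the user agent lists that type explicitly with a quality value at least as high as the one that applies to text/html, and falls back to text/html otherwise. The function names parse_accept and negotiate_media_type are hypothetical.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header):
    """Parse an Accept header into a {media range: quality} dictionary."""
    preferences = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        quality = 1.0  # A missing q parameter means q=1.
        for parameter in parts[1:]:
            name, _, value = parameter.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # Treat a malformed q value as "not accepted".
        preferences[media_range] = quality
    return preferences


def negotiate_media_type(accept_header):
    """Choose between XHTML and HTML for the given Accept header value."""
    preferences = parse_accept(accept_header or "")
    # Only an explicit mention counts for XHTML; wildcards are not trusted.
    xhtml_quality = preferences.get(XHTML, 0.0)
    html_quality = preferences.get(
        HTML, preferences.get("text/*", preferences.get("*/*", 0.0))
    )
    if xhtml_quality > 0 and xhtml_quality >= html_quality:
        return XHTML
    return HTML


if __name__ == "__main__":
    print(negotiate_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(negotiate_media_type("*/*"))
```

Ignoring wildcards for the XHTML case is a deliberately conservative choice, since a user agent that sends only */* may not be able to process application/xhtml+xml at all. A response chosen this way would normally also carry a Vary: Accept header so that caches keep the two variants apart.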

Media Types and Layout Engines

Comparison of Media Types and Layout Engines

How each layout engine responds when a well-formed XHTML document is served with different media types:

| Media type | Trident | Tasman | Gecko | WebCore | KHTML | Presto | iCab |
| application/xhtml+xml | Prompt for download | Prompt for download | XHTML | XML | HTML | XHTML | (X)HTML |
| application/xml | XML | Crash | XHTML | XML | XML | XHTML | Text |
| text/html | HTML | HTML | HTML | HTML | HTML | HTML | HTML |

Presto media type notes

  1. application/xml: To receive the benefits of XML processing in Presto-based browsers (such as Opera, Opera Mini, or Opera Mobile), one must use an XSL stylesheet to transform the XHTML into XML [1].

WebCore media type notes

  1. application/xhtml+xml, application/xml, text/xml: HTML entities and custom entities defined by a custom DTD are not recognized.

KHTML media type notes

  1. application/xhtml+xml: KHTML supports this media type, but processes the document as HTML.
  2. application/xml, text/xml: HTML entities and custom entities defined by a custom DTD are not recognized.

iCab media type notes

  1. application/xhtml+xml: Type selectors in CSS are matched case-insensitively.

XML Parsing

Save two copies of the following document, one as badlyFormed.html and one as badlyFormed.xhtml.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>Not well-formed</title>
 </head>
 <body>
  <p>An XHTML-compliant browser should refuse to render any part of this page.</p>
  <p>This paragraph is <b><i>not</b> well-formed</i>.
 </body>
</html>

According to the XML specification, on which the XHTML specification is based, a compliant browser should refuse to render any part of the document and should report an error instead. Open badlyFormed.xhtml. You should get an error message similar to this one:

XML Parsing Error: mismatched tag. Expected: </i>.
Location: file:///D:/Practice/XHTML/BadlyFormed.xhtml
Line Number 8, Column 35:

  <p>This paragraph is <b><i>not</b> well-formed</i>.
----------------------------------^

If you don't get an error message, then either the MIME type is set incorrectly or your browser is non-compliant. Microsoft Internet Explorer, version 7.0 and lower, is non-compliant. Mozilla Firefox is compliant (from at least version 1.0 upwards, and possibly in earlier versions).

Using Firefox, you can check the MIME type for the page. From the Tools menu, select the Page Info option. Three lines down on the General tab is 'Type:'. The type for badlyFormed.xhtml should be application/xhtml+xml. If it isn't, the .xhtml extension is not mapped to application/xhtml+xml.
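How an extension gets mapped to a media type depends on the server or operating system in use. As a rough illustration only, the following Python sketch keeps the mapping in a dictionary and plugs it into the standard library's SimpleHTTPRequestHandler. The names EXTENSION_MEDIA_TYPES, media_type_for, and XhtmlAwareHandler are hypothetical, and the table covers only the extensions mentioned on this page.

```python
import os.path
from http.server import SimpleHTTPRequestHandler

# Extensions used on this page and the media type each should be served with.
EXTENSION_MEDIA_TYPES = {
    ".xht": "application/xhtml+xml",
    ".xhtml": "application/xhtml+xml",
    ".xml": "application/xml",
    ".htm": "text/html",
    ".html": "text/html",
}


def media_type_for(filename, default="application/octet-stream"):
    """Look up the media type for a file name by its (case-insensitive) extension."""
    _, extension = os.path.splitext(filename)
    return EXTENSION_MEDIA_TYPES.get(extension.lower(), default)


class XhtmlAwareHandler(SimpleHTTPRequestHandler):
    """Request handler whose extension table includes the XHTML mappings."""

    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        **EXTENSION_MEDIA_TYPES,
    }


if __name__ == "__main__":
    for name in ("badlyFormed.xhtml", "badlyFormed.html"):
        print(name, "->", media_type_for(name))
```

Passing XhtmlAwareHandler to http.server.HTTPServer would give a small local server that labels .xhtml files as application/xhtml+xml, which is enough for trying the test files on this page.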

Now try opening badlyFormed.html with Firefox. The page will probably be displayed, contrary to the XHTML specification. Check the MIME type. It should be text/html. This means that Firefox parses the document as HTML, not as XHTML. Web browsers normally display HTML regardless of any errors on the page, so the page is rendered.
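The same contrast between strict XML parsing and forgiving HTML parsing can be reproduced outside a browser. The sketch below feeds the badly formed document to Python's expat-based XML parser and to its lenient html.parser module. The helper names check_as_xml, check_as_html, and TagCollector are hypothetical, and the exact wording and column number of the XML error may differ from what Firefox reports.

```python
from html.parser import HTMLParser
from xml.parsers import expat

BADLY_FORMED = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>Not well-formed</title>
 </head>
 <body>
  <p>An XHTML-compliant browser should refuse to render any part of this page.</p>
  <p>This paragraph is <b><i>not</b> well-formed</i>.
 </body>
</html>
"""


def check_as_xml(source):
    """Return None if the source is well-formed XML, else the parser's error text."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(source, True)
    except expat.ExpatError as error:
        return str(error)
    return None


class TagCollector(HTMLParser):
    """Record every start tag seen; the HTML parser does not reject bad nesting."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def check_as_html(source):
    """Parse the source leniently as HTML and return the start tags encountered."""
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    print("As XML: ", check_as_xml(BADLY_FORMED))
    print("As HTML:", check_as_html(BADLY_FORMED))
```

Parsed as XML, the document stops at the mismatched tag on line 8; parsed as HTML, the whole document is read and both the b and i elements are seen.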