HyperText Markup Language/Text Formatting

The Text Formatting elements give logical structure to phrases in your HTML document. This structure is normally presented to the user by changing the appearance of the text.

We have seen in the Introduction to this book how we can emphasize text by using <em></em> tags. Graphical browsers normally present emphasized text in italics. Some screen readers, utilities that read the page aloud to the user, may speak emphasized words with a different inflection.

A common mistake is to tag an element to get a certain appearance instead of tagging its meaning. This issue becomes clearer when testing in multiple browsers, especially with graphical and text-only browsers as well as screen readers.
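As an illustration of the difference, compare the two fragments below. The sentence is invented for this example. Both usually look the same in a graphical browser, but only the second one tells a screen reader or a text-only browser that the word is emphasized.

```html
<!-- Tagged for appearance: italics are requested, but the reason is not stated. -->
<p>Do <i>not</i> unplug the device while it is updating.</p>

<!-- Tagged for meaning: the word is emphasized, however the browser presents it. -->
<p>Do <em>not</em> unplug the device while it is updating.</p>
```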

You can change the default presentation of any element using Cascading Style Sheets (CSS). For example, if you wanted all emphasized text to appear in red, upright (non-italic) type, you would use the following CSS rule:

em { font-style:normal; color:red; }
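One way to try this rule, sketched here with a placeholder title and sentence, is to put it in a style element in the head of a page. The markup in the body still uses em; only its presentation changes.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Emphasis in red</title>
  <style>
    /* Override the default italic presentation of emphasized text. */
    em { font-style: normal; color: red; }
  </style>
</head>
<body>
  <p>Save your work <em>before</em> closing the window.</p>
</body>
</html>
```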

In this section, we will explore a few basic ways in which you can mark up the logical structure of your document.

Emphasis

HTML has elements for two degrees of emphasis:

  • The em element for emphasized text, usually rendered in italics.
  • The strong element for strongly emphasized text, usually rendered in bold.

An example of emphasized text:

It is essential not only to guess but actually <em>observe</em> the results.

An example rendering (a graphical browser shows the word "observe" in italics):

It is essential not only to guess but actually observe the results.

An example of strongly emphasized text:

Let us now focus on <strong>structural markup</strong>.

An example rendering (a graphical browser shows the words "structural markup" in bold):

Let us now focus on structural markup.

Preformatted text

Preformatted text is rendered in a fixed-width font, and multiple spaces are not condensed into one, so spacing is preserved. Newlines are rendered as newlines, unlike in ordinary text. HTML markup inside preformatted text is still interpreted by browsers, though, meaning that "<em>a</em>" will still be rendered as an emphasized "a" rather than shown as literal tags.

To create preformatted text, start it with <pre> and end it with </pre>.

An example:

<pre>
,-----------------------,
| No. | Person          |
|-----------------------|
| 1.  | Bill Newton     |
| 2.  | Magaret Clapton |
'-----------------------'
</pre>

The resulting rendering:

,-----------------------,
| No. | Person          |
|-----------------------|
| 1.  | Bill Newton     |
| 2.  | Magaret Clapton |
'-----------------------'

Omitting the preformatting tags will cause the same text to appear all on one line:

,-----------------------, | No. | Person | |-----------------------| | 1. | Bill Newton | | 2. | Magaret Clapton | '-----------------------'
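Because markup inside pre is still interpreted, a literal less-than sign or ampersand has to be written as a character reference. The following sketch, with made-up content, combines an interpreted em element with an escaped less-than sign:

```html
<pre>
if (a &lt; b) {
    <em>swap</em>(a, b);   /* "swap" is emphasized; its tags are not shown */
}
</pre>
```

The indentation and line breaks are kept as typed, while &lt; is displayed as a single "<" character.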

Special Characters

To insert characters that are not available on the keyboard or that hold special meaning in HTML, use a character reference. For example, to input the ampersand, "&", "&amp;" must be typed. Characters can also be inserted by their Unicode number, written in decimal (for example "&#38;") or in hexadecimal (for example "&#x26;").

Name Code Number Code Glyph Description
&acute; &#180; ´ acute accent
&amp; &#38; & ampersand
&bdquo; &#8222; „ double low-9 quote
&brvbar; or &brkbar; &#166; ¦ broken vertical bar
&cedil; &#184; ¸ cedilla
&cent; &#162; ¢ cent sign
&clubs; &#9827; ♣ black club suit
&copy; &#169; © copyright
&curren; &#164; ¤ general currency sign
&dagger; &#8224; † dagger
&Dagger; &#8225; ‡ double dagger
&darr; &#8595; ↓ downward arrow
&deg; &#176; ° degree sign
&diams; &#9830; ♦ black diamond suit
&divide; &#247; ÷ division sign
&frac12; &#189; ½ one-half
&frac14; &#188; ¼ one-fourth
&frac34; &#190; ¾ three-fourths
&frasl; &#8260; ⁄ fraction slash
&gt; &#62; > greater-than sign
&hearts; &#9829; ♥ black heart suit
&iexcl; &#161; ¡ inverted exclamation
&iquest; &#191; ¿ inverted question mark
&laquo; &#171; « left angle quote
&larr; &#8592; ← leftward arrow
&ldquo; &#8220; “ left double quote
&lsaquo; &#8249; ‹ single left-pointing angle quote
&lsquo; &#8216; ‘ left single quote
&lt; &#60; < less-than sign
&macr; or &hibar; &#175; ¯ macron accent
&mdash; &#8212; — em dash
&micro; &#181; µ micro sign
&middot; &#183; · middle dot
&nbsp; &#160;   nonbreaking space (invisible)
&ndash; &#8211; – en dash
&not; &#172; ¬ not sign
&oline; &#8254; ‾ overline, = spacing overscore
&ordf; &#170; ª feminine ordinal
&ordm; &#186; º masculine ordinal
&para; &#182; ¶ paragraph sign
&permil; &#8240; ‰ per mill sign
&plusmn; &#177; ± plus or minus
&pound; &#163; £ pound sterling
&quot; &#34; " double quotation mark
&raquo; &#187; » right angle quote
&rarr; &#8594; → rightward arrow
&rdquo; &#8221; ” right double quote
&reg; &#174; ® registered trademark
&rsaquo; &#8250; › single right-pointing angle quote
&rsquo; &#8217; ’ right single quote
&sbquo; &#8218; ‚ single low-9 quote
&sect; &#167; § section sign
&shy; &#173; ­ soft hyphen
&spades; &#9824; ♠ black spade suit
&sup1; &#185; ¹ superscript one
&sup2; &#178; ² superscript two
&sup3; &#179; ³ superscript three
&times; &#215; × multiplication sign
&trade; &#8482; ™ trademark sign
&uarr; &#8593; ↑ upward arrow
&uml; or &die; &#168; ¨ umlaut
&yen; &#165; ¥ yen sign

Name Code Number Code Glyph Description
&Agrave; &#192; À uppercase A, grave accent
&Aacute; &#193; Á uppercase A, acute accent
&Acirc; &#194; Â uppercase A, circumflex accent
&Atilde; &#195; Ã uppercase A, tilde
&Auml; &#196; Ä uppercase A, umlaut
&Aring; &#197; Å uppercase A, ring
&AElig; &#198; Æ uppercase AE
&Ccedil; &#199; Ç uppercase C, cedilla
&Egrave; &#200; È uppercase E, grave accent
&Eacute; &#201; É uppercase E, acute accent
&Ecirc; &#202; Ê uppercase E, circumflex accent
&Euml; &#203; Ë uppercase E, umlaut
&Igrave; &#204; Ì uppercase I, grave accent
&Iacute; &#205; Í uppercase I, acute accent
&Icirc; &#206; Î uppercase I, circumflex accent
&Iuml; &#207; Ï uppercase I, umlaut
&ETH; &#208; Ð uppercase Eth, Icelandic
&Ntilde; &#209; Ñ uppercase N, tilde
&Ograve; &#210; Ò uppercase O, grave accent
&Oacute; &#211; Ó uppercase O, acute accent
&Ocirc; &#212; Ô uppercase O, circumflex accent
&Otilde; &#213; Õ uppercase O, tilde
&Ouml; &#214; Ö uppercase O, umlaut
&Oslash; &#216; Ø uppercase O, slash
&Ugrave; &#217; Ù uppercase U, grave accent
&Uacute; &#218; Ú uppercase U, acute accent
&Ucirc; &#219; Û uppercase U, circumflex accent
&Uuml; &#220; Ü uppercase U, umlaut
&Yacute; &#221; Ý uppercase Y, acute accent
&THORN; &#222; Þ uppercase THORN, Icelandic
&szlig; &#223; ß lowercase sharp s, German
&agrave; &#224; à lowercase a, grave accent
&aacute; &#225; á lowercase a, acute accent
&acirc; &#226; â lowercase a, circumflex accent
&atilde; &#227; ã lowercase a, tilde
&auml; &#228; ä lowercase a, umlaut
&aring; &#229; å lowercase a, ring
&aelig; &#230; æ lowercase ae
&ccedil; &#231; ç lowercase c, cedilla
&egrave; &#232; è lowercase e, grave accent
&eacute; &#233; é lowercase e, acute accent
&ecirc; &#234; ê lowercase e, circumflex accent
&euml; &#235; ë lowercase e, umlaut
&igrave; &#236; ì lowercase i, grave accent
&iacute; &#237; í lowercase i, acute accent
&icirc; &#238; î lowercase i, circumflex accent
&iuml; &#239; ï lowercase i, umlaut
&eth; &#240; ð lowercase eth, Icelandic
&ntilde; &#241; ñ lowercase n, tilde
&ograve; &#242; ò lowercase o, grave accent
&oacute; &#243; ó lowercase o, acute accent
&ocirc; &#244; ô lowercase o, circumflex accent
&otilde; &#245; õ lowercase o, tilde
&ouml; &#246; ö lowercase o, umlaut
&oslash; &#248; ø lowercase o, slash
&ugrave; &#249; ù lowercase u, grave accent
&uacute; &#250; ú lowercase u, acute accent
&ucirc; &#251; û lowercase u, circumflex accent
&uuml; &#252; ü lowercase u, umlaut
&yacute; &#253; ý lowercase y, acute accent
&thorn; &#254; þ lowercase thorn, Icelandic
&yuml; &#255; ÿ lowercase y, umlaut
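The references in the tables above can be mixed freely with ordinary text. As a small sketch with invented sentences, the first three paragraphs below should render identically, using named, decimal, and hexadecimal references in turn:

```html
<p>Fish &amp; chips: &pound;5</p>       <!-- named references -->
<p>Fish &#38; chips: &#163;5</p>        <!-- decimal numeric references -->
<p>Fish &#x26; chips: &#xA3;5</p>       <!-- hexadecimal numeric references -->
<p>Use &lt;em&gt; for emphasis.</p>     <!-- shows the tag name literally -->
```

Named references are usually easier to read in the source, while numeric references work for any Unicode character, including those without a name.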

Abbreviations

Another useful element is abbr. It can be used to provide the expanded form of an abbreviation, for example:

 <abbr title="HyperText Markup Language">HTML</abbr>

This is displayed as: HTML

When you hover over HTML, a tooltip shows HyperText Markup Language.

Graphical browsers often show abbreviations with a dotted underline. The title appears as a tooltip. Screen readers may read the title at the user's request.

Note: very old browsers (Internet Explorer 6 and earlier) do not support abbr. Because they do support the related element acronym, that element was commonly used for all abbreviations. HTML5 treats acronym as obsolete, so new documents should use abbr.

An acronym is a special abbreviation in which letters from several words are pronounced to form a new word (e.g. radar - Radio Detection And Ranging). The letters in HTML are pronounced separately, technically making it a different sort of abbreviation known as an initialism.
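Both kinds of abbreviation can be marked up with abbr. The fragment below is a sketch: the style rule is an assumption about how an author might make the tooltip easier to discover, not something browsers require.

```html
<!-- In the head of the page: -->
<style>
  /* Assumed styling: replace the browser default with a dotted bottom border. */
  abbr[title] { text-decoration: none; border-bottom: 1px dotted; cursor: help; }
</style>

<!-- In the body of the page: -->
<p>
  <abbr title="HyperText Markup Language">HTML</abbr> is an initialism, while
  <abbr title="Radio Detection And Ranging">radar</abbr> is an acronym.
</p>
```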

Discouraged Formatting

HTML supports various formatting elements whose use is discouraged in favor of Cascading Style Sheets (CSS). Here is a short overview of the discouraged formatting, so that you can recognize it when you see it in a web page and know how to replace it with CSS formatting. Some of these elements are merely discouraged; others are also deprecated.

Element  Effect           Example          Status      CSS Alternative
b        boldface         bold             -           font-weight: bold;
i        italics          italics          -           font-style: italic;
u        underlined       underlined       deprecated  text-decoration: underline;
tt       typewriter face  typewriter face  -           font-family: monospace;
s        strikethrough    strikethrough    deprecated  text-decoration: line-through;
strike   strikethrough    strikethrough    deprecated  text-decoration: line-through;
big      big font         big              -           font-size: larger;
small    small font       small            -           font-size: smaller;
font     font size        size=1           deprecated  font-size: (value);
center   center a block   -                deprecated  text-align: center;
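
As a rough sketch of how the CSS alternatives in the table replace the discouraged elements, the fragment below restyles one sentence. The class name fine-print and the sentence itself are assumptions made for this example, and font-size: smaller only approximates the effect of size=1.

```html
<!-- Discouraged: the presentation is written into the markup. -->
<center><font size="1"><u>Terms and conditions apply.</u></font></center>

<!-- Alternative: a style rule (placed in the head of the page) carries the presentation. -->
<style>
  .fine-print { text-align: center; font-size: smaller; text-decoration: underline; }
</style>
<p class="fine-print">Terms and conditions apply.</p>
```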

Cascading Style Sheets

The use of style elements such as <b> for bold or <i> for italic is straightforward, but it couples the presentation layer with the content layer. By using Cascading Style Sheets, the HTML author can decouple these two distinctly different parts so that a properly marked-up document may be rendered in various ways while the document itself remains unchanged. For example, if the publisher would like cited references that were previously italic to appear as bold text, they simply need to update the style sheet instead of going through each document and changing <i> to <b>. Cascading Style Sheets also allow the reader to make these choices, overriding those of the publisher.

Continuing with the above example, let's say that the publisher has correctly marked up all their documents by surrounding references to cited material (such as the name of a book) with the <cite> tag:

<cite>The Great Gatsby</cite>

Then to make all cited references bold, one would put something like the following in the style sheet:

 cite { font-weight: bold; }

Later someone tells you that references really need to be italic. Before CSS, you would have to hunt through all your documents, changing the <b> and </b> to <i> and </i> (but being careful *not* to change words that are in bold that are not cited references).

But with CSS, it's as simple as changing one line in the style sheet to

 cite { font-style: italic; }
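
Put together, a complete page using this rule might look like the sketch below. The page title and sentences are invented for illustration. In practice the rule would often live in a separate style sheet shared by many documents.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Reading list</title>
  <style>
    /* Change this one rule to restyle every cited title. */
    cite { font-style: italic; }
  </style>
</head>
<body>
  <p>I reread <cite>The Great Gatsby</cite> last summer.</p>
  <p><strong>Note:</strong> this text is not a citation, so the rule above leaves it alone.</p>
</body>
</html>
```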

Bibliography