Last modified on 26 March 2015, at 21:19

HyperText Markup Language/Introduction

HTML is a markup language used in most of the pages of the World Wide Web. HTML files are text files that, unlike completely plain text, contain additional formatting markup—sequences of characters telling web browsers what parts of text should be bold, where the headings are, or where tables, table rows and table cells start and end. HTML may be displayed by a visual web browser, a browser that reads the text of the page to the user, a braille reader that converts pages to a braille format, an email client, or a wireless device like a cellular phone.
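
To make the idea of formatting markup concrete, here is a small illustrative fragment (not a complete document) that marks a heading, one bold word, and a table with two rows of two cells each. The text content is invented for this sketch.

```html
<!-- A heading -->
<h1>A heading</h1>

<!-- A paragraph in which one word is marked as bold -->
<p>This sentence contains one <b>bold</b> word.</p>

<!-- A table: each <tr> is a table row, each <td> a table cell -->
<table>
  <tr>
    <td>Row 1, cell 1</td>
    <td>Row 1, cell 2</td>
  </tr>
  <tr>
    <td>Row 2, cell 1</td>
    <td>Row 2, cell 2</td>
  </tr>
</table>
```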

Before we start

To author and test HTML pages, you will need an editor and a web browser. HTML can be edited in plain text editors, including those that highlight HTML markup with colors to make it easier to read. There are also WYSIWYG (what you see is what you get) editors of HTML, and complex WYSIWYG editors with website project management and development environments.

(To build high-quality websites, it is also worth learning XHTML, JavaScript, and a server-side technology such as ASP.NET or PHP.)

Plain text editors include Notepad (or Notepad++) for Microsoft® Windows, TextEdit for Mac, or Vim, Emacs and others for Linux.

Commercial HTML editors include Adobe Contribute CS5 and Dreamweaver CS5 (both Win/Mac), and Microsoft's Visual Web Developer (Win). Free HTML editors include Evrsoft First Page (Win), KompoZer (Win/Mac/Lin) and Quanta Plus (Lin). However, it is usually better to gain a basic knowledge of HTML with a code-based editor before moving on to WYSIWYG editors such as those listed in this paragraph.

To preview your documents, you'll need a web browser. To make your documents look good to the greatest number of readers, test them in several browsers. Each browser renders pages slightly differently, and most have quirks that cause some correctly written HTML to be displayed incorrectly.

As of November 2010, Microsoft Internet Explorer was the most widely used browser, with a 46% market share. Other common browsers include Google Chrome, Mozilla Firefox, Safari, and Opera. To make sure that your documents are readable in a text-only environment, you can use Lynx.

A simple document

Let's start with a simple document. Write this code in your editor (or copy-and-paste it) and save it as "index.html" or "index.htm". The file must be saved with the exact extension, or it will not be rendered correctly.

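  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
  <html>
    <head>
      <title>Simple document</title>
    </head>
    <body>
      <p>The text of the document goes here.</p>
    </body>
  </html>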

Now open the document in your browser and look at the result. From the above example, we can deduce certain essentials of an HTML document:

  • The first line of a valid HTML document must state which version of HTML the document uses. This example uses the strict variant of HTML version 4.01. Note that in HTML5 this is simplified to <!DOCTYPE html>.
  • The HTML document begins with a <html> tag and ends with its counterpart, the </html> tag.
  • Within the <html></html> tags, there are two main pairs of tags, <head></head> and <body></body>.
  • Within the <head></head> tags, there are the <title></title> tags which enclose the textual title to be shown in the title bar of the web browser.
  • Within the <body></body> is a paragraph marked by a <p></p> tag pair.
  • Most tags must be written in pairs between which the effects of the tag will be applied.
    • <em>This text is emphasized</em> – This text is emphasized
    • This text includes <code>computer code</code> – This text includes computer code
    • <em>This text is emphasized and has <code>computer code</code></em> – This text is emphasized and has computer code
  • HTML tag pairs must be properly nested: a pair opened inside another pair must be closed before the outer pair is closed, for example:
    • <code><em>This text is both code and emphasized</em></code> – This text is both code and emphasized
    • A mistake: <em><code>This markup is erroneous</em></code>
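
For comparison, the structure described in the list above can also be written with the shorter HTML5 doctype. The following is a minimal sketch that changes only the first line; it is not meant as a recommended template.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- The title is shown in the browser's title bar -->
    <title>Simple document</title>
  </head>
  <body>
    <!-- A single paragraph inside the body -->
    <p>The text of the document goes here.</p>
  </body>
</html>
```

Saved as "index.html", this should display the same single paragraph as the HTML 4.01 version.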

The <html> Tag

The <html> and </html> tags mark the beginning and end of an HTML document. They have no effect on the document's appearance; their purpose is to tell browsers and other programs that the content is an HTML document.

Useful attributes:

dir attribute
This attribute specifies the direction in which the browser presents text throughout the document. Its value is either ltr (left to right) or rtl (right to left); the default is ltr. The rtl value is generally used for languages written from right to left, such as Persian, Arabic, Hebrew, and Urdu.

Example: <html dir="ltr">
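
As an illustration of the other value, here is a minimal sketch of a right-to-left document. The HTML5 doctype, the <meta charset> line, the title, and the Hebrew sample text are assumptions made for this example; the file is assumed to be saved as UTF-8.

```html
<!DOCTYPE html>
<html dir="rtl">
  <head>
    <!-- Assumes the file is saved with UTF-8 encoding -->
    <meta charset="utf-8">
    <title>Right-to-left example</title>
  </head>
  <body>
    <!-- Hebrew for "Hello world"; the paragraph starts at the right edge -->
    <p>שלום עולם</p>
  </body>
</html>
```

Changing dir="rtl" back to dir="ltr" and reloading the page is a quick way to see what the attribute affects.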

lang attribute
The lang attribute specifies the primary language of the document's content.

Languages are identified by standard codes, typically the two-letter ISO 639-1 codes, for example:
en - English
fa - Farsi
fr - French
de - German
it - Italian
nl - Dutch
el - Greek
es - Spanish
pt - Portuguese
ar - Arabic
he - Hebrew
ru - Russian
zh - Chinese
ja - Japanese
hi - Hindi

Example: <html lang="en">
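
Both attributes can be set on the same tag. The following is a minimal sketch of a French-language document; the HTML5 doctype, the <meta charset> line, and the sample text are assumptions made for this illustration.

```html
<!DOCTYPE html>
<!-- lang="fr" declares French content; dir="ltr" is the default, written out here for clarity -->
<html lang="fr" dir="ltr">
  <head>
    <!-- Assumes the file is saved with UTF-8 encoding -->
    <meta charset="utf-8">
    <title>Exemple simple</title>
  </head>
  <body>
    <p>Bonjour le monde.</p>
  </body>
</html>
```

Declaring the language changes nothing visible in most browsers, but it can help programs such as screen readers and search engines handle the text appropriately.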