Creating a simple webpage
Time to get our hands dirty! (In a manner of speaking.)
In the first chapter, which stated the requirements for following this course, a small exercise was printed, in which you created your first simple web page. If you haven't done that exercise yet, go there now and do it.
Text entered into a text editor and saved to a hard disk, or other form of storage, is often called a "plain text file". Plain text files generally provide three small ways for separating text. Tabs and spaces are used for separating words and returns are used for separating paragraphs.
People have found creative ways of producing intricate layouts using just these small methods of markup. In web pages, however, this method of laying out text doesn't work. A web browser will collapse all consecutive spaces, tab stops and returns into one single space or soft return, depending on where on the line the word occurs. HTML handles layout in a very different way, which will be discussed later.
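A small sketch of this collapsing behaviour, using a made-up sentence: although the source below spreads its words over several lines and pads them with runs of spaces, a browser would typically display it as one evenly spaced sentence, "Roses are red, violets are blue." (The p tags mark a paragraph; they are explained further on in this chapter.)

```html
<!-- Hypothetical sample text: the extra spaces and the line breaks
     inside the paragraph are collapsed when the page is rendered. -->
<p>Roses     are
      red,
   violets      are blue.</p>
```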
When you structure a text, you generally do so to make it easier to digest and to read. By making chapter and section headings more pronounced, you allow the reader to skim over a text until they find an especially interesting part. By using introductions and abstracts, you allow a reader to decide if this text will be interesting to them. You can use illustrations, because sometimes people will much sooner understand what's going on when they can see what's going on.
HTML, the HyperText Markup Language for the web, and its successor XHTML, the eXtensible HyperText Markup Language, allow you to impose a structure on a plain text document. They replace the few simple markup methods that plain text offers with their own. (Note: XHTML has been superseded by HTML5.)
HTML lets you do all this by letting you label certain parts of the text as a heading, a paragraph, an image, a table, a list, etc. Some structures are not supported by HTML, since it is a relatively simple markup language. For instance, there is no element for introductions or leads. The reader will have to infer from the position in a text which is the introduction, the lead or the abstract.
Each part of an HTML document is called an element. Elements are marked off from each other by labels called tags.
<p>This is an <em>important</em> example.</p>
What you see above is a paragraph element with an embedded emphasis element. The paragraph element starts with a <p> tag and ends with a </p> tag. The emphasis element starts with an <em> tag and ends with an </em> tag.
A web page by any other name...
There are several versions of the HTML standard. The most current is HTML5. However, web browsers should also support older versions of the standard, so that examples such as the one given in the Requirements chapter are considered valid HTML documents.
Every web page must have a title element (under older versions of HTML this was the only required element).
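As a minimal sketch, a title element could look like the line below. The title text is borrowed from this chapter's heading purely for illustration.

```html
<title>Creating a simple web page</title>
```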
The title is the text by which the browser window will be named. It is the text that appears at the top of your browser window, and it is the default name of the bookmark (or favorite) used by the browser. It is also the title by which the page will be listed by a search engine.
Find a descriptive title. The heading of your page will often do just fine. The heading of this chapter is "Creating a simple web page", so its title could be the same text. Since this web page explains to you how to create a simple web page, that would be an excellent title.
Bad titles abound on the web. For example, a company might name their page
<title>Big Fridge Manufacturer Inc.</title>
If you are lucky, they will even add in a bit of information related to the web page you are visiting, for instance:
<title>Big Fridge Manufacturer Inc. - Manual of the Cool 3000 ice box</title>
Better for the visitor is:
<title>Manual of the Cool 3000 ice box - Big Fridge Manufacturer Inc.</title>
After all, it is much more likely that the visitor who gets to this page is searching for the manual, rather than for company info.
So, the lesson is: always put the important information first. Often, you only have a limited number of characters available to you in your bookmarks menu, in the search engine listing or in the window title bar; use that space to the full.
An HTML document with only a title element is not very useful. We will now introduce you to a handful of elements that will allow you to make good use of 90% of the power of the web.
- title: The name of a page.
- h1: The most important heading(s) on a page, often the same as the title.
- h2: A sub-heading.
- p: A paragraph.
- a: A link.
With these elements, we can make the following simple web page:
<!DOCTYPE html>
<html>
  <head>
    <title>Friends and family of Clemence Wylie</title>
  </head>
  <body>
    <h1>Friends and family</h1>
    <p>The following are links to the websites of my friends and family</p>
    <h2>Friends</h2>
    <p><a href="http://www.tomsawyer.us">Tom Sawyer</a></p>
    <h2>Family</h2>
    <p><a href="http://www.tantejeanette.ca">Aunt Jeanette</a></p>
  </body>
</html>
Copy the above sample code to your text editor. Save it as exercise2-1.html. Open it in a web browser. Does it display as you expected?
For answers, see Answers to Questions and Exercises.
The a tags in the previous example have a special purpose. They create anchors. (Anchors are more commonly referred to as "links".) They contain attributes with attached values. Attributes are parts of a tag that give the browser additional information about the element. Each attribute name is followed by an equals sign (=) and a value in quotation marks (").
The a element has several possible attributes. The most important are:
- href is the attribute that defines the URL (Uniform Resource Locator) or URI (Uniform Resource Identifier), more commonly known as an "address". This is the destination that the link leads to: another document, or a location within the same document. Addresses are commonly called URLs; "URI" is the broader term and is the one used in this chapter.
- id is a unique name for the link, which can be used by other links to refer to it.
(Note: Previous versions of HTML used name instead of id.)
URIs typically consist of a protocol, a domain, the path of a document and, optionally, the name of a location within that document. The protocol for web pages is usually http (HyperText Transfer Protocol) or its secure variant https. A URI can, for example, take you to the section2 section of the books.html document on the www.example.com domain.
Note, however, that most parts of a URI are optional, depending upon how you want to use it. URIs can be relative or absolute. An absolute URI includes the domain as part of the address. The href can also be just a relative path (for example wines/french/red/bordeaux.html); in that case, the address is calculated from the location of the page that contains the link.
The href can also be just a domain name: http://www.example.com/ leads to the website with that address, and the web server of that site is supposed to figure out which document you want, typically a default page configured on the server.
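The three kinds of href value described above can be sketched as follows. The addresses reuse the www.example.com domain and the file names mentioned in the text; the http protocol in the first link and all of the link texts are assumptions made for illustration.

```html
<!-- Absolute URI: protocol, domain, document and a location within it -->
<p><a href="http://www.example.com/books.html#section2">Section 2 of the book list</a></p>

<!-- Relative URI: resolved against the address of the page containing the link -->
<p><a href="wines/french/red/bordeaux.html">Red Bordeaux wines</a></p>

<!-- Domain only: the web server decides which document to send -->
<p><a href="http://www.example.com/">Example home page</a></p>
```

If the second link appeared on a page located at http://www.example.com/shop/index.html (a hypothetical location), it would lead to http://www.example.com/shop/wines/french/red/bordeaux.html.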
An HTML document consists of elements. These elements are constructed as follows:
An opening tag may contain attributes. Attributes often have values.
<tag attribute1="value" attribute2="value2" attribute3>
The tag that closes an element is just like the opening tag, but has a slash in front of the name, and cannot contain attributes:
</tag>
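Putting these pieces together, a complete element might look like the sketch below. The a element and its href attribute come from the earlier example; the id value and the link text are made up for illustration.

```html
<!-- Opening tag with two attributes, then the content, then the closing tag -->
<a href="http://www.example.com/" id="example-link">Visit the example site</a>
```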
Some elements cannot contain other elements. The HTML standard defines which elements can be contained by an element. The permitted combinations vary from version to version.
Elements are either block-level elements or inline elements. With block-level elements, the browser sets off the element in its own "block", with a line break placed both before and after it. Some examples of this are headings (h1, h2, h3, and so on), paragraphs (p), and list items (li). Inline elements are not treated this way, so (for example) they can be inserted into paragraphs without disrupting the flow of the paragraph. Good examples of this are anchors (a), emphasis (em), and images (img).
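The difference can be seen in a short sketch like the one below. The text and the contact.html file name are hypothetical, chosen only to show where each kind of element sits.

```html
<!-- h2 and p are block-level: each one starts on a new line of its own -->
<h2>Opening hours</h2>
<p>We are open <em>every</em> day. See the <a href="contact.html">contact page</a> for directions.</p>
<!-- em and a are inline: they sit inside the paragraph without breaking its flow -->
```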
Block-level elements can contain inline elements, but as a rule inline elements cannot contain block-level elements.
For instance, the following is valid HTML:
<h1><a>valid HTML</a></h1>
But this is not valid in HTML 4.01 (HTML5 relaxes the rule for the a element):
<a><h1>invalid HTML</h1></a>
The term "valid HTML" has already been mentioned a few times. Since web pages are authored by people, and people make mistakes, web browsers tend to be extremely forgiving towards those mistakes. They will even try to correct your mistakes.
Still, there are several reasons why you should try to mark up a web page with valid HTML:
- different browsers may correct your mistakes differently
- future browsers might not be as forgiving
- valid HTML is easier to read and maintain
- when trying to correct bad markup, it helps if you are not side-tracked by other possible errors
The organization responsible for maintaining the HTML standard is the World Wide Web Consortium. It runs a validation service that you can use to check if your HTML is valid. You can find it at http://validator.w3.org. It is a good practice to validate your HTML with this service.
A common mistake is forgetting to start every document with a DOCTYPE. A document without a DOCTYPE is automatically invalid (HTML version information). Note: many texts erroneously state that the DOCTYPE is optional. It is true that all major browsers will forgive the absence of a DOCTYPE, but this does not make the page valid. The appearance of a page may vary noticeably between different browsers if the DOCTYPE is omitted, because each browser has its own peculiarities when rendering such pages.
Time to have some fun. The following exercises will let you make some simple web pages and websites. The goal is to teach you the power of several different ways of linking.
Copy the example web page above to the clipboard and open http://validator.w3.org. Paste the example into the 'Validate by Direct Input' section and click on 'Check'. Is the example valid?
If you have an anchor <a id="anchor1"></a>, then <a href="#anchor1">link</a> will link to it. That means that when you activate the link, the web page will be displayed starting at the anchor (rather than, as usual, from the top).
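A minimal sketch of this mechanism, with made-up ids and text. Here the id is placed directly on a heading rather than on an empty a element, which browsers also accept as a link target.

```html
<!-- A link near the top of the page that jumps to the heading further down -->
<p><a href="#recipes">Skip to the recipes</a></p>

<p>A long introduction would go here.</p>

<!-- The id attribute gives this heading a name that the link above refers to -->
<h2 id="recipes">Recipes</h2>
<p>The recipes would go here.</p>
```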
Make a copy of the web page you created in Exercise 2-1 and save it as exercise2-3.html. Change this file to include a 'menu' of anchors at the top that link to the headings of the different subsections (Family, Friends).
There is a hybrid form of book and game called Choose Your Own Adventure (CYOA). In such a game-book, you read a bit of text as in a normal book, but after a while, you get to make a choice as to how to continue.
You are sitting in the tub, soaking and relaxing. Your rubber ducky is chattering away happily when suddenly a pike grabs it from below and drags it down.
- If you dive into the water to save the ducky, go to page 89
- If you pull the plug to empty the bath, go to page 24
In this exercise, you will write a short CYOA, in which the choices are represented by anchors that will lead to the text continuing from that choice. Every "chapter" must be a separate web page.
Keep it snappy and don't spend too much time on this. Ten to twenty web pages should be sufficient. The story does not need to be good or finished.
Tip: Create a template HTML file on which you can base all subsequent chapters.
Create a web page and save it as "exercise2-5.html". The web page should contain a short, informative text about a subject of your choice. It should contain at least three working links to external websites about the subject.
For answers, see Answers to Questions and Exercises.
Including an image on a webpage is done using the img element.
img is one of a class of elements referred to as "self-closing". Self-closing elements don't have a closing tag. Instead, they end with />. (In HTML5, the slash is optional; the examples in this chapter include it.)
The img element has two obligatory attributes: src and alt.
The src attribute takes a URI as its value. In this case, the URI will be the "address" (location) of the image.
Since URIs can be relative, if the image is located in the same folder as the web page that includes it, the URI can consist merely of the file name of the image. More commonly, the image will be located with other images in a directory called img. It is a good practice to keep your images in a separate directory, because it will make your site better organized.
The alt attribute contains a textual description that appears when the image cannot be displayed. For instance, if the image is a photo of a lake with a castle, you could have the following code:
<img src="img/lakecastle.jpeg" alt="photo of a lake and a castle" />
When the purpose of the image is decorative, you might want to use an empty alt value. That way, when the image cannot be displayed, no superfluous description interrupts the flow of the page's main text.
<img src="img/prettypattern.jpeg" alt="" />
However, when the image has a function to fulfill on a webpage, the presence of alt text is very important for visitors who can't see the image. For instance, many webpages have navigation built from menus of links, where the links are represented by images. If the images can't be displayed and there's no alt text, users won't be able to use the navigation.
<a href="family.html"><img src="img/button-family.png" alt="Family" /></a>
The img element lets you embed an image on a page. You can of course also link to an image that you do not want to display on the page, because it has no role there. For instance, if you want to offer people the chance to download photos you made, you can offer links to those photos. For that you use the same a element that we have been using to link to other webpages:
<a href="img/lakecastle-large.jpeg">Photo of a lake and a castle (JPEG, 512 KiB)</a>
Note how you can create links to every file that can be located using a URL. By indicating that a photo is stored in the JPEG format (a very common file format for photos) and by indicating the file size, we give visitors the opportunity to decide A) whether they can use a file of this format and B) whether they are willing to download a file of this size.
HTML contains many more elements (HTML 4.01, for example, contains 91 different elements), but for now we will discuss only one more before moving on to the style of web-writing.
The pre element allows you to retain plain text formatting (as discussed briefly at the beginning of this chapter). This means that within the element, consecutive spaces, tabs and hard returns will not be collapsed into a single space or soft return.
There is little use for this element. It stops the text from reflowing neatly when the browser width is reduced or expanded, causing visitors to scroll horizontally, which web-surfers generally hate to do. HTML and its companion layout language CSS have plenty of options to display line-breaks and indentation. Also, it is pretty meaningless in non-visual browsers.
However, when you wish to copy pre-formatted text from other documents, it may be handy to use the pre element until you have the time to mark that text up.
<pre>1 2 4 8</pre>
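As a sketch of what pre preserves, the made-up price list below keeps its column alignment, because the spaces and line breaks inside the element are displayed as typed.

```html
<!-- Hypothetical data; spacing and line breaks inside pre are kept as written -->
<pre>
Item      Price
Apples     1.20
Pears      0.95
</pre>
```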
Later during this course, we will discuss further elements. However, the intention of this course is not to make you fluent in HTML; it is to make you fluent in authoring webpages.
Generally, to fully comprehend something requires that you fully comprehend its form first. You cannot be a successful karateka if you cannot perform the various moves. You cannot be a successful French speaker if you have not mastered its grammar and vocabulary first. However, knowing all the ways to hit someone does not make you a good karateka, and knowing all the words and rules of the French language does not prevent you from becoming a mumbling baboon the next time you need to speak French.
To fully comprehend authoring webpages, you need to look beyond the language in which you write them. This is what we will do in most of the further chapters of this book.
The official HTML5 Recommendation of the World Wide Web Consortium can be found on the W3C website. Although this documentation can, at times, be pretty hard to read, it represents the last word on any discussion of what is valid HTML5 and what is not.
Further, one of the authors of the HTML 4.01 Recommendation, Dave Raggett, has written a couple of handy guides to HTML and its companion layout language CSS, which you can find at http://www.w3.org/MarkUp/#tutorials. If there are points in this and later chapters that you do not fully comprehend, you could do worse than study Dave's texts. They are much clearer than the official specifications, and short enough to study alongside this text.
The following exercises are optional. You can use them to practice putting images on your webpages.
Download the following images, and put them all on a webpage that you will save as 'exercise2-6.html'. Think of useful alt texts.
To download linked files, a lot of browsers contain a Save Link As function. In graphical browsers using a mouse, this function is often part of the context menu. On the PC this means you have to click your right mouse button; on Mac OS this means you have to hold the Ctrl key and press the mouse button.
(images to follow later)
For answers, see Answers to Questions and Exercises.
- "The Difference Between URLs and URIs". Daniel Miessler. https://danielmiessler.com/study/url-uri/