Authoring Webpages/Adapting a webpage for visual browsers

Introduction

The HyperText Mark-up Language (HTML) allows you to add structure and hyperlinks to a text. This in turn allows a web browser to display that text in a useful manner to the user. The mark-up you use has little or nothing to do with the display of the text: this makes it possible to display the hypertext on a wide array of devices.

The web browser has to make a translation between the mark-up you provide and display properties. For instance, a heading can be displayed in bold, large text on a graphical browser, and can be spoken out loudly in a speech browser, et cetera.

Generally, manufacturers of web browsers make good choices. As well-trained readers, we know that a bold large text on its own line over a mass of normal sized text is probably its heading. So when a browser renders a heading like that, we tend to recognize it as a heading without even thinking. List items are preceded by list item markers such as numbers or 'bullets', emphasized text is printed in bold or italics, et cetera.
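
As a minimal sketch of such structural mark-up (the text content is invented for illustration), the fragment below marks a heading, a paragraph with emphasized text, and a list, without saying anything about fonts, sizes or colours. A graphical browser would typically show the heading large and bold and put bullets before the list items; a speech browser is free to convey the same structure in its own way.

```html
<!-- Structure only: nothing here says how the text should look. -->
<h1>Pizza through the ages</h1>
<p>Pizza is <em>much</em> older than most people think.</p>
<ul>
  <li>Flatbreads in antiquity</li>
  <li>Naples in the eighteenth century</li>
  <li>Delivery services today</li>
</ul>
```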

Manufacturers of graphical web browsers even take into account that reading off a screen is hard: paragraphs are divided by empty lines, text is generally printed (by default) using a relatively large font-size, and hyperlinks are clearly marked as such, by using different colors and underlines or surrounding lines (the latter in the case of images).

There are two things though, that current web browsers are getting wrong. One is that they display each and every webpage the same; the other that they generally fail at presenting webpages in an aesthetically pleasing way.

It can certainly be argued that the browser manufacturers cannot be blamed for these problems.

Exercise 4-1

Why can manufacturers of web browsers not be blamed for rendering all webpages alike? Why can they not be blamed for displaying webpages in an aesthetically unpleasing way? You may also argue the reverse, if you like.

One of the few clues a visitor might have about the quality of webpages is provided by a concept called the website. A website is a collection of webpages that belong together. Because they belong together, they provide a powerful hint to the visitor that information that was promised on one page of the website may be found on another page of that website. Similarly, if the visitor accepts the voice of a webpage author as an authority, other webpages by that author may provide an interesting target for further visits.

There are several ways in which an author can make clear that webpages form part of one overarching website. One of these ways has already been discussed: by using a sensible title text, authors can show that pages belong together. The title of this page, "Adapting a webpage for visual browsers - Wikibooks", helps to underline this. All pages on the Wikibooks website have a title that ends with "- Wikibooks". If you feel like reading other textbooks, the message is clear: you should be on this site.
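
A small sketch of this naming convention: the title elements of two pages that share the same suffix. The first title is the one quoted above; the second page name is an assumption made for the sake of the example.

```html
<!-- In the head of the first page -->
<title>Adapting a webpage for visual browsers - Wikibooks</title>

<!-- In the head of a second, hypothetical page on the same site -->
<title>How to write for the web - Wikibooks</title>
```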

Another way to suggest a relation between webpages, that is, to suggest a website, is in the strategic use of the Uniform Resource Locator (URL), the address of a webpage that is often displayed in a web browser's address bar. You can use similar addresses for similar webpages.

The third way to suggest a relation between webpages is probably the most useful, though: you can use a uniform visual style to indicate to visitors on which website they are.

A bit of history

Originally, HTML was a mish-mash of graphical display mark-up and structural mark-up. This sounds perhaps worse than it was: only a few elements were reserved for visual presentation, such as B for bold text, and I for italicized text. Further, PRE allowed an author to display text using 'plain text formatting' (as discussed before), BR forced a graphical browser to start displaying following text on a new line, and IMG let you display an image at a certain point in the text.
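
To make these elements concrete, here is a short sketch that uses all five of them. The text and the image file name pizza.png are placeholders, not taken from any real page.

```html
<!-- Early presentational elements, shown together for illustration. -->
<p>This word is <b>bold</b> and this one is <i>italic</i>.<br>
This sentence starts on a new line because of the BR element above.</p>

<!-- PRE keeps spaces and line breaks exactly as typed. -->
<pre>
Name      Price
Pizza     8.50
</pre>

<!-- IMG places an image at this point in the text. -->
<img src="pizza.png" alt="A pizza fresh from the oven">
```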

All this wasn't so bad, really. It did not introduce a huge drop in accessibility, but allowed authors of pages for graphical clients (more or less the default since the beginning of the web) to 'ponce up' their pages and make them more attractive to visitors.

However, by allowing display-only elements in an otherwise display-independent language, Tim Berners-Lee opened the door for abuse by browser manufacturers.

The possibility to create a visually pleasing web page made the web a more attractive place to be. Just like the DTP revolution made everybody think that they were graphic designers (mistakenly, most of the time), so did the graphical web open the door for letting form rule over content.

When the web became more popular with users, it of course also became more popular with businesses. Actual businesses were founded to produce web browsers, something almost unheard of before. The most successful of these was Netscape.

Browser wars

Netscape quickly recognised that the instant appeal of pretty webpages was at least as strong a selling-point for a web browser as the other, more solid appeals, such as the promise of becoming one's own publisher, or being able to traverse an associative landscape of ideas.

However, Netscape only had control over the browser, not over the web itself, or its underlying language, HTML. So what Netscape did was let its web browser, Navigator, recognise an extended version of HTML. Frames, for instance, are a Netscape invention, as are JavaScript and the FONT element. Using this strategy, Netscape Navigator soon became by far the most used browser on the web. With it the web (and with the web the internet) became a public space, rather than the academic space it had been before.

The one software company that had thoroughly missed the internet boat was Microsoft. To the day of writing this, Microsoft still does not understand the internet: it does not see it as a space within which to operate, but rather as a thing to be owned. In the beginning it even tried to replace the internet with its own 'internet', called MSN, which is of course nonsense, because the internet is not an atomic network. It is a network of networks. MSN was to be part of the internet, much to the chagrin of Bill Gates and his people.

Meanwhile, Netscape looked further ahead. For them, the internet was a vehicle to gain control over the 'desktop', the metaphor 1980s' operating systems use to describe themselves. If everybody had Netscape Navigator installed, and Netscape Navigator was this universal tool that could be used for everything, from playing games to word processing, it did not really matter which operating system was running beneath the browser.

Through its original disdain for the internet and the web, Microsoft suddenly found the core reason for its existence under threat. It then made a strategic decision that pulled it right into the centre of the internet: it would build its own browser.

Microsoft's browser was called Internet Explorer, and it started the so-called browser wars. Microsoft started playing Netscape's game. It first supported almost all of Netscape's HTML extensions, added a few of its own, and changed the behaviour of some elements in a way that made them work slightly better.

The web was under threat of balkanization: a division into parts so small that dealing with the web, and especially authoring web pages, was to become an ordeal that would almost prove too much for authors and surfers alike.

Authors suddenly had to decide which browsers to support, or if they should support specific browsers at all. The possibility to treat the web as a purely visual medium planted the misguided thought in a lot of heads that web pages should look alike everywhere.

This was when the World Wide Web Consortium (W3C) stepped in, and started rallying for stricter standards, and a division of document structure and document lay-out. The smartest move of the W3C was to get the browser manufacturers on board. With Microsoft and Netscape both having a direct say about what the next versions of HTML would look like, they became stakeholders with an interest in creating a useful web language.

The W3C introduced a style sheet language for suggesting certain lay-outs to a web browser, called CSS. That abbreviation stands for Cascading Style Sheets. Stylesheets had a couple of things going for them. Most importantly, they promised a 'code once, view everywhere' approach to web authoring. This was good for the W3C, because that approach was what HTML was about in the first place. And it was good for the authors, because now they did not have to learn about the quirks of every web browser.

From the start, stylesheets had other advantages as well:

  • They could be stored in separate files, so that the style for an entire site could be stored in one stylesheet;
  • They allowed for chaining ('cascading') stylesheets, so that part of a website could have its own distinct style from the rest of the site, while still looking part of the site; and
  • They enabled a few tricks that Netscape and Microsoft hadn't gotten to yet, such as more control over text styles.
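
The first two advantages can be sketched as follows. The file names site.css and album.css and the page title are assumptions made for illustration: one stylesheet holds the style of the whole site, and a second one adjusts it for a single section.

```html
<head>
  <title>Album - Example site</title>
  <!-- Site-wide rules, shared by every page of the site. -->
  <link rel="stylesheet" href="/styles/site.css" />
  <!-- Rules for the album section only. This sheet is loaded later, so where
       both sheets set the same property with equally specific selectors,
       the rule from this sheet wins. -->
  <link rel="stylesheet" href="/styles/album.css" />
</head>
```

Pages outside the album section would link only to the first stylesheet, so the section can look distinct while still inheriting the site's overall style.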

Future

It could be argued that allowing a few graphical display elements in a hypertext language was a good move. It popularized the web, and with it the internet. It introduced a great many authors to the web. However, today many of these authors see the web as a visual medium rather than the (hyper)textual medium it really is. Weaning those people from their wrong notion may prove an impossible task.

Websites and home pages

Websites

As we saw before, webpages form part of the web, because they link to other pages or because other pages link to them. In other words, web pages do not live in a vacuum. When you receive a flyer for a pizza delivery service in your letter box, for instance, there is no context. You do not generally receive flyers of competing services simultaneously, or a map that shows you where the delivery service is located, or an encyclopaedia that tells you about the history of pizzas.

A webpage does come with such a context. For instance, if you visit a webpage that lets you order pizzas, you probably visited a page before that which let you choose from several pizza delivery services. Also, the order page may link to a map, or to an article about the history of pizzas. Even if it does not do so, the web browser may provide additional functionality to you. For instance, if you selected the name of a pizza in the Firefox web browser, then right-clicked, a menu item would appear that would let you search Google for the selected text in a new tab in the background.

Although all webpages have such a context, only the ones provided by you are under your control. When you group a number of related webpages, such a grouping is called a website. Webpages may form a website for any number of reasons: because they belong to the same subject, because they are hosted on the same server, because they live on the same domain, or because they are created by a single person or organisation.

For instance, the collection of webpages at http://www.nasa.gov forms the website of the US national space agency NASA. They may not all be served by the same server, and they are not all created by the same person, but they live on the same domain, try to be the voice of NASA on the web and deal with topics that are all related to NASA.

A person or an organisation may of course have multiple websites; and what is called a website in this respect is not really important.

Websites are often characterised by

  • a main site navigation ("menus" of hyperlinks that are repeated on every webpage and that lead to important sections),
  • a coherent visual style across webpages, including a logo and a favicon,
  • coherent page names, for instance by repeating the site's name in the title element,
  • a natural division of the website in topics and subtopics.

All these clues tell you after going from one page to another that you are either on the same website, or have left for another website. Most web authors get these clues at least partially right, and most web page visitors are able to tell which website they are on.
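
A rough sketch of how these clues might appear in the source of a single page follows. The site name "Wily's Site" and the files styles.css and favicon.ico are assumptions; the page names are borrowed from the examples used elsewhere in this chapter.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <!-- Coherent page names: the page name followed by the site name. -->
  <title>Contact - Wily's Site</title>
  <!-- Coherent visual style: every page links to the same stylesheet and favicon. -->
  <link rel="stylesheet" href="styles.css" />
  <link rel="icon" href="favicon.ico" />
</head>
<body>
  <!-- Main site navigation: the same menu is repeated on every page. -->
  <ul>
    <li><a href="./">Home</a></li>
    <li><a href="friends.html">Friends</a></li>
    <li><a href="album.html">Album</a></li>
    <li><a href="contact.html">Contact</a></li>
  </ul>

  <!-- The core of this particular page. -->
  <h1>Contact</h1>
  <p>How to get in touch with Wily.</p>
</body>
</html>
```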

An interesting feature of a webpage is then that it contains information that is at the core of what the author wanted to say with that particular webpage, but that it also contains information to the visitor that they are on a particular website, where they are on that site, where they can go, etc.

Most webpages will contain a lot of information that is pertinent to the subject of that page, and a little information about the website itself. The exceptions are called home pages; home pages are about the website they are part of.

Home pages

The "homepage" is the main page of a website. It is used as a central hub for the rest of the site. This is the file that is displayed when going to a web address that doesn't specify a document, e.g. http://www.example.com/. It is typically named index.html, index.htm, default.html, or default.htm.

A homepage has a number of functions and a number of rules and heuristics that it should adhere to. The functions of a homepage are to:

  • provide a navigational aid for the site
  • provide information on the site's theme
  • establish the brand identity of the website (e.g. the site pages' appearance)
  • provide a means for visitors to reorient themselves
  • provide the minimal location for a web page author to link to
  • live at an easy to remember location

Let's review these.

Navigation

When a visitor follows an information scent to your web page, the web page may or may not fulfill the information wish of the visitor. In the latter case, a visitor will either want to track back, or follow further scents.

As we have noted, the indication that a webpage is part of a website provides a powerful hint to the visitor that the same voice that wrote the webpage has written other webpages on possibly similar subjects. A visitor who does not succeed at the current webpage, or who has succeeded, but now has changed goals, may want to further explore your website.

For example, if you collect jokes, and a visitor thinks the first of your jokes is funny, they may wish to read more of your jokes.

Links on the webpage to related webpages can be very useful; but sometimes these links are lacking, or they are worded in a way that does not help your visitor, or they do not make clear that they lead to parts of the same website; and often a webpage does not show the intentions an author has with a website, the freshness of a website et cetera.

A homepage should provide this sort of information, or at least lead to it. The homepage of a larger website should also provide alternative ways of navigation. This could be done through:

  • a search function
  • a main menu
  • a catalogue
  • highlighted webpages

A common way to link to a homepage is to use the logo or site name as a hyperlink, or to link to it from a breadcrumb trail.
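
Both ways of linking to the homepage can be sketched in a few lines. The image file logo.png and the "Jokes" section are placeholders chosen for this example.

```html
<!-- The logo doubles as a link to the homepage. -->
<a href="http://www.example.com/"><img src="logo.png" alt="Example site: home" /></a>

<!-- A breadcrumb trail whose first step is the homepage. -->
<p>
  <a href="http://www.example.com/">Home</a> &gt;
  <a href="http://www.example.com/jokes/">Jokes</a> &gt;
  Knock-knock jokes
</p>
```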

Site style

In order to find out that you are on a website, you need to have visited at least two webpages on that site: one to establish that a certain style was used, another to verify that style. Any webpage on a site can be used for that second function, but the homepage must always be usable in this way.

Site meaning

The homepage of a website should always make clear what the website is about, just like a webpage should always make clear what that webpage is about.

News

In lieu of other designated places where a visitor can review whether pages of a website have been added or updated, the homepage should be regularly updated to indicate that a website is still "alive". Popular ways to do this are to show leads to recent news items, to regularly change highlighted items, or to regularly make simple changes to the lay-out of the homepage. Other ways are to introduce seasonal elements into a homepage. For instance, you could use your homepage to wish visitors happy holidays.

Other designated places that allow you to assess the liveliness of a website are the news page, or for example the Recent Changes page at Wikipedia. You can use these pages instead of the homepage for situational feedback, as long as it is clear to visitors that they need to look somewhere else, for instance because you clearly link to a news section.

Location

In general, you will wish to let the world know about specific webpages, instead of your website. However, there are instances when the latter is desirable. For those instances it would be useful if the homepage were easy to find and to get to. One way to achieve this is to locate the homepage at the shortest possible URL. If your webpages are at http://www.example.com/~wily/friends.html, http://www.example.com/~wily/album.html, and http://www.example.com/~wily/contact.html, your homepage should be at http://www.example.com/~wily/. (When faced with a request for a directory instead of a file, a webserver will generally start looking for files with a certain name, such as index.html, index.php, welcome.html etc. This behaviour differs from webserver to webserver, but index.html is usually a pretty safe file name to give to your homepage.)
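
With the homepage at the directory address, any page in that directory can link back to it with a very short relative URL. The sketch below assumes it is placed in friends.html from the example above; the link text is invented.

```html
<!-- In http://www.example.com/~wily/friends.html -->
<!-- "./" refers to the directory itself, so this link leads to
     http://www.example.com/~wily/ -->
<a href="./">Wily's homepage</a>
```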

Also, when you forget to link to a homepage, or when the link cannot easily be found, visitors will apply a trick called the directory traversal attack. Despite its ghoulish name and the fact that it is a crime in the UK, this is a perfectly moral and fine thing to do. It works by guessing the parts of a URL that are superfluous for the homepage address.

For example, if you are at http://www.example.com/~wily/friends.html, removing "friends.html" or "~wily/friends.html" from the full address in the address bar of your web browser may lead to the homepage of this webpage's site.


Recapitulation

In essence, a homepage provides situational knowledge about a website. It should show a visitor what the site is about, what its main themes are, how you can get there, how fresh a site is, etc.

Cascading Style Sheets

Stylesheets and the style element

CSS (Cascading Style Sheets) is a language used to "style" markup, such as HTML. A stylesheet is a series of rules, and each rule has a selector and one or more declarations, each of which pairs a property with a value.

 a {
   color: red;
   font-style: italic;
 }

In this example, "a" is the selector. It selects all anchors ("links") in the document. Each declaration that affects anchors is enclosed in the braces ({}) following the selector. Here, the two properties are "color" and "font-style". "color" sets the color of the text, and "font-style" sets the slant of the font (normal, italic or oblique). Anchors will appear as red, italicized text.
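
The same pattern applies to any other element. As a further sketch, with the colour and size picked arbitrarily, a rule for top-level headings could look like this:

```css
/* Selector: every h1 heading in the document. */
h1 {
  color: navy;     /* property "color", value "navy" */
  font-size: 2em;  /* twice the font size of the heading's parent element */
}
```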

There are three methods of adding CSS to a web page, but the third is considered the best for most purposes.

1) As a tag attribute. The name of the attribute is "style".

 <a href="http://www.example.com/" style="color: red; font-style: italic;">Example link</a>

This is considered the worst way of adding styles, in most cases. The reason for this is that it's very hard to maintain. If you wanted to change the color of the anchors on your site, you'd have to find and replace the attribute in every "a" element in every web page, which could occur hundreds or even thousands of times, depending on the size of the website.

2) As a style element. This element is placed inside the head element.

 <style>
   a {
     color: red;
     font-style: italic;
   }
 </style>

This is considered the second worst way of adding styles, in most cases. While not nearly as difficult to maintain as the first method, you'd still have to synchronize all of the style elements across all documents, and as your site grows, this would become more and more awkward.

3) As a linked document. The document ends in .css (e.g. "styles.css"). This is considered the best way, in most cases, because it is the easiest to maintain. To link your web page to your stylesheet, add this in your head tag:

 <link rel="stylesheet" href="styles.css" />

Now, if you want to change any part of the anchors' appearance, you'd only have to edit one file, and it will affect all pages on the website that have that link in the head.
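
For instance, assuming every page of the site links to styles.css as shown, changing the link colour for the whole site comes down to editing a single value in that file. The choice of green is arbitrary.

```css
/* styles.css: the one file that every page links to. */
a {
  color: green;        /* was red; every linked page picks up the change */
  font-style: italic;
}
```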

Typography on the web

  • Empty lines separating block level elements
  • line width
  • losing control
  • practice: font family
  • practice: italics, bold, font-size
  • practice: line-width
  • practice: line-height
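
The practice items named above could be approached with rules like the following. This is only a sketch: the font names and every number are arbitrary choices, not recommendations from this book.

```css
body {
  /* Font family: the browser uses the first font in the list that is available. */
  font-family: Georgia, "Times New Roman", serif;
  /* Font size: 100% keeps the size the visitor has configured. */
  font-size: 100%;
  /* Line height: 1.5 times the font size. */
  line-height: 1.5;
  /* Line width: keep lines of text from growing wider than 35 em. */
  max-width: 35em;
}

/* Empty lines between paragraphs: one line of space above and below. */
p { margin: 1em 0; }

/* Italics and bold for emphasized text. */
em { font-style: italic; }
strong { font-weight: bold; }

/* A larger font size for the main heading. */
h1 { font-size: 1.8em; }
```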

Colours: dangerous and beautiful

  • Easy branding with colour
  • danger: colour blindness
  • danger: browser settings
  • solution: never use just colours
  • solution: always define all colours
  • practice: applying colours to text and links
  • practice: applying colour to backgrounds
  • practice: applying coloured borders
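
A sketch of the practice items named above, with all colour values picked arbitrarily. Text and background colours are always set together, so that a visitor's own browser settings cannot produce an unreadable combination, and links keep their underline so that colour is never the only cue.

```css
/* Text and background: define both, never just one of them. */
body {
  color: #222222;
  background-color: #ffffff;
}

/* Links: a colour per state, plus an underline as a cue that does not rely on colour. */
a:link {
  color: #0000cc;
  background-color: #ffffff;
  text-decoration: underline;
}
a:visited {
  color: #551a8b;
  background-color: #ffffff;
  text-decoration: underline;
}

/* A coloured border under the main heading. */
h1 {
  border-bottom: 2px solid #cc0000;
}
```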

Preparing for print

The print style sheet.
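
As a minimal sketch of what a print style sheet might contain, the rules below are wrapped in a print media block, so they only apply when the page is printed. The class name "navigation" is hypothetical and stands for whatever element holds the site menu. The same rules could instead live in a separate file that is linked with a media="print" attribute on the link element.

```css
/* These rules are ignored on screen and used only for printing. */
@media print {
  /* Menus are of no use on paper. */
  .navigation {
    display: none;
  }

  /* Black text on a white background, sized in points for paper. */
  body {
    color: #000000;
    background-color: #ffffff;
    font-size: 12pt;
  }

  /* Links cannot be followed on paper, so print them in the text colour. */
  a {
    color: #000000;
    background-color: #ffffff;
  }
}
```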

Questions and Exercises

Answers

For answers, see Answers to Questions and Exercises.

