Virology/On the Genomes of Viruses



First, let's have a short introduction to how the instructions for biological processes are stored, accessed, and executed. For all known life, genetic information is stored in the DNA of the organism's genome. DNA has four different types of bases, which can be arranged in a specific sequence that encodes information (we have 26 letters in English, 10 digits in decimal, and our computers have ones and zeros, but life only needs four bases to encode almost all of its complexity). DNA pretty much only functions as information storage, so it needs to be stable and protected from dramatic changes in order to serve its purpose, but it doesn't do much else besides that.

Since DNA can't do much on its own, RNA acts as an intermediate that carries the information of DNA so that it can be used. RNA also has more functionality, at the expense of stability (it is broken down all the time). RNA has four bases as well, which can pair with DNA bases (this is how the information is copied), but it still can't do much for complex biological processes. Luckily, proteins can take care of complex processes, since they are made of building blocks with a greater variety of properties: amino acids. For the most part, life uses 20 amino acids to make all of its different kinds of proteins (more than enough to put together every protein encoded by the roughly 20,000 protein-coding genes in the human genome).

For four bases to encode 20 different amino acids, the bases need to be read in groups of three called codons. One base could only specify 4 amino acids and two bases only 4^2 = 16, while three bases give 4^3 = 64 possible arrangements, more than enough for 20 amino acids. Codons are read in order and translated into amino acids (the letters of protein). This idea of DNA -> RNA -> protein is classically referred to as the central dogma of molecular biology (I really don't like the name, since it doesn't fit the connotation that "dogma" has now).
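To make the codon arithmetic and the DNA -> RNA -> protein flow concrete, here is a minimal Python sketch. The function names (`transcribe`, `translate`) and the example sequence are made up for illustration, and the codon table is a deliberately tiny excerpt of the standard genetic code, not the full 64-entry table.

```python
from itertools import product

DNA_BASES = "ACGT"

# Every ordered triplet of the four bases: 4**3 = 64 possible codons.
codons = ["".join(triplet) for triplet in product(DNA_BASES, repeat=3)]

# Small, incomplete excerpt of the standard genetic code
# (RNA codon -> one-letter amino acid code; "*" marks a stop codon).
CODON_TABLE_EXCERPT = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}


def transcribe(coding_dna):
    """Return the mRNA that matches the coding DNA strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # Codons missing from the excerpt are shown as "?".
        amino_acid = CODON_TABLE_EXCERPT.get(mrna[i:i + 3], "?")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCAAATAA")  # hypothetical 15-base gene
protein = translate(mrna)
print(len(codons), mrna, protein)
```

Real transcription copies the template strand into its complement; the shortcut here works because the coding strand already has the same sequence as the mRNA, apart from T versus U.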


Viruses are generally not considered to be alive, and they are the exception to a lot of the "rules" that life follows. For example, retroviruses use RNA to make DNA, which reverses part of the central dogma. Viruses are classified into seven different groups (the Baltimore classification) by how they store their genomic information:

Class I: dsDNA (double-stranded DNA) viruses - these carry their information the way we do - an example is herpes simplex virus (causes herpes)

Class II: ssDNA (single-stranded DNA) viruses - like us, but with only one strand and no complementary strand - an example is canine parvovirus (can be fatal in dogs)

Class III: dsRNA (double-stranded RNA) viruses - not really like us at all (our cells freak out when they see dsRNA, treating it as a sign of infection) - an example is rotavirus (causes gastroenteritis, often called the stomach flu)

Class IV: (+)ssRNA viruses - the genome looks like mRNA, the intermediate between our genome and protein (the plus sign indicates that it has the same sense as mRNA, so it reads "forward" and can be translated directly into protein) - an example is hepatitis C virus (can cause liver cancer)

Class V: (-)ssRNA viruses - not like us at all (the negative sign indicates that the RNA is the complement of the message, so it has to be copied into a forward-reading strand that can then be translated into protein) - an example is Ebola virus (causes hemorrhagic fever)

Class VI: ssRNA-RT (reverse-transcribing, RNA -> DNA) viruses - the DNA copy of their genome gets inserted into ours, so part of our genome looks just like them - an example is human immunodeficiency virus (causes AIDS)

Class VII: dsDNA-RT viruses - even though these viruses start out with DNA, they replicate through an RNA intermediate that is reverse transcribed back into DNA (like running a note through an ASCII encoder to store it, and then having to run it through a decoder again to read what you wrote) - an example is hepatitis B virus (can also cause liver cancer)
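The seven classes above fit naturally into a lookup table, and the difference between (+) and (-) sense RNA can be shown with a reverse complement. The following is only a sketch: the names `BALTIMORE_CLASSES`, `reverse_complement`, and `readable_message` are invented for this example, and the sequences are toy data rather than real viral genomes.

```python
# Genome type and example virus for each class, as listed above.
BALTIMORE_CLASSES = {
    "I": ("dsDNA", "herpes simplex virus"),
    "II": ("ssDNA", "canine parvovirus"),
    "III": ("dsRNA", "rotavirus"),
    "IV": ("(+)ssRNA", "hepatitis C virus"),
    "V": ("(-)ssRNA", "Ebola virus"),
    "VI": ("ssRNA-RT", "human immunodeficiency virus"),
    "VII": ("dsDNA-RT", "hepatitis B virus"),
}

# Base-pairing partners in RNA: A pairs with U, C pairs with G.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the complementary strand, written in the forward direction."""
    return rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def readable_message(genome, baltimore_class):
    """Return the translatable (+) sense RNA for a class IV or class V genome."""
    genome_type, _example = BALTIMORE_CLASSES[baltimore_class]
    if genome_type == "(+)ssRNA":
        # Already message-sense: it can be translated as it is.
        return genome.upper()
    if genome_type == "(-)ssRNA":
        # Must first be copied into its complement to become a message.
        return reverse_complement(genome)
    raise ValueError("this sketch only handles classes IV and V")


plus_genome = "AUGUUUGGCAAAUAA"                 # toy (+) sense sequence
minus_genome = reverse_complement(plus_genome)  # the matching (-) strand
print(minus_genome, readable_message(minus_genome, "V"))
```

Both toy genomes end up as the same message, which is the point of the plus and minus labels: they describe the same information stored in two complementary forms.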


I'm not going to go into it right now, but viruses can have a circular genome (like bacteria) or a genome broken into segments (influenza A has eight).

The last thing that I am going to mention right now is that viruses are usually very efficient with their genomes. They can have overlapping genes, sequences that are deliberately altered in order to achieve some function, or even genomes that lack some of the information they need (such viruses rely on other viruses to help them replicate).
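One way to picture overlapping sequences is to read the same stretch of RNA in different reading frames, meaning you start counting codons at a different base. The sketch below uses a made-up 12-base sequence and a hypothetical helper, `codons_in_frame`; it is not taken from any real virus.

```python
def codons_in_frame(rna, frame):
    """Split an RNA sequence into complete codons, starting at offset 0, 1, or 2."""
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


toy_rna = "AUGGCAUGCUAA"  # made-up sequence for illustration

frames = {frame: codons_in_frame(toy_rna, frame) for frame in range(3)}
for frame, codons in frames.items():
    print(frame, codons)
```

In this toy sequence, frame 0 begins with the start codon AUG and ends with the stop codon UAA, while frame 2 contains a second AUG that sits across two frame 0 codons. The same bases can therefore belong to two different messages, which is how a small genome can pack in more than one gene per stretch of sequence.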