Cell Biology/Genes/Gene expression

Gene expression is the process by which the information that DNA holds in a cell is decoded. It is the expression of a gene that gives rise to a functional product, most often a protein.

How does gene expression occur?

Gene expression is a complex process, regulated by a series of mechanisms.

Gene expression begins with transcription of DNA, giving rise to messenger RNA (mRNA). This is performed by the enzyme RNA polymerase, which produces the mRNA. In prokaryotes, the mRNA is bound by several ribosomes at once, which translate it into protein.
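As a rough illustration of transcription, the sketch below builds an mRNA string from a DNA template strand by base pairing. The function name `transcribe` and the nine-base template are assumptions made for the example; the real enzyme also needs a promoter and a termination signal, which are left out here.

```python
# Toy model of transcription: each base of the DNA template strand
# (read 3'->5') is paired with its complementary RNA base.
COMPLEMENT_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_strand: str) -> str:
    """Return the mRNA (written 5'->3') for a DNA template strand read 3'->5'."""
    return "".join(COMPLEMENT_RNA[base] for base in template_strand.upper())

# Hypothetical template strand, chosen only for illustration.
mrna = transcribe("TACAAAATT")
print(mrna)  # AUGUUUUAA
```

Note that uracil (U) takes the place of thymine (T) in the RNA product.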

In eukaryotes, the RNA first made from DNA is immature and is called pre-mRNA. Pre-mRNA loses its non-coding sections (called introns) and matures into mRNA. The mRNA then binds to ribosomes, either free in the cytosol or on the rough endoplasmic reticulum (RER), where translation happens. Translation is complete when a new polypeptide has been formed. The genetic code specifies the order of amino acids in the polypeptide, but it gives no direct clue about its three-dimensional structure. The three-dimensional structure arises from folding and from post-translational processes.
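The removal of introns can be pictured as cutting known stretches out of a string. The sketch below is only a toy model: the function `splice`, the intron coordinates and the short sequence are all assumptions for illustration, and in the cell the spliceosome recognises intron boundaries itself rather than being handed positions.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) index pairs, end exclusive."""
    mature = []
    position = 0
    for start, end in sorted(introns):
        mature.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                           # skip over the intron
    mature.append(pre_mrna[position:])           # keep the final exon
    return "".join(mature)

# Hypothetical pre-mRNA: exon + intron + exon.
pre = "AUGGCA" + "GUAAGUCCAG" + "UUUUAA"
print(splice(pre, [(6, 16)]))  # AUGGCAUUUUAA
```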

Translation follows transcription: the protein synthesis machinery reads out the message that the mRNA holds and builds the corresponding polypeptide.

Some DNA sequences are known not to code for proteins. Yet they work as regulatory sequences in a cell. These sequences can increase the expression of a gene (called "enhancers") or decrease it (called "silencers"; the proteins that bind them are called "repressors"). Regulatory proteins, sometimes together with a substrate or hormone, bind to these sequences.

In multicellular organisms, only certain cells produce a given type of protein. For example, haemoglobin is encoded in every nucleated cell of a mammal (humans included), but only the precursors of red blood cells express it (mature red blood cells cannot, because they lose their nucleus). However, the regulatory sequences are present in every nucleated cell of a mammal.

Genetic Information

In nature, there is information found in all living cells. Different cultures have studied this information and used various recording techniques to display it. Ancient Egyptians, in particular, referred to this information and its records as "provider of attributes" and determined it ||| to mean several, one of the earlier instances in human history of recording something that was known about nature.

Other signs often accompanied Egyptian writings on the source of this "information key of life". Among them were double, water and wick of twisted flax. But the most central one for modern science was, of course, the snake-like determinative that meant a worm or serpent at the limit of life. This limit, water, was "N", meaning that something or someone is: the essence, which would later be rendered in Latin as "esse" or "ens" and in today's English as "essence".

In anthropology, the language of gene expression is rooted in the sources of knowledge that Odhiambo Siangla of Kenya has called "rieko" and Jeremy Narby of Switzerland has termed the "cosmic serpent". Siangla and Narby are not only experts in cultures but are also trained in communication and expression. For both, the key has been the "three-letter word".

The alphabet of the three-letter word in cell biology consists of the organic bases: adenine (A), guanine (G), cytosine (C) and thymine (T). It is the triplet combinations of these bases that make up the ‘dictionary’ that molecular biology calls the genetic code.
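To see how large this dictionary of three-letter words is, the short sketch below lists every triplet that can be formed from the four bases. The variable names are assumptions for the example; the count itself follows from 4 x 4 x 4.

```python
from itertools import product

BASES = "AGCT"  # adenine, guanine, cytosine, thymine

# Every ordered combination of three bases, with repetition allowed.
triplets = ["".join(t) for t in product(BASES, repeat=3)]

print(len(triplets))  # 64
print(triplets[:4])   # ['AAA', 'AAG', 'AAC', 'AAT']
```

Sixty-four triplets are more than enough to cover 20 amino acids, which is why several triplets can stand for the same amino acid.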

This coding system allows genetic information to be codified for transmission; at the molecular level, that information is conveyed through genes.

What is a gene? A gene is a region of DNA that produces a functional RNA molecule. A region of DNA that produces no functional RNA does not carry transmissible information for protein synthesis. Genes come in various sizes. One of the first recorded attempts to imagine the very small was the Horus Eye, which is also a pristine idea of limit. Today we talk about bases. The insulin gene, for example, has 1.7 x $10^{3}$ , about 1,700 nucleotides. The gene for the low-density lipoprotein (LDL) receptor has 4.5 x $10^{4}$ , about 45,000 nucleotides. The dystrophin gene, as another example, has around 2.0 x $10^{6}$ , approximately 2,000,000 nucleotides.

Now, the introns. The noncoding regions within a gene are called introns, meaning “intervening sequences”. In many genes, introns make up the greater part of the nucleotide sequence. The coding regions are called exons, meaning “expressed sequences”. They often constitute a minority of the nucleotide sequence of a gene, and they instruct the cellular machinery in forming proteins from amino acids.

The expression of genetic information is achieved through proteins, in particular the enzymes. Even in ancient times, enzymes were put to good use. Enzymes catalyze chemical reactions of the anabolic kind, that is, the building up of cellular material, and those of the catabolic kind, the breaking down of food. The two processes are collectively termed metabolism. What more can we add about proteins?

We can further say that proteins are heteropolymers built from amino acids. There are 20 amino acids used in synthesizing natural proteins. A protein may consist of many, in fact several hundred, amino acid residues. The number of different proteins that can be made from combinations of amino acids is therefore essentially unlimited. Mathematics explains it well: a chain of n residues can be ordered in $20^{n}$ ways. This diverse set of proteins, with their forms and functions, is specified by means of the coding system explained below.
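The arithmetic behind "essentially unlimited" can be checked directly. In the sketch below, the function name `possible_sequences` and the chain length of 300 residues are assumptions chosen for illustration; the only fact used is that each position in the chain can hold any one of the 20 amino acids.

```python
def possible_sequences(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct chains of the given length over the alphabet."""
    return alphabet_size ** length

print(possible_sequences(2))              # 400 possible dipeptides
print(len(str(possible_sequences(300))))  # 391 digits for a 300-residue chain
```

A number with 391 digits is far beyond anything that could ever be made in practice, so only a tiny fraction of possible sequences exists in nature.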

Genetic information flows in one direction, from DNA to protein, with messenger RNA (mRNA) as the intermediate. First, the genetic information encoded in DNA is copied into an RNA molecule. This is called transcription (TC) of the information. Then the information is converted into protein, which is called translation (TL). This concept of information flow is called the Central Dogma of molecular biology. The Central Dogma is the fundamental theme in our exploration of gene expression.

To complete the picture, we can add two further aspects of information flow. The first is the duplication of the genetic material, which occurs prior to cell division. Here DNA is copied into DNA, a DNA-to-DNA transfer known as DNA replication. The second applies to some viruses that have RNA instead of DNA as their genetic material: here we speak of reverse transcription (RT), which produces a DNA molecule as a copy of the viral RNA genome.
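Reverse transcription can be sketched in the same spirit as ordinary base pairing, only with RNA as the template and DNA as the product. The function `reverse_transcribe` and the six-base RNA below are assumptions for the example; the enzyme's priming and second-strand synthesis are not modelled.

```python
# Toy model of reverse transcription: RNA bases are paired with DNA bases.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna: str) -> str:
    """Return the complementary DNA strand, written 5'->3'."""
    # The copy runs antiparallel to the template, hence the reversal.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))

# Hypothetical fragment of a viral RNA genome.
print(reverse_transcribe("AUGGCA"))  # TGCCAT
```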

In other words, genetic information, whether historically traced worldwide (Narby, 1998) or particularly assigned to ancient Africa (Siangla, 1997), involves gene expression. Both DNA and RNA are polynucleotides, in which nucleotides are the monomer building units, each composed of three basic subunits: a nitrogenous base, a sugar, and phosphoric acid. Genetic information is contained in DNA. The genetic code expresses the connection between the polynucleotide alphabet of four bases and the 20 amino acids. One strand of the parental DNA molecule dictates the amino acid sequence for protein production.

In the next few postings, we will discuss in relative detail how the polymerization of an amino acid sequence is directed by the base sequence of messenger RNA.

For the moment, though, let us note that protein synthesis is an expression of genetic information. Protein synthesis is the cellular procedure, as we have said, of making proteins, and it involves two main processes: transcription and translation. The two processes mean that the direction of synthesis is from DNA to RNA and then from RNA to protein, respectively. Is this true of all organisms?

Yes, with one exception noted above: some viruses have RNA instead of DNA as their initial information source. In these viruses reverse transcription (RT) occurs, and DNA is copied from the viral RNA genome. It is nevertheless true that in all organisms the rules relating the nucleotide sequence in messenger RNA to the amino acid sequence in proteins (the genetic code proper) are the same, with a few exceptions, such as those found in mitochondria.

Building on this, we can see that each amino acid is denoted by a three-nucleotide sense codon. For example, UUU specifies phenylalanine, UCU specifies serine and GCA specifies alanine, while UAC and UAU both specify tyrosine. We will say more about tyrosine when expanding cell biology into the study of melanin.
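The codon examples above can be written as a small lookup table. The sketch below holds only the five codons named in the text, so it is a deliberately partial table, not the full genetic code; the function name `translate` and the test message are assumptions for illustration.

```python
# Partial codon table: only the codons mentioned in the text.
CODON_TABLE = {
    "UUU": "Phe",  # phenylalanine
    "UCU": "Ser",  # serine
    "GCA": "Ala",  # alanine
    "UAC": "Tyr",  # tyrosine
    "UAU": "Tyr",  # tyrosine
}

def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time and look up each codon."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    # Codons missing from this partial table are shown as "?".
    return [CODON_TABLE.get(codon, "?") for codon in codons]

print(translate("UUUUCUGCAUACUAU"))  # ['Phe', 'Ser', 'Ala', 'Tyr', 'Tyr']
```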

Here now are the remaining three properties of the genetic code. The first is that it is contiguous: codons do not overlap, and they are not separated by spacers. The second is that it is degenerate: some amino acids have more than one codon, as tyrosine illustrates in the paragraph above. Finally, it is unambiguous: each codon specifies only one amino acid.
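These three properties can be made concrete with a small sketch. It reuses a partial table of the five codons named earlier, which is an assumption made to keep the example short; the helper names are hypothetical. A dictionary models unambiguity naturally, since each key (codon) maps to exactly one value (amino acid).

```python
from collections import defaultdict

# Partial codon table (five codons only); one amino acid per codon.
CODON_TABLE = {"UUU": "Phe", "UCU": "Ser", "GCA": "Ala", "UAC": "Tyr", "UAU": "Tyr"}

def read_contiguous(mrna: str) -> list[str]:
    """Contiguous reading: steps of three, no overlap and no spacers."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

def codons_by_amino_acid(table: dict[str, str]) -> dict[str, list[str]]:
    """Degeneracy: group the codons that specify the same amino acid."""
    groups = defaultdict(list)
    for codon, amino_acid in table.items():
        groups[amino_acid].append(codon)
    return dict(groups)

print(read_contiguous("UACUAUGCA"))              # ['UAC', 'UAU', 'GCA']
print(codons_by_amino_acid(CODON_TABLE)["Tyr"])  # ['UAC', 'UAU']
```

Tyrosine appears with two codons, while no codon appears under two amino acids.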