General Genetics/Print version


A section of DNA. The sequence of the plate-like units (nucleotides) in the center carries the information.

Welcome to Genetics - the study of heredity. More precisely, it is the study of how living organisms inherit characteristics or traits from their ancestors.

It has only been in the past one hundred years that we have begun to understand how the information behind physical characteristics is transferred from parent to child.

In genetics, a feature of an organism is called a trait. Some traits are features of an organism's physical appearance, for example, a person's eye-color, height or weight. There are many other types of traits and these range from aspects of behavior to resistance to disease. Traits are often inherited, for example tall and thin people tend to have tall and thin children.

Other traits come from the interaction between inherited features and the environment. For example, a child might inherit the tendency to be tall, but if there is very little food where they live and they are poorly nourished, they will still be short. The way genetics and environment interact to produce a trait can be complicated: for example, the chances of somebody dying of cancer or heart disease seem to depend on both their family history and their lifestyle. Genetics helps to examine this interplay, which underlies the often-asked "Nature vs. Nurture" question.

Genetic information is carried by a long molecule called deoxyribonucleic acid, DNA for short. DNA consists of two long chains of nucleotides twisted into a double helix and joined by hydrogen bonds between the complementary bases: adenine pairs with thymine, and cytosine pairs with guanine. The sequence of these bases encodes the genetic information, which is copied and inherited across generations.

Traits are carried in DNA as instructions for constructing and operating an organism. These instructions are contained in segments of DNA called genes. DNA is made of a sequence of simple units, with the order of these units spelling out instructions in the genetic code. This is similar to the order of letters spelling out words. The organism "reads" the sequence of these units and decodes the instruction.

Not all the genes for a particular instruction are exactly the same. Different forms of one type of gene are called different alleles of that gene. As an example, one allele of a gene for hair color could carry the instruction to produce a lot of the pigment in black hair, while a different allele could give a garbled version of this instruction, so that no pigment is produced and the hair is white.

Mutations are either random or induced events that alter the sequence of a gene and can produce new or different traits. There are some exceptions to this, which will be discussed later. A new trait could arise, for example, when a mutation turns an allele for black hair into an allele for white hair. The appearance of new traits is important in evolution.

The Central Dogma

Basics of Heredity

Heredity is the passage of traits from a parental (P) generation to its offspring. Genetics is intimately related to reproduction, and a knowledge of reproduction will facilitate your learning of genetics.


Reproduction is the continuation of life. It transmits the hereditary information found in all cells of an organism, namely DNA. There are two types of reproduction, asexual and sexual. Asexual reproduction is the propagation of a single organism: that organism copies itself, and, apart from new mutations, all its offspring are genetically identical to it. Sexual reproduction promotes variation and requires two organisms to fuse gametes (reproductive cells carrying genetic information) to create offspring that are different from their parents.

Genetic information is packaged into units known as chromosomes. A chromosome is made up mostly of DNA and the proteins that give it its structure. Chromosome is a very general term. Humans have linear chromosomes, which take on the familiar X shape only after they have been copied and condensed for cell division; E. coli has a single circular chromosome and may also carry small, separate circles of DNA known as plasmids. From these chromosomes, and through DNA, cells can construct an entire organism. It is a very efficient and elegant system that oversees the life of all organisms, from single cells to humans.

In sexual reproduction, two organisms fuse gametes to make up the genetic information for their offspring. Examples of gametes are sperm and eggs. Each gamete contains half the genetic information its parent had. This means gametes are haploid (they carry a single set of chromosomes), which is represented by 1n. Two gametes must fuse to make a diploid (two sets of chromosomes) zygote, represented by 2n. The organism develops from the zygote.


Chromosomes are the carriers of genetic information. Most mammals are diploid: their chromosomes exist in pairs, carrying homologous genetic information. That is, the two chromosomes of a pair carry versions of the same genes, and those versions may differ. Let's say there was a plant with only one pair of chromosomes. One chromosome has the allele for blue stem color and the other chromosome has an allele for green stem color (an allele is an alternate version of a gene, in this case the gene for stem color). In the end, only one of these alleles is expressed (more on this later), but both are present in this organism.

Genetic Principles

How a living organism is built and functions is determined and governed by genes. Understanding how genes work may enable researchers to:

  • detect and cure genetic illnesses.
  • determine an organism's features and behaviors.
  • create a new organism.

The relationships between genes and features are very complex. Currently, we are unsure whether it will ever be possible to change only one feature of an organism by changing one gene or a set of genes. Even if features can be genetically altered, any such attempt is likely to be accompanied by other changes, although these may be small or negligible. Some changes are gradual, while changes in gene expression can result in rapid transformations in the physiological state of an organism.

Gregor Mendel and Peas

Gregor Mendel was an Austrian monk. He is credited as the father of modern genetics. While planting and harvesting pea plants at his monastery, he noticed patterns of traits in the plants. Most pea plants had green pods, but some had yellow pods. Some had yellow seeds while others had green seeds. Stem length, petal color, pod shape, and location of flowers all seemed to exhibit a pattern of inheritance.

Mendel went on to breed pea plants to see how these traits behaved. From these experiments, Mendel formulated three laws that govern how traits are passed on from parents to offspring.

Mendel's Laws

Law of Dominance - In a cross between parents with contrasting traits, only one trait appears in the F1 generation. This is the dominant trait; the other is recessive.

Law of Segregation - During gamete formation, the two factors (alleles) responsible for each trait separate, so each gamete carries only one allele for each trait.

Law of Independent Assortment - When dihybrid plants are crossed, the factors for one trait are distributed independently of the factors for other traits, so all combinations of the traits can be found among the offspring.

Chromosomes, alleles and Mendel’s laws: the behavior of homologous chromosomes during meiosis can account for the segregation of the alleles at each genetic locus into different gametes. The alleles of two or more genes located on different chromosomes assort independently of one another. In Mendel’s experiments, segregation and independent assortment during meiosis in the F1 generation give rise to the F2 phenotype ratios that Mendel observed.

Law of Dominance

When two parents with contrasting forms of a trait are crossed, one form (the recessive) is masked and does not show, while the other (the dominant) does. Mendel crossed two pea plants of contrasting height, one tall (TT) and the other short, or dwarf (tt). He found that all plants of the first generation (F1) were tall.

Dominant and Recessive Genes

When Gregor Mendel studied pea plants, he studied one trait (height, flower color, etc.) at a time. As he studied different traits, he found that some versions of a trait tended to dominate over others. For example, he found that some plants produced wrinkled peas (r) instead of the round peas (R) typically observed, a difference eventually attributed to a mutation affecting starch production. But not every plant that carries the mutation produces wrinkled peas. Because many organisms are diploid, which means they have two homologous chromosomes in each pair, they have two copies of each locus (one on each chromosome). If one chromosome has the starch mutation (r) but the other chromosome doesn't (R), then the peas produce starch normally and have the rounded shape. In Mendelian genetics, this means that R is dominant over r.

Each individual pea plant has one of three possible genotypes: "RR," "Rr," or "rr."

This is a case of complete dominance, where the recessive trait is not seen as long as the organism has at least one copy of the dominant allele. This means that there are only two phenotypes: in this case, "round peas" or "wrinkled peas." Plants with either RR or Rr will produce round peas, whereas only rr plants will produce wrinkled peas.
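The genotype and phenotype counts for a cross can be tallied with a small Punnett-square sketch. The helper names `cross` and `phenotype` are illustrative only and do not come from the text; the sketch assumes complete dominance of R over r, as described above.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return genotype counts for a one-locus cross, e.g. cross("Rr", "Rr")."""
    # Each parent contributes one allele; sorting makes "rR" and "Rr" identical.
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))


def phenotype(genotype):
    """Complete dominance: a single R allele is enough for round peas."""
    return "round" if "R" in genotype else "wrinkled"


genotypes = cross("Rr", "Rr")
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))   # genotype ratio 1 RR : 2 Rr : 1 rr
print(dict(phenotypes))  # phenotype ratio 3 round : 1 wrinkled
```

Crossing two heterozygotes gives all three genotypes but only two phenotypes, because RR and Rr plants look alike.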

The Molecular Basis of Dominance

We have said that organisms have "pairs" or "sets" of chromosomes, but what does that mean? To explain this point, let's consider humans, which have 23 pairs of chromosomes (for a total of 46 chromosomes). Within each pair, one chromosome was inherited from the maternal parent and the other from the paternal parent. When we say these chromosomes are "homologous," it does not mean they are genetically identical -- after all, most people do not have genetically identical parents. For chromosomes to be homologous, it simply means that the DNA sequences on each chromosome serve the same function, if not in the exact same way.

Any specific location on a chromosome can be referred to as a "locus" (pl: loci). Homologous chromosomes have the same loci, but they do not necessarily have the same DNA sequence at a given locus. Each unique variation of DNA sequence that can be found at a locus is an allele. In cases of complete dominance, the recessive allele is often nonfunctional. What makes Mendel's round and wrinkled peas a case of complete dominance is that pea plants can produce starch normally as long as they have at least one copy of the allele for normal starch formation.

Alternative Patterns of Inheritance

Not all loci show this simple dominance. If we represent phenotype on a plot then Complete Dominance would be like this:

AA/Aa                                          aa     Complete Dominance

Other types are:

AA                    Aa                       aa     No Dominance
AA     Aa                                      aa     Incomplete Dominance
Aa      AA                                     aa     Over Dominance

The Arbitrary Nature of Recessiveness and Dominance

Mendelian Inheritance

Gregor Johann Mendel was a monk in the Augustinian monastery in Brünn (now Brno, Czech Republic). In 1854 he began the experiments which started modern genetics. His work with garden peas, Pisum sativum, was vital to our understanding of inheritance. He is known as the Father of Genetics.

Mendel's Experiment

Mendel's first step was to establish pure-breeding strains of peas. The traits he studied included:

  • pea colour
  • height of the pea plants
  • whether the peas were wrinkled or smooth (round)

Mendel crossed the pure-breeding parental generation (designated P). He found that the first generation (F1) was phenotypically identical to just one of the parental types. Mendel then crossed his F1 generation with itself. He found that the F2 generation showed a surprising pattern: three quarters were like the F1 generation, while the remaining quarter were like the other parent.

From this Mendel realised that there were two versions of each locus, one of which expressed dominance over the other. He called this Biparticulate Inheritance. If a gene followed this 3:1 pattern, it was said to be segregating normally.

By looking at multiple genes, Mendel showed that they were not linked to each other and that each locus he studied had no influence over the others. He called this independent assortment.
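Independent assortment can be illustrated by enumerating a dihybrid cross. The following is only a sketch: the allele symbols (R/r for seed shape, Y/y for seed colour, with round and yellow dominant) and the helper names are assumptions chosen for the example.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Gametes of a two-locus genotype such as "RrYy": one allele per locus."""
    locus1, locus2 = genotype[:2], genotype[2:]
    return [a + b for a, b in product(locus1, locus2)]


def phenotype(gamete1, gamete2):
    """Phenotype of the offspring formed by two gametes (complete dominance)."""
    shape = "round" if "R" in (gamete1[0], gamete2[0]) else "wrinkled"
    colour = "yellow" if "Y" in (gamete1[1], gamete2[1]) else "green"
    return shape, colour


f1 = "RrYy"
f2 = Counter(phenotype(g1, g2) for g1, g2 in product(gametes(f1), gametes(f1)))
for pheno, count in sorted(f2.items(), key=lambda item: -item[1]):
    print(pheno, count)
```

Because each locus sorts into gametes independently of the other, the sixteen equally likely gamete pairings fall into the 9:3:3:1 phenotype ratio.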

We can also learn a lot by studying cases where the Mendelian laws do not hold. For instance, if a gene isn't segregating normally, it may be sex-linked. If two genes aren't assorting independently, they are probably on the same chromosome.

Mendel's laws are the first step to understanding genetics: they lay down the basic concepts of inheritance.


In biology the genome of an organism is the whole hereditary information of an organism that is encoded in the DNA (or, for some viruses, RNA). This includes both the genes and the non-coding sequences. The term was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg, Germany, as a portmanteau of the words gene and chromosome.

More precisely, the genome of an organism is a complete DNA sequence of one set of chromosomes; for example, one of the two sets that a diploid individual carries in every somatic cell. The term genome can be applied specifically to mean the complete set of nuclear DNA (i.e., the nuclear genome) but can also be applied to organelles that contain their own DNA, as with the mitochondrial genome or the chloroplast genome. When people say that the genome of a sexually reproducing species has been "sequenced," typically they are referring to a determination of the sequences of one set of autosomes and one of each type of sex chromosome, which together represent both of the possible sexes. Even in species that exist in only one sex, what is described as "a genome sequence" may be a composite from the chromosomes of various individuals. In general use, the phrase genetic makeup is sometimes used conversationally to mean the genome of a particular individual or organism. The study of the global properties of genomes of related organisms is usually referred to as genomics, which distinguishes it from genetics which generally studies the properties of single genes or groups of genes.


DNA sequences are incredibly long, but they take up very little space within a cell. This feat is accomplished through tightly-controlled DNA condensation.

The first step of DNA condensation involves a class of proteins called histones. DNA molecules are negatively charged due to their sugar-phosphate backbones, and histones are positively charged due to their high concentration of positively-charged amino acids (lysine, arginine, and histidine). Because of these opposite charges, the histones act as spools around which the DNA strands wind. This forms a structure described as beads-on-a-string that is then further condensed to form chromosomes. Because of this, the chromosomal material is referred to as chromatin: not DNA alone, but a combination of DNA and protein.

The DNA Molecule


The DNA helix

DNA stands for deoxyribonucleic acid.

Its structure was described by James D. Watson and Francis H.C. Crick in 1953, drawing on X-ray diffraction data from Rosalind Franklin, but DNA itself had been known far longer: Friedrich Miescher first isolated it in 1869 and published the finding in 1871.

The DNA molecule has a polymer backbone of deoxyribose molecules (ribose in RNA), a five-carbon sugar, connected together by phosphate groups (see phosphorylation).

The sugar also connects to a nucleobase (or simply called "base", for short). There are 5 different bases used for coding, 4 of which are used in DNA (the other, uracil, is exclusive to RNA).

The four bases are adenine, guanine, cytosine and thymine. These are represented by the letters A, G, C, & T and carry all information found in the DNA (see nucleic acid nomenclature).

The structure of DNA is a double helix. This means that there are two strands coiled around each other. The molecule is bonded together by the bases with hydrogen bonds. Guanine pairs with Cytosine by three hydrogen bonds while Adenine bonds with Thymine by two hydrogen bonds (see base pair).

Watson and Crick's insight into the double-helical structure of the DNA molecule built on Erwin Chargaff's observation that the bases of each pair always occur in equal amounts: adenine with thymine, and guanine with cytosine.

Structure of the DNA Molecule


DNA is generally found as a double helix, composed of two chains, or strands, of nucleotides held together by hydrogen bonds. A good analogy would be a spiral staircase, with the sides of the staircase being the two sugar-phosphate strands and the steps being the base pairs held together by hydrogen bonds.


Nucleotides consist of three parts: a sugar, a nitrogenous base, and a monophosphate group. The sugar in DNA is the five-carbon aldose deoxyribose, which has two nucleophilic hydroxyl groups: one at the 5' carbon and the other at the 3' carbon. In a nucleotide, the hydroxyl group at the 5' carbon is replaced with a monophosphate group (PO₄³⁻). Nucleotides are joined by bonds between the phosphate group at the 5' carbon on one nucleotide and the hydroxyl group at the 3' carbon on the other nucleotide. These bonds are known as phosphodiester bonds, and the string of sugar monomers joined by phosphodiester bonds is typically referred to as the sugar-phosphate backbone.

At the 1' carbon, each sugar molecule is joined to a nitrogenous base. There are four nitrogenous bases found in DNA: guanine (G), cytosine (C), thymine (T), and adenine (A). The purines A and G are composed of two rings, whereas the pyrimidines C and T are composed of one ring.

Base Pairing

The two DNA strands in a double helix are held together by hydrogen bonds between pairs of nitrogenous bases. However, not just any two nitrogenous bases can form hydrogen bonds. Two purines are too big to fit in the space between the two strands, whereas two pyrimidines would be too far apart for hydrogen bonds to form. One purine and one pyrimidine are the right size to fit between the strands and form hydrogen bonds. Adenine and thymine each have one hydrogen bond donor and one hydrogen bond acceptor, so two hydrogen bonds form between them. Guanine has two hydrogen bond donors and one hydrogen bond acceptor, whereas cytosine has two hydrogen bond acceptors and one hydrogen bond donor, so three hydrogen bonds form between them. The difference in the number of hydrogen bonds means that less energy is required to separate DNA strands at A-T sites than at C-G sites.

Purine     Pyrimidine   Hydrogen bonds
Adenine    Thymine      2
Guanine    Cytosine     3
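The pairing rules can be written down as a small lookup table. This is a minimal sketch with made-up helper names (`complement`, `hydrogen_bonds`) and a made-up example sequence; it only encodes the A-T and G-C rules and the bond counts given above.

```python
# Watson-Crick partners and the number of hydrogen bonds in each pair.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand):
    """Base-by-base complement of a DNA strand, in the same left-to-right order.

    The real partner strand runs antiparallel, so it is usually written
    reversed; that step is left out here to keep the pairing visible.
    """
    return "".join(PARTNER[base] for base in strand)


def hydrogen_bonds(strand):
    """Total number of hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand)


seq = "ATGCGC"
print(complement(seq))      # TACGCG
print(hydrogen_bonds(seq))  # 16
```

A stretch rich in G and C has more hydrogen bonds per base pair than one rich in A and T, which is why it takes more energy to separate.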

Gene Expression

Gene expression is the cell's control over which gene products to make, and which not to make, at a specific time. Why would a cell want to do this? The example we'll use to explore prokaryotic gene expression is the lac operon. The lac operon controls the transcription of the genes needed to make an enzyme that breaks down lactose. The enzyme splits lactose into glucose and galactose, and the glucose is used in cell respiration to make energy for the cell. The cell takes up lactose from the environment and breaks it down. But if there is already glucose in the environment, what is the point of taking up lactose when it requires an extra step to make it useful? In that case the cell conserves energy and protein by not making the enzyme required for lactose metabolism.
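The decision described above can be caricatured as a two-input switch. This sketch is a deliberately simplified on/off model, not a description of the molecular mechanism; the function name `lac_operon_on` is made up for the example, and real expression levels are graded rather than all-or-nothing.

```python
def lac_operon_on(lactose_present, glucose_present):
    """Simplified rule: make the lactose enzyme only when it is worth it.

    The enzyme is useful only if there is lactose to break down, and it is
    wasteful if glucose is already available.
    """
    return lactose_present and not glucose_present


for lactose in (False, True):
    for glucose in (False, True):
        state = "ON" if lac_operon_on(lactose, glucose) else "off"
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```

Only one of the four combinations, lactose without glucose, switches the operon on in this model.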


Introns are intervening RNA sequences that do not code for protein. They lie between coding regions of RNA known as exons. During eukaryotic mRNA processing, the introns are excised and the exons are spliced together to form the mature mRNA transcript. One pre-mRNA can give rise to different mature mRNA transcripts through alternative or differential splicing, wherein the exons are joined in different combinations.
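Splicing can be pictured as cutting a string at known positions and joining the pieces that remain. The sequence and exon coordinates below are invented for illustration (uppercase for exons, lowercase for introns), and `splice` is a hypothetical helper, not a standard function.

```python
def splice(pre_mrna, exons):
    """Join the exon segments, given as (start, end) index pairs, in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy pre-mRNA: three exons separated by two introns (made-up sequence).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUGGA" + "guaagcccacag" + "CCAUAA"
exons = [(0, 6), (18, 24), (36, 42)]

# Constitutive splicing keeps every exon.
print(splice(pre_mrna, exons))
# Alternative splicing: the middle exon is skipped.
print(splice(pre_mrna, [exons[0], exons[2]]))
```

The same pre-mRNA yields two different mature transcripts depending on which exons are retained.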

How Self-Splicing led to the RNA World Hypothesis

In the protozoan Tetrahymena, a rare form of "self-splicing" takes place wherein no proteins are required. The RNA folds into a secondary structure which excises introns. This was discovered by Tom Cech, who won the Nobel Prize in 1989 for his discovery of catalytic RNA.

The discovery of catalytic RNA led to the RNA World Hypothesis, which postulated that earlier forms of life may have relied solely on RNA to store genetic information and catalyze chemical reactions. However, many scientists believe that RNA is too unstable (due to its highly reactive 2'-OH group) to reliably give rise to life.

The Lariat Model of Intron/Exon Splicing

Small nuclear ribonucleoproteins (snRNPs) are formed from a combination of small nuclear RNA (snRNA) and helper proteins. These join with other proteins to form a spliceosome assembly.

Pre-mRNA has a guanine (G) nucleotide at the 5' end of each intron. Towards the middle of the intron is an adenine (A) nucleotide, the branch point. The spliceosome folds the intron so that the G and A are next to each other. The oxygen of the A's 2' hydroxyl group attacks the 5' phosphate of the G. The G detaches from the 3' end of the upstream exon and forms a loop (the lariat) with the A. The hydroxyl group at the newly freed 3' end of the upstream exon then attacks the phosphate group at the beginning of the next exon. This joins the two exons, and the intron lariat is completely excised from the mRNA.


The Mechanism of Polyadenylation

There are a number of factors that must bind to the pre-mRNA for 3' cleavage and polyadenylation. Poly A polymerase (PAP) catalyzes poly A synthesis. Uniquely amongst polymerases, it does not require a template for polymerization. Cleavage and Polyadenylation Specificity Factor (CPSF) binds to the CPSF binding site (characteristic sequence: AAUAAA). PAP attaches to CPSF. Cleavage Stimulation Factor (CstF) binds the CstF binding site (characteristic sequence: GUNU, where N can stand for any nitrogenous base). The pre-mRNA folds so that CPSF and CstF can interact. Cleavage Factor I and II (CFI/II) bind to the mRNA between the CPSF and CstF binding sites. Cleavage occurs at this site. Poly A binding proteins bind to polyadenylated regions after PAP synthesizes them.

Functions of Polyadenylation

Addition of the poly-A tail facilitates translation and transportation of mRNA out of the nucleus.

Recombinant DNA Cloning Technology


This term refers to the process of transferring a DNA fragment of interest from one organism to a self-replicating genetic element, e.g. a bacterial plasmid. The DNA of interest can then be propagated in a foreign host cell. This technology was introduced in the 1970s; since then it has become widely used and remains common practice in molecular biology labs.


Transposition is the integration of transposable elements into the genome. Transposable elements are DNA segments that jump around the genome and integrate themselves into different regions.


The first description of mobile genetic elements in a genome was made by Barbara McClintock working at Cold Spring Harbor in the 1950s. While attempting to explain the odd phenotypic behavior of mosaic color striations on corn kernels, she came to the conclusion that there were genetic elements in corn that could move among the chromosomes. Although her experimental support was strong, her conclusions were so far from the mainstream understanding of the nature of chromosomes that she was politely ignored. In the late 1970s the discovery of bacterial transposons directed renewed attention to her pioneering work, and her efforts were resoundingly accepted when she was awarded an unshared Nobel prize in 1983.

Uses in Genetics

Transposable elements are very useful in studying the genome. They allow researchers to search for genes and enhancers and find interesting relationships between phenotypes and genotypes. By using P elements and transposase, DNA constructs can be made that jump randomly around the genome. P elements flank the DNA sequence you want to move, and transposase cuts the sequence out and reinserts it elsewhere.

One method for using transposable elements to find enhancer sites is to construct a transposable element that contains the Gal-4 gene with a weak promoter, flanked by a P element upstream and downstream. A second construct is also created, containing a UAS site (when the Gal-4 protein binds to the UAS site, anything downstream is expressed) with a marker gene downstream that can, for example, encode a fluorescent protein. By inserting these constructs into two strains of flies, you will have one strain with the transposable Gal-4 gene and another strain with the stationary UAS construct. A third fly strain containing the transposase gene is also needed.

Strain 1: Transposable Gal-4 strain

Strain 2: UAS marker Strain

Strain 3: Transposase Strain

First, strains 1 and 3 are crossed to produce flies with both the P elements and the transposase. This allows the construct with gal-4 to jump to random locations in the genome, and depending on where gal-4 lands, the amount of gal-4 expression will change. Afterwards, the gal-4 construct is stabilized by crossing these flies with normal flies, and the offspring are genotyped to find the flies with only the gal-4 gene. These flies are then crossed brother to sister to produce specific lines of homozygous gal-4 mutants. These new lines are then crossed with the UAS marker strain 2 and the effects on phenotype are observed. If, for example, a gal-4 construct lands next to a tissue-specific enhancer for the eye, the gal-4 protein will bind to the UAS site and the marker gene will be expressed, causing, for example, the tissue color to be green. Using probes for the known gal-4 sequence, the region of DNA is isolated and sequenced to find the enhancer site. This method can be used to find various enhancers that can be used in other experiments.

Developmental Genetics

Every cell in a multicellular organism contains the same genome - yet in a human being there are eye cells, bone cells, and brain cells, which all serve different purposes yet are genetically identical. How can this be?

Two key points explain how genetically identical cells come to be differentiated.

  1. Cells differentiate through a complex system that involves the establishment of positional information and cell differentiation through paracrine signaling and other processes.
  2. The genome interacts with proteins which enable or disable certain genes. Due to the initial developmental processes, a different set of proteins is maintained in a given cell, allowing for an identical genome but with entirely different functions, and proteins.

When development begins, positional information must be established.

Once positional information is established, cells begin to change due to differences in gene transcription across the developing embryo. Some cells become intermediates - cells that would never be found in an adult organism but nonetheless act as a starting point for a number of other cells. These are known as pluripotent cells.

There are roughly 200 different cell types in human beings - so not every cell can be "unique". When development occurs, pluripotent cells divide and, through signalling, prescribe a cell fate to a set of cells known as a developmental field. The end result is that a large group of cells all develop identically, into a muscle for example, while a set somewhere else develops into something else altogether.

Population Genetics

Population Genetics is the field of genetics which studies allele distributions and genetic variation in populations. Population geneticists study the processes of mutation, migration, natural selection and genetic drift on populations, and in doing so are studying evolution as it occurs.


Editor's note
Chapter or book approach: The approach of this chapter / book is to work through a series of models. The first model will be the Hardy-Weinberg Model, and then progressively, the models will move away from the premises of Hardy-Weinberg.

Foundations of Population Genetics

Templeton states that the three premises of population genetics are the same as the premises of evolution:

  1. Templates/DNA can replicate
  2. Templates/DNA can mutate and recombine
  3. Phenotypes emerge from the interaction of templates/DNA and environment.

Replication of Populations

There are three major properties that a population must maintain with replication:

  1. They are composed of reproducing individuals
  2. They are distributed over space and time
  3. They host a population of genes

The first property indicates that individuals of the population must reproduce to keep the population stable. This is necessary because individuals break down over time due to the accumulation of entropy and the inability of an individual to continuously remove the entropy added by the environment. Thus, to maintain the population, individuals must pass down their DNA, or organizational encoding, to the next generation. Through continuous reproduction, a population can be maintained over a much longer time than the individuals that comprise it. In addition, continuous reproduction over time enables the population to have properties and components of its own.

The second property is that a population is distributed over a space. Populations can exist as:

  • small isolated groups
  • a collection of groups with a varied amount of genetic exchange
  • a large interbreeding population that exists over a vast space

In general though the population can be divided into a primary group that can be considered as interbreeding and a secondary group that mates occasionally with the primary group. It is this primary group that population geneticists generally study, as it is generally stable. They define the group as a group of interbreeding individuals that share a common system of mating. The secondary group is generally ignored, and treated as noise in the system, unless it is having a major effect on the primary group.

The third property that must be maintained with reproduction is the population's gene pool. The gene pool is the collection of all the genes (organizational templates) in the population that can be used to create new individuals. By studying this gene pool, geneticists can determine the frequency of alleles, or groups of alleles, in the population and how they are changing over time. From the patterns in the results obtained, geneticists can then start to understand what forces are acting on the population.

Template Mutations

Change is a requirement of evolution, and one method of introducing change is through modification of the templates used to sustain the population. In the case of living organisms, these templates are genes.

Sources of Mutation:

  • Insertions
  • Deletions
  • Single Nucleotide Substitutions (sometimes changing the protein sequences and sometimes not)
  • Transpositions
  • Duplications

An allele is an alternate form of a template. In the case of biological systems, an allele is a form of a gene. Zooming further out, a version of a region of templates is called a haplotype. In a biological system this is a sequence of nucleotides, while in a computer system it would be a sequence of linked objects.

Modeling Evolution

Initially we are going to consider populations with genetic architectures of two loci per template.

Hardy-Weinberg Model

The genetic architecture of the Hardy-Weinberg model is a one-locus, two-allele model (Templeton, p. 35).

Hardy-Weinberg Equilibrium

If no forces of evolution are acting on a population, it is in Hardy-Weinberg equilibrium. Quantitatively, if the proportions of two alleles A and a are denoted $p$ and $q$ (with $p + q = 1$), then the genotype proportions will be such that the homozygote AA has proportion $p^2$, the heterozygote Aa has proportion $2pq$, and the homozygote aa has proportion $q^2$.
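The expected genotype proportions follow directly from the allele frequency. As a small sketch (the function name `hardy_weinberg` and the example value 0.3 are chosen here for illustration):

```python
def hardy_weinberg(p):
    """Expected genotype proportions when allele A has frequency p (q = 1 - p)."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


expected = hardy_weinberg(0.3)
for genotype, proportion in expected.items():
    print(genotype, round(proportion, 4))

# The three proportions always add up to 1, because (p + q)^2 = 1.
print(round(sum(expected.values()), 10))
```

The heterozygote term carries a factor of two because an Aa individual can receive A from either parent.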

Testing for Hardy-Weinberg

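To test this, you use what is known as a chi-square test ($\chi^2$), where:

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

with the sum taken over the genotype classes, and the degrees of freedom are the number of genotypes minus the number of alleles. If $\chi^2$ is less than the table value, we accept that our population is in Hardy-Weinberg equilibrium.

Note that we do not actually take the square of anything called $\chi$; the statistic is written $\chi^2$ out of tradition.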

Concise Chi-square Table (5% significance level)

Degrees of freedom    1      2      3
χ² value              3.84   5.99   7.81


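Example: suppose a population has genotype proportions AA: 0.1, Aa: 0.4, aa: 0.5.

The allele proportions are then

$$p = 0.1 + \tfrac{1}{2}(0.4) = 0.3, \qquad q = 0.5 + \tfrac{1}{2}(0.4) = 0.7$$

So our expected values are AA: $p^2 = 0.09$, Aa: $2pq = 0.42$, aa: $q^2 = 0.49$.

$$\chi^2 = \frac{(0.1 - 0.09)^2}{0.09} + \frac{(0.4 - 0.42)^2}{0.42} + \frac{(0.5 - 0.49)^2}{0.49} \approx 0.0023$$

With 3 genotypes and 2 alleles there is 1 degree of freedom, and $0.0023 < 3.84$. Therefore we can accept the hypothesis that the population is in Hardy-Weinberg equilibrium and that there are no forces of evolution acting on these alleles. (With real data, the test should be computed from genotype counts rather than proportions.)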
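The test can also be sketched in code. The following is a minimal illustration, not a definitive implementation: the function name and the sample of 1000 individuals (100 AA, 400 Aa, 500 aa, matching the proportions in the example) are assumptions made for illustration.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Return (p, q, expected counts, chi-square) for one locus with two alleles."""
    n = n_AA + n_Aa + n_aa
    # Each AA individual carries two A alleles, each Aa individual carries one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi_sq


# Critical value for 1 degree of freedom (3 genotypes - 2 alleles).
CRITICAL_DF1 = 3.84

# Hypothetical sample of 1000 individuals (an assumption for illustration).
p, q, expected, chi_sq = hardy_weinberg_chi_square(100, 400, 500)
in_equilibrium = chi_sq < CRITICAL_DF1
print(p, q, expected, round(chi_sq, 4), in_equilibrium)
```

Because the statistic is computed from counts, its value grows with sample size: the same proportions in a much larger sample could exceed the critical value.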

Two Autosomal Loci, Two Allele Model

The Hardy-Weinberg model can be extended to two autosomal loci, each with two alleles.

For the purpose of this discussion, the first locus will have alleles A and a and the second locus will have B and b. From this we can get the following gamete types and their frequencies through recombination:

Gamete   Frequency
AB       FreqAB
Ab       FreqAb
aB       FreqaB
ab       Freqab
Sum      1

A population producing the above four gametes can produce the following genotypes:

Gametes   AB                Ab                aB                ab
AB        FreqAB . FreqAB   FreqAB . FreqAb   FreqAB . FreqaB   FreqAB . Freqab
Ab        FreqAb . FreqAB   FreqAb . FreqAb   FreqAb . FreqaB   FreqAb . Freqab
aB        FreqaB . FreqAB   FreqaB . FreqAb   FreqaB . FreqaB   FreqaB . Freqab
ab        Freqab . FreqAB   Freqab . FreqAb   Freqab . FreqaB   Freqab . Freqab

Analyzing the table above shows that there are only ten unique combinations: four correspond to homozygous zygotes and the remaining six to heterozygous zygotes.


Notice that the sum of FreqAB, FreqAb, FreqaB, and Freqab is one. This follows from the earlier model where the sum of p and q equaled one.
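The count of ten unique combinations can be checked by enumerating unordered pairs of gametes. This is a small sketch; the gamete frequencies in `example_freq` are hypothetical values chosen only so that they sum to one.

```python
from itertools import combinations_with_replacement

gametes = ["AB", "Ab", "aB", "ab"]

# Unordered pairs of gametes: AB/Ab is the same zygote as Ab/AB.
zygotes = list(combinations_with_replacement(gametes, 2))
homozygous = [z for z in zygotes if z[0] == z[1]]
heterozygous = [z for z in zygotes if z[0] != z[1]]


def zygote_frequency(pair, freq):
    """Frequency of an unordered gamete pair under random union of gametes."""
    g1, g2 = pair
    # A pair of different gametes appears twice in the 4 x 4 table.
    return freq[g1] * freq[g2] * (1 if g1 == g2 else 2)


# Hypothetical gamete frequencies (an assumption for illustration).
example_freq = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}
total = sum(zygote_frequency(z, example_freq) for z in zygotes)
print(len(zygotes), len(homozygous), len(heterozygous), total)
```

The zygote frequencies sum to one because they are the expansion of the square of the summed gamete frequencies.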


(this section is used to discuss how we get term r)

Homozygous Case

AB/AB or ab/ab or aB/aB or Ab/Ab

Single Heterozygous Case

AB/Ab or AB/aB or aB/ab or Ab/ab

Double Heterozygous Case

AB/ab or Ab/aB

Note that recombination is only noticeable in double heterozygotes.

Linkage Disequilibrium

(this section is used to discuss how we get term D)

Linkage disequilibrium measure, $D$

Formally, to define pairwise LD we consider indicator variables on alleles at two loci, say $I_A$ and $I_B$. We define the LD parameter $D$ as:

$$D = \operatorname{cov}(I_A, I_B) = h_{AB} - p_A p_B$$

Here $p_A$ and $p_B$ denote the marginal allele frequencies at the two loci and $h_{AB}$ denotes the haplotype frequency in the joint distribution of both alleles. In terms of the gamete frequencies above, $h_{AB}$ is FreqAB, $p_A$ is FreqAB + FreqAb and $p_B$ is FreqAB + FreqaB. Various derivatives of this parameter have been developed. In the genetic literature, the wording "two alleles are in LD" usually means $D \neq 0$. Conversely, linkage equilibrium denotes the case $D = 0$.
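As a hedged sketch, D can be computed directly from the four gamete frequencies. The function name and the example frequencies are assumptions for illustration. The equivalent cross-product form, FreqAB . Freqab - FreqAb . FreqaB, is included as a check.

```python
def linkage_disequilibrium(freq_AB, freq_Ab, freq_aB, freq_ab):
    """Return D = FreqAB - p_A * p_B for two loci with two alleles each."""
    p_A = freq_AB + freq_Ab  # marginal frequency of allele A
    p_B = freq_AB + freq_aB  # marginal frequency of allele B
    return freq_AB - p_A * p_B


# Hypothetical gamete frequencies that sum to one.
d = linkage_disequilibrium(0.4, 0.1, 0.2, 0.3)

# Equivalent form: product of "coupling" gametes minus product of "repulsion" gametes.
d_cross = 0.4 * 0.3 - 0.1 * 0.2

# Equal gamete frequencies give linkage equilibrium (D = 0).
d_equilibrium = linkage_disequilibrium(0.25, 0.25, 0.25, 0.25)
print(d, d_cross, d_equilibrium)
```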

Pulling Information from the Model

Is Evolution Occurring

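If r > 0 and D ≠ 0, where r is the recombination frequency between the two loci, then evolution is occurring: the gamete frequencies change from one generation to the next.

If r = 0 or D = 0, then no evolution is occurring.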
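The two conditions can be illustrated with the standard result that, under random mating, recombination reduces D by a factor of (1 - r) each generation, so that D_t = (1 - r)^t D_0. This formula is not stated in the text above and is brought in here as an assumption; the function name is likewise illustrative.

```python
def ld_after_generations(d0, r, generations):
    """Return D after the given number of generations of random mating."""
    return d0 * (1 - r) ** generations


# With r > 0 and D != 0, D (and so the gamete frequencies) changes.
changing = ld_after_generations(0.1, 0.5, 1)

# With r = 0, D stays where it started.
no_recombination = ld_after_generations(0.1, 0.0, 10)

# With D = 0, there is nothing left to decay.
already_equilibrium = ld_after_generations(0.0, 0.5, 10)
print(changing, no_recombination, already_equilibrium)
```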

Mutation and Disequilibrium - Normalized Linkage Disequilibrium

Mating Systems



Absolute Fitness

Relative Fitness

Frequency before selection

Frequency after selection

Genetic Drift


Editor's note
This section is in a total flux right now and is being outlined.

The first experiments were done in the early 1930s. German was a major scientific language before the Second World War.

Genetic drift is a function of population size N.

The effect of genetic drift is inversely proportional to population size: as the population size increases, the deviation from the expected allele frequencies decreases (Templeton, p. 84).

Genetic drift is non-directional.

There is no tendency to return to ancestral allele frequencies.

Genetic drift is a cumulative function. Changes in allele frequencies from the previous generations are added to the changes that occur in the current generation.

Genetic drift only occurs when there is variability.

To study genetic drift this section will create a simplified model and expand upon it. The model will just have two states, a) genetic drift is occurring and b) genetic drift is not occurring.

  • a → b: fixation
  • b → a: mutations causing differentiable patterns
  • b → a: reintroduction of differentiable patterns

Genetic drift can be broken down into two simplified states to start. The f


In each population, 2N = 32 and $p_0 = q_0 = 0.5$ initially, and data were then collected for 19 generations (Buri, P. 1956. "Gene frequency in small populations of mutant Drosophila." Evolution 10: 367-402).

(See Figure 6.3 from Hedrick, P.W. 2005, Genetics of populations, 3rd edition. Jones and Bartlett, Sudbury, MA)

Note that in this figure the variance is increasing, but the mean allele frequency over populations stays roughly the same.

Simulation of Genetic Drift

Genetic drift can be simulated using a Monte Carlo simulation (see Figure 6.2 in Hedrick).

The proportion of populations expected to go to fixation for a given allele is equal to the initial frequency of that allele.

Within each population the allele frequencies change, and so does the distribution of allele frequencies across populations. The mean allele frequency over multiple replicate populations, however, does not change due to genetic drift.
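A Monte Carlo simulation of this kind can be sketched as follows. This is a minimal illustration that assumes random sampling of 2N gene copies each generation; it is not the procedure behind the cited figure. The values 2N = 32 and an initial frequency of 0.5 mirror the experiment described above, while the function name, seed, and number of replicates are arbitrary choices.

```python
import random


def simulate_drift(two_n, p0, max_generations, rng):
    """Return the allele frequency after drift in one population.

    Each generation, 2N gene copies are drawn at random using the previous
    generation's allele frequency. Stops early once the allele is lost or fixed.
    """
    p = p0
    for _ in range(max_generations):
        if p == 0.0 or p == 1.0:
            break
        copies = sum(1 for _ in range(two_n) if rng.random() < p)
        p = copies / two_n
    return p


rng = random.Random(1)  # fixed seed so the run is repeatable

# Long run: every replicate population ends up fixed or lost.
final = [simulate_drift(32, 0.5, 1000, rng) for _ in range(200)]
fixed_fraction = sum(1 for p in final if p == 1.0) / len(final)

# Short run: frequencies spread out, but their mean stays near 0.5.
short_run = [simulate_drift(32, 0.5, 5, rng) for _ in range(200)]
mean_short = sum(short_run) / len(short_run)
print(fixed_fraction, mean_short)
```

The fraction of replicates fixed for the allele should be close to its initial frequency, and the mean over replicates in the short run should stay close to 0.5 even though individual populations diverge.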

Heterozygosity, or the variance of the allele frequency over the replicate populations, can be used to understand genetic drift.

We can calculate the number of generations necessary to reduce the heterozygosity to a given fraction of its initial value:

$t = -2N \ln(x)$, where $x$ is the fraction of heterozygosity remaining and $N$ is the population size.
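As a short worked example of this formula, the sketch below computes the time needed to halve the heterozygosity when N = 16. The function name and the chosen values are assumptions for illustration. The final line checks the result against the per-generation loss of 1/(2N), the standard relation that the formula approximates.

```python
import math


def generations_to_reduce_heterozygosity(x, n):
    """Return t = -2N ln(x), with x the fraction of heterozygosity remaining."""
    return -2 * n * math.log(x)


# Generations needed to halve heterozygosity when N = 16 (2N = 32).
t_half = generations_to_reduce_heterozygosity(0.5, 16)

# Check: losing 1/(2N) of the heterozygosity per generation for t_half generations.
remaining = (1 - 1 / 32) ** t_half
print(round(t_half, 2), round(remaining, 3))
```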

Coalescence Theory



Further Reading in Wikipedia


Gene Pool - p. 37 and p. 38 of Templeton's book


Hardy Weinberg

Example 1

Homozygotes Heterozygotes Sum
Observed 85 15 100
Expected 60 40 100

The probability of observing these numbers is:
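One hedged way to work this out, assuming the chi-square statistic from the testing section with 1 degree of freedom is what is intended, is sketched below. For 1 degree of freedom, the probability of a chi-square value at least this large can be obtained from the complementary error function.

```python
import math

observed = {"homozygotes": 85, "heterozygotes": 15}
expected = {"homozygotes": 60, "heterozygotes": 40}

# Chi-square statistic: sum over classes of (O - E)^2 / E.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# For 1 degree of freedom, P(chi-square >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Sanity check of the formula against the table value 3.84 (5% level).
p_at_critical = math.erfc(math.sqrt(3.84 / 2))
print(round(chi_sq, 2), p_value, round(p_at_critical, 3))
```

The statistic is far above 3.84, so under these assumptions the observed numbers are very unlikely if the expected numbers are correct.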



Example 2

Genotype Count
Homodom 20
Hetro 50
Homorec 30

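What are the observed frequencies of p and q?

$$p = \frac{2(20) + 50}{2(100)} = 0.45, \qquad q = 1 - p = 0.55$$

What are the expected number of genotypes?

$$\text{Homodom} = p^2 N = 20.25, \quad \text{Hetero} = 2pqN = 49.5, \quad \text{Homorec} = q^2 N = 30.25$$

Do the observed genotypes fit H-W expectation?

$$\chi^2 = \frac{(20 - 20.25)^2}{20.25} + \frac{(50 - 49.5)^2}{49.5} + \frac{(30 - 30.25)^2}{30.25} \approx 0.0102$$

If $\chi^2$ is greater than the table value for the appropriate degrees of freedom (3.84 for 1 degree of freedom), then the null hypothesis, in this case that the population fits H-W, can be rejected. Here $0.0102 < 3.84$, so the null hypothesis is not rejected.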

Example 3 - Next Generation

For an autosomal locus



Example 4 - Sex Linked (1 locus, 2 allele)





It is possible to solve for the number of generations it will take until an initial frequency is less than or equal to a particular threshold by using the following formula:



Example 5 - Inbreeding - Self Pollination

Genotype Count
Homodom 20
Hetro 50
Homorec 30

Calculate F Value






Chi-squared then can be calculated by:
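A minimal sketch of both steps follows. It assumes that F is estimated as 1 - (observed heterozygosity / expected heterozygosity) and that the statistic is taken as χ² = F²N; these are common textbook forms used here as assumptions, and the function name is illustrative.

```python
def inbreeding_coefficient(n_homodom, n_hetero, n_homorec):
    """Return F = 1 - H_observed / H_expected for one locus with two alleles."""
    n = n_homodom + n_hetero + n_homorec
    p = (2 * n_homodom + n_hetero) / (2 * n)
    q = 1 - p
    h_observed = n_hetero / n
    h_expected = 2 * p * q  # Hardy-Weinberg heterozygosity
    return 1 - h_observed / h_expected


n_total = 20 + 50 + 30
f = inbreeding_coefficient(20, 50, 30)

# Chi-square from F, assuming chi-square = F^2 * N.
chi_sq = f ** 2 * n_total
print(round(f, 4), round(chi_sq, 4))
```

With these counts F is slightly negative (a small excess of heterozygotes), and the chi-square value agrees with the direct calculation from observed and expected genotype counts for the same data.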


Example 6 - Inbreeding (1 Locus / 2 alleles)

Given the frequency of the dominant allele and an F value:

Allele Frequency
p 0.6
q 0.4



What sample size would be necessary to detect an effect of F = 0.01 at the S% significance level?

  1. Look up the χ² value for the S% significance level (usually S = 5%)
  2. Use the following formula:
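As a hedged sketch of this example, the code below assumes the usual genotype frequencies under inbreeding (p² + Fpq, 2pq(1 - F), q² + Fpq) and a sample size obtained by rearranging χ² = F²N to N = χ²/F². Both formulas are assumptions here, as is the use of F = 0.01 for the genotype frequencies.

```python
def genotype_frequencies_with_inbreeding(p, f):
    """Return (homodom, hetero, homorec) frequencies for inbreeding coefficient f."""
    q = 1 - p
    return (p * p + f * p * q, 2 * p * q * (1 - f), q * q + f * p * q)


def sample_size_to_detect(f, chi_sq_critical=3.84):
    """Return N such that F^2 * N reaches the critical chi-square value."""
    return chi_sq_critical / f ** 2


freqs = genotype_frequencies_with_inbreeding(0.6, 0.01)
n_required = sample_size_to_detect(0.01)
print([round(x, 4) for x in freqs], round(n_required))
```

Under these assumptions a very large sample (tens of thousands of individuals) is needed to detect such a small F at the 5% level.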


Example 7 - Null Allele

Genotype Count
Homodom 20
Hetro 50
Homorec 30








Selection Part 1

Example 1

Genotypes Homodom Heterozygotes Homorec
Viability 0.7 1 0.6
Fertility 11 6 3

Genotypes Homodom Heterozygotes Homorec
Viability (v) 0.7 1 0.6
Fertility (f) 11 6 3
Absolute Fitness (W)      
Relative Fitness (w)      

What are the selection coefficients?
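The empty rows of the table can be filled in with a short calculation. This sketch assumes that absolute fitness is the product of viability and fertility (W = v × f), that relative fitness is W divided by the largest W, and that the selection coefficient is s = 1 - w. These definitions are assumptions and should be checked against the course text.

```python
viability = {"Homodom": 0.7, "Heterozygotes": 1.0, "Homorec": 0.6}
fertility = {"Homodom": 11, "Heterozygotes": 6, "Homorec": 3}

# Absolute fitness W = viability * fertility (assumed definition).
absolute = {g: viability[g] * fertility[g] for g in viability}

# Relative fitness w: scale so that the fittest genotype has w = 1.
w_max = max(absolute.values())
relative = {g: absolute[g] / w_max for g in absolute}

# Selection coefficient s = 1 - w.
selection = {g: 1 - relative[g] for g in relative}

for g in viability:
    print(g, round(absolute[g], 2), round(relative[g], 3), round(selection[g], 3))
```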




Example 2 - Mean Fitness Calculation

Genotypes Homodom Heterozygotes Homorec
Freq 0.3 0.6 0.1
Rel. Fitness 1 0.6 0.1

What is the mean fitness of this population?

$$\bar{w} = (0.3)(1) + (0.6)(0.6) + (0.1)(0.1) = 0.67$$


Example 3 - Natural Selection Analysis

Genotypes Homodom Heterozygotes Homorec
Freq 0.3 0.6 0.1
Rel. Fitness 1 0.6 0.1

What is the mean fitness of this population?
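A hedged sketch of one generation of selection for this table: the mean fitness is the frequency-weighted average of the relative fitnesses, and the genotype frequencies after selection are each genotype's share of that average. Taking the homozygous dominant allele frequency as p is an assumption about notation.

```python
freq = [0.3, 0.6, 0.1]      # Homodom, Heterozygotes, Homorec
fitness = [1.0, 0.6, 0.1]   # relative fitness of each genotype

# Mean fitness: sum of genotype frequency * relative fitness.
mean_w = sum(f * w for f, w in zip(freq, fitness))

# Genotype frequencies after selection.
after = [f * w / mean_w for f, w in zip(freq, fitness)]

# Allele frequency p = freq(Homodom) + half of freq(Heterozygotes).
p_before = freq[0] + freq[1] / 2
p_after = after[0] + after[1] / 2
delta_p = p_after - p_before
print(round(mean_w, 2), [round(x, 4) for x in after], round(delta_p, 4))
```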










Selection Part 2

Example 1

Genotypes                    Homodom   Heterozygotes   Homorec
Relative fitness (females)   0.6       0.8             1
Relative fitness (males)     1         0.8             0.6

What are the equilibrium allele frequencies $p_f$ and $q_f$ in females, considering that heterozygotes have intermediate fitness?





What are the equilibrium allele frequencies $p_m$ and $q_m$ in males, considering that heterozygotes have intermediate fitness?



What range of $s_f$ values will give a stable polymorphism?


What range of $s_m$ values will give a stable polymorphism?


Example 2 - Complete Dominance w/ Frequency Dependent Selection

Homodom Heterozygotes Homorec
Relative Fitness (w) 1.3 1.3 1.7

What are the selection coefficients $s_{homo,dom}$ and $s_{hetero}$ with q = 0.02?







Calculate the stable equilibrium frequencies, $q_e$ and $p_e$.



Calculate the mean fitness assuming HW equilibrium and using:

(a) current allele frequencies



(b) stable equilibrium allele frequencies



(c) general formula


Example 3 - Self Incompatibility Locus

Given initial frequencies:

(a) What are the frequencies of the S1, S2, and S3 alleles initially, and after one and two generations?




(b) After one generation?







(c) After two generations?







Example 4 - Fitness / Overdominance

Homodom Heterozygotes Homorec
Relative Fitness 0.4 1.0 0.8

To find the equilibrium of p:
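A minimal sketch, assuming the standard overdominance result that the equilibrium frequency of the allele carried by the homozygous dominant genotype is p = s2 / (s1 + s2), where s1 and s2 are the selection coefficients against the homozygous dominant and homozygous recessive genotypes. The variable names are illustrative. The last lines check that both alleles have the same marginal fitness at this frequency, which is what defines the equilibrium.

```python
w_homodom, w_hetero, w_homorec = 0.4, 1.0, 0.8

# Selection coefficients against each homozygote, relative to the heterozygote.
s1 = w_hetero - w_homodom   # 0.6
s2 = w_hetero - w_homorec   # 0.2

# Equilibrium allele frequencies under heterozygote advantage.
p_eq = s2 / (s1 + s2)
q_eq = 1 - p_eq

# Check: marginal fitness of each allele is equal at equilibrium.
w_dominant_allele = p_eq * w_homodom + q_eq * w_hetero
w_recessive_allele = p_eq * w_hetero + q_eq * w_homorec
print(round(p_eq, 4), round(q_eq, 4), round(w_dominant_allele, 4))
```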






Example 5 - Selection and Viability

Genotypes Homodom Heterozygotes Homorec
Zygotic Frequencies 0.05 0.33 0.62
Viabilities 1 0.6 0.59

Genotypes Homodom Heterozygotes Homorec
Zygotic Frequencies (Z) 0.05 0.33 0.62
Viabilities (V) 1 0.6 0.59
Adult Freq After Selection      






Predicted Genotype Frequencies
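A hedged sketch of the calculation for this example: adult frequencies after selection are taken as Z × V divided by the sum of Z × V over all genotypes, and the predicted genotype frequencies are assumed to be the Hardy-Weinberg proportions formed from the adult allele frequencies under random mating. Both steps are assumptions about what the example intends.

```python
zygotic = [0.05, 0.33, 0.62]     # Homodom, Heterozygotes, Homorec
viability = [1.0, 0.6, 0.59]

# Survivors of each genotype, then normalized to adult frequencies.
survivors = [z * v for z, v in zip(zygotic, viability)]
total = sum(survivors)
adult = [s / total for s in survivors]

# Allele frequencies among adults.
p = adult[0] + adult[1] / 2
q = 1 - p

# Predicted genotype frequencies in the next generation (random mating assumed).
predicted = [p * p, 2 * p * q, q * q]
print([round(x, 4) for x in adult], [round(x, 4) for x in predicted])
```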




Example 6

Homodom Heterozygotes Homorec
Observed Numbers 750 1000 250

What are the absolute and relative fitness for each?

Homodom Heterozygotes Homorec
Observed Numbers 700 1000 200
Predicted Numbers 500 1000 500
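As a sketch only, the code below takes absolute fitness to be the ratio of observed to predicted numbers and relative fitness to be that ratio divided by the largest ratio. These definitions are assumptions, and the numbers used are those in the table directly above.

```python
def fitness_from_counts(observed, predicted):
    """Return (absolute, relative) fitness lists from observed and predicted counts."""
    absolute = [o / p for o, p in zip(observed, predicted)]
    largest = max(absolute)
    relative = [a / largest for a in absolute]
    return absolute, relative


# Homodom, Heterozygotes, Homorec
observed = [700, 1000, 200]
predicted = [500, 1000, 500]

absolute, relative = fitness_from_counts(observed, predicted)
print([round(a, 3) for a in absolute], [round(w, 3) for w in relative])
```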








Behavioral Genetics

General Genetics/Behavioral Genetics

Quantitative Genetics

General Genetics/Quantitative Genetics

Mutations and Evolution

It is often not grand, overarching theories such as the Neutral Theory of Evolution or the Modern Synthesis that biologists work with in their day-to-day research. Instead, the basics of genetics are the cornerstone of experiments, and these basics are then used to reflect on bigger ideas, such as population ecology and evolution. It is the beautiful basics of genetics with which we concern ourselves in this section.

Every sexually reproducing organism has a set of genes, half of which come from each parent. The combinations of these genes cause the variation between individuals within a species. The genes of a butterfly, an ape or a fowl carry the code that determines the appearance and character of the butterfly, the ape and the fowl, and the genetic code allows an overwhelming variety within each species. A mutation is a change in a gene relative to its normal configuration. When it occurs in cells that give rise to offspring, the change can be passed to successive generations, thereby producing a marked difference. There are many different types of mutations. The smallest possible genetic mutation is a point mutation, which occurs when a base in the DNA pairs with the 'wrong' partner and a single nucleotide is changed. Point mutations are common, and their frequency increases substantially under the effect of mutagens. Heritable mutations can also extend to part or all of a chromosome. Because many genes are affected by a chromosomal mutation, these often have drastic ramifications for the offspring.

Ethical Issues

There are many ethical issues with cloning, and with genetics in general.

In the United States, many supermarkets are already selling genetically modified (GM) foods. Many GM crops produce their own insecticide, a trait borrowed from a group of bacteria; the insecticide kills insects that feed on the plant and, to our knowledge, has no adverse effects on humans. The ethical issue comes into play here: most supermarkets do not label these foods as genetically modified. Even though no adverse effects are known to date, not allowing customers to know that they are purchasing GM foods is controversial. There are two main sides of the argument:

  1. GM foods won't harm you, so why tell people? If people report adverse effects once they are told, it could be a self-fulfilling prophecy.
  2. Even if it won't hurt us, you should still tell us so we have a choice in the matter. That's like giving us a certain drug without telling us: it's not ethical.

If you go into your local supermarket and ask the General Manager whether GM foods are being sold, the answer will probably be a resounding "No", since there is no separate labeling for these foods. This reflects an underlying problem with the genetically modified food discussion: which foods are genetically modified? Many seedless fruits exist as a result of primitive genetic manipulation in the form of accelerated artificial selection. Products such as corn and apples could not exist without human intervention in the form of breeding and cross-species grafting. If we place the boundary at the laboratory door, we demonize certain foods, such as golden rice, that were developed to save lives, while accepting others with which we have become perfectly comfortable.

The most commonly expressed concern with GM organisms is that their long-term effect on the Earth's ecosystem is unknown. When a plant evolves by natural selection, the change takes a long time, and the ecosystem has time to adapt.

Cloning endangered species may be a good idea to keep the ecosystem in balance.