General Genetics/Print version


Introduction

A section of DNA. The sequence of the plate-like units (nucleotides) in the center carries the information.

Welcome to Genetics - the study of heredity. More precisely, it is the study of how living organisms inherit characteristics or traits from their ancestors.

It has only been in the past one hundred years that we have begun to understand how information in the form of physical characteristics is transferred from parent to child.

In genetics, a feature of an organism is called a trait. Some traits are features of an organism's physical appearance, for example, a person's eye-color, height or weight. There are many other types of traits and these range from aspects of behavior to resistance to disease. Traits are often inherited, for example tall and thin people tend to have tall and thin children.

Other traits come from the interaction between inherited features and the environment. For example, a child might inherit the tendency to be tall, but if there is very little food where they live and they are poorly nourished, they will still be short. The way genetics and environment interact to produce a trait can be complicated: for example, the chances of somebody dying of cancer or heart disease seem to depend on both their family history and their lifestyle. Genetics helps to examine the interplay behind the often asked "Nature vs. Nurture" question.

Genetic information is carried by a long molecule called deoxyribonucleic acid, DNA for short. DNA consists of two long chains of nucleotides twisted into a double helix and joined by hydrogen bonds between the complementary bases: adenine pairs with thymine, and cytosine pairs with guanine. The sequence of these bases encodes the genetic information, which is copied and inherited across generations.

Traits are carried in DNA as instructions for constructing and operating an organism. These instructions are contained in segments of DNA called genes. DNA is made of a sequence of simple units, with the order of these units spelling out instructions in the genetic code. This is similar to the order of letters spelling out words. The organism "reads" the sequence of these units and decodes the instruction.

Not all the genes for a particular instruction are exactly the same. Different forms of one type of gene are called different alleles of that gene. As an example, one allele of a gene for hair color could carry the instruction to produce a lot of the pigment in black hair, while a different allele could give a garbled version of this instruction, so that no pigment is produced and the hair is white.

Mutations are either random or induced events that alter the sequence of a gene and can produce new or different traits. There are some exceptions to this, which will be discussed later. For example, a mutation could turn an allele for black hair into an allele for white hair. The appearance of new traits is important in evolution.

The Central Dogma


Basics of Heredity

Heredity is the passage of traits from a parental (P) generation to its offspring. Genetics is intimately related to reproduction, and a knowledge of reproduction will make genetics easier to learn.

Reproduction

Reproduction is the continuation of life. Reproduction transmits the hereditary information found in all cells of an organism, namely DNA. There are 2 types of reproduction, asexual and sexual. Asexual reproduction is the propagation of a single organism. That organism recreates itself, and all its offspring are genetically identical to it. Sexual reproduction promotes change and requires 2 organisms to fuse gametes (reproductive cells carrying genetic information) to create offspring that are different from their parents.

Genetic information is packaged into units known as chromosomes. A chromosome is made up mostly of DNA and of proteins that give it structure. Chromosome is a very general term. Humans have linear chromosomes, which look X-shaped once they have been copied before cell division, while E. coli has a single circular chromosome (bacteria may also carry small, separate circles of DNA known as plasmids). From the DNA in these chromosomes, cells can construct an entire organism. It is a very efficient and elegant system that oversees the life of all organisms, from single cells to humans.

In sexual reproduction, 2 organisms fuse gametes to make up the genetic information for their offspring. Examples of gametes are sperm and eggs. Each gamete contains half the genetic information of the parent that produced it. This means gametes are haploid (one set of chromosomes), which is represented by 1n. Two gametes must fuse to make a diploid (two sets of chromosomes) zygote (2n). The organism develops from the zygote.

Chromosomes

Chromosomes are the carriers of genetic information. Most mammals are diploid: their chromosomes exist in pairs that carry homologous genetic information. That is, the two chromosomes of a pair carry possibly different versions of the same genes. Let's say there was a plant with only 1 pair of chromosomes. One chromosome has the allele for blue stem color and the other chromosome has an allele for green stem color (an allele is an alternate version of a gene, in this case the gene for stem color). In the end, only one of these alleles may be expressed (more on this later), but both are present in this organism.


Genetic Principles

How a living organism is built and functions is determined and governed by genes. Understanding how genes work may enable researchers to:

  • detect and cure genetic illnesses.
  • determine an organism's features and behaviors.
  • create a new organism.

The relationships between genes and features are very complex. Currently, we are unsure if it will ever be possible to change only one feature of an organism by changing one gene or a set of genes. If a feature can be genetically altered, any attempt is likely to be accompanied by other changes, although these may be small or negligible. Some changes are gradual, while changes in gene expression can result in rapid transformations in the physiological state of an organism.

Gregor Mendel and Peas

Gregor Mendel was an Austrian monk. He is credited as the father of modern genetics. While planting and harvesting pea plants at his monastery, he noticed patterns of traits in the plants. Most pea plants turned out to have green pods, while some had yellow pods. Some had yellow seeds, while others had green seeds. Stem length, petal color, pod shape and location of flowers all seemed to exhibit a pattern of inheritance.

Mendel went on to breed pea plants to see how these traits behaved. Through these experiments Mendel formulated 3 laws that govern how traits are passed on from parents to offspring.

Mendel's Laws

Law of Dominance - In a cross between parents with contrasting traits, only 1 trait appears in the F1 generation. This is the dominant trait; the other is recessive.

Law of Segregation - During gamete formation the 2 factors (alleles) responsible for each trait separate, so each gamete carries only 1 allele for each trait.

Law of Independent Assortment - When dihybrid plants are crossed, the factors for 1 trait are distributed to gametes independently of the factors for other traits, so that all combinations of the traits can be found in the offspring.

Chromosomes, alleles and Mendel’s laws: the behavior of homologous chromosomes during meiosis can account for the segregation of the alleles at each genetic locus into different gametes. The alleles of 2 or more genes located on different chromosomes assort independently of one another. In Mendel’s experiments, segregation and independent assortment during meiosis in the F1 generation give rise to the F2 phenotype ratios that Mendel observed.

Law of Dominance


Dominant and Recessive Genes

When Mendel studied peas, one of the phenotypes showed complete dominance over the other one. If we look at pea height, and denote the allele for short as s and the allele for tall as S, then, as every plant has two sets of chromosomes, each plant has two alleles at this locus.

So there are three possibilities: SS, Ss, ss (order doesn't matter)

In this case S was fully dominant over s, so Ss individuals were phenotypically identical to SS individuals. Only ss pea plants were short. The S allele is said to be dominant, while the s allele is said to be recessive.

The Molecular Basis of Dominance

As has already been mentioned, diploid organisms have two homologous copies of each chromosome. At a specific locus on each homologous chromosome, there is an allele for a particular trait. For example, the allele that codes for a dominant tall pea plant could be labeled A2 and the allele for a short recessive pea plant could be labeled A1.


Alternative Patterns of Inheritance

Not all loci show this simple dominance. If we represent phenotype on a plot then Complete Dominance would be like this:

AA/Aa                                          aa     Complete Dominance

Other types are:

AA                    Aa                       aa     No Dominance
AA     Aa                                      aa     Incomplete Dominance
Aa      AA                                     aa     Over Dominance

The Arbitrary Nature of Recessiveness and Dominance


Mendelian Inheritance

Gregor Johann Mendel was a monk in the Augustinian monastery in Brünn (now Brno, Czech Republic). In 1854 he began the experiments which started modern genetics. His work with garden peas, Pisum sativum, was vital to our understanding of inheritance. He is known as the Father of Genetics.

Mendel's Experiment

Mendel's first step was establishing pure-breeding strains of peas. The traits he studied included:

  • pea colour
  • height of pea plants
  • whether the peas were wrinkled or smooth.

Mendel crossed the pure-breeding parental generation (designated P). He found that the first generation (F1) was phenotypically exclusively of one of the parental types. Mendel then crossed his F1 generation with itself. He found that the F2 generation showed a surprising pattern: three quarters were like the F1 generation, while the remaining quarter were like the other parental type.
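The 3:1 pattern can be reproduced by listing every equally likely pairing of gametes. The following is only a minimal sketch: it borrows the S (tall) and s (short) allele names from the pea height example, and the function names `cross` and `phenotype` are made up for illustration.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Return genotype counts over all equally likely gamete pairings."""
    # Each parent contributes one allele; sorting makes "sS" and "Ss" the same genotype.
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))

def phenotype(genotype, dominant="S"):
    """A plant shows the dominant phenotype if it carries at least one dominant allele."""
    return "tall" if dominant in genotype else "short"

f1 = cross("SS", "ss")   # pure-breeding parents (P generation)
f2 = cross("Ss", "Ss")   # F1 crossed with itself

f2_phenotypes = Counter()
for genotype, count in f2.items():
    f2_phenotypes[phenotype(genotype)] += count

print(dict(f1))             # every F1 plant is Ss
print(dict(f2))             # genotypes SS : Ss : ss
print(dict(f2_phenotypes))  # phenotypes tall : short
```

All F1 plants come out as Ss, and the F2 generation splits into three tall plants for every short one.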

From this Mendel realised that there were two versions of each locus, one of which expressed dominance over the other. He called this Biparticulate Inheritance. If a gene follows this 3:1 pattern, it is said to be segregating normally.

By looking at multiple genes, Mendel showed that they were not linked to each other and that each locus he studied had no influence over the others. He called this independent assortment.
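Independent assortment can be illustrated the same way for two genes at once. This sketch assumes two hypothetical loci with alleles A/a and B/b, with the upper-case allele dominant at each locus, and counts the phenotypes from a cross between two AaBb plants.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """All gametes of a two-locus genotype such as 'AaBb' (one allele per locus)."""
    return [a + b for a, b in product(genotype[:2], genotype[2:])]

def phenotype(gamete1, gamete2):
    """Upper-case alleles are assumed to be dominant at both loci."""
    locus_a = "A" if "A" in gamete1[0] + gamete2[0] else "a"
    locus_b = "B" if "B" in gamete1[1] + gamete2[1] else "b"
    return locus_a + locus_b

# 4 gamete types from each parent give 16 equally likely pairings.
f2 = Counter(phenotype(g1, g2)
             for g1, g2 in product(gametes("AaBb"), repeat=2))
print(dict(f2))
```

The sixteen pairings fall into four phenotype classes in a 9:3:3:1 ratio, which is what is expected when the two loci do not influence each other.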

We can also learn a lot by studying cases where the Mendelian laws fail. For instance, if a gene isn't segregating normally it may be sex linked. If two genes aren't assorting independently they're probably on the same chromosome.

Mendel's laws are the first step to understanding genetics: they lay down the basic concept of inheritance.


Genomes

In biology the genome of an organism is the whole hereditary information of an organism that is encoded in the DNA (or, for some viruses, RNA). This includes both the genes and the non-coding sequences. The term was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg, Germany, as a portmanteau of the words gene and chromosome.

More precisely, the genome of an organism is a complete DNA sequence of one set of chromosomes; for example, one of the two sets that a diploid individual carries in every somatic cell. The term genome can be applied specifically to mean the complete set of nuclear DNA (i.e., the nuclear genome) but can also be applied to organelles that contain their own DNA, as with the mitochondrial genome or the chloroplast genome. When people say that the genome of a sexually reproducing species has been "sequenced," typically they are referring to a determination of the sequences of one set of autosomes and one of each type of sex chromosome, which together represent both of the possible sexes. Even in species that exist in only one sex, what is described as "a genome sequence" may be a composite from the chromosomes of various individuals. In general use, the phrase genetic makeup is sometimes used conversationally to mean the genome of a particular individual or organism. The study of the global properties of genomes of related organisms is usually referred to as genomics, which distinguishes it from genetics which generally studies the properties of single genes or groups of genes.


Chromosomes

Chromosomes are packages of DNA formed in eukaryotes to organize the genetic information. They protect the DNA and are passed down from parents to offspring. Chromosomes are organized into pairs of homologs that contain alleles of the same gene.


The DNA Molecule

The DNA helix

DNA stands for deoxyribose nucleic acid, or deoxyribonucleic acid.

Its structure was discovered by James D. Watson and Francis H.C. Crick in 1953 with the assistance of Rosalind Franklin's data, but DNA itself was known much earlier: Friedrich Miescher first isolated it in 1869 and published the finding in 1871.

The DNA molecule has a polymer backbone of deoxyribose molecules (ribose in RNA), a five-carbon sugar, linked to one another by phosphate groups (see phosphorylation).

The sugar also connects to a nucleobase (or simply called "base", for short). There are 5 different bases used for coding, 4 of which are used in DNA (the other, uracil, is exclusive to RNA).

The four bases are adenine, guanine, cytosine and thymine. These are represented by the letters A, G, C, & T and carry all information found in the DNA (see nucleic acid nomenclature).

The structure of DNA is a double helix. This means that there are two strands coiled around each other. The molecule is bonded together by the bases with hydrogen bonds. Guanine pairs with Cytosine by three hydrogen bonds while Adenine bonds with Thymine by two hydrogen bonds (see base pair).
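The pairing rules can be written down as a small lookup table. The sketch below builds the partner strand for a short, made-up sequence and counts the hydrogen bonds holding the two strands together; the function names are illustrative only.

```python
# Pairing rules and hydrogen bonds per base pair, as described above.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def complement_strand(strand):
    """Return the partner strand, read in the opposite direction (reverse complement)."""
    return "".join(PAIR[base] for base in reversed(strand))

def hydrogen_bonds(strand):
    """Total number of hydrogen bonds between this strand and its partner."""
    return sum(H_BONDS[base] for base in strand)

example = "ATGC"  # hypothetical sequence
print(complement_strand(example))  # GCAT
print(hydrogen_bonds(example))     # 2 + 2 + 3 + 3 = 10
```

The partner strand is reversed because the two strands of the helix run in opposite directions.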

Watson and Crick's insight into the double helical structure of the DNA molecule drew on Erwin Chargaff's observation that the bases of each pair are always present in equal amounts: the amount of adenine equals that of thymine, and the amount of guanine equals that of cytosine.


Structure of the DNA Molecule

DNA is generally found as a double helix, composed of two chains, or strands, of nucleotides held together by hydrogen bonds. A good analogy to this would be a spiral staircase, with the sides of the staircase being the strands, and the steps being the base pairs held together by hydrogen bonds.

Nucleotides

As was said above, DNA is composed of chains of nucleotides. Each nucleotide consists of deoxyribose (a 5-carbon sugar), which is bonded to a phosphate group and one of four nitrogenous bases. The sugar and the phosphate make what is usually referred to as the "sugar-phosphate" backbones of the DNA molecule by binding to the sugar and phosphate groups of other nucleotides. The nitrogenous base, on the inside of the double helix, makes hydrogen bonds with the nitrogenous base on the opposite strand.

Base Pairing

The four nitrogenous bases found in DNA are Guanine, Cytosine, Thymine and Adenine, abbreviated as G, C, T and A, respectively. Adenine and Guanine, being composed of two rings, are known as purines, while Cytosine and Thymine, being composed of one ring, are known as pyrimidines.

H-Bonds Purine Pyrimidine
2 Adenine Thymine
3 Guanine Cytosine


Gene Expression

Gene expression is the cell's control over which genes are expressed, and which are not, at a specific time. Why would a cell want to do this? The example we'll use to explore prokaryotic gene expression is the lac operon. The lac operon controls the transcription of the genes needed to make an enzyme that breaks down lactose. The enzyme splits lactose into glucose and galactose, and the glucose is used in cell respiration to make energy for the cell. The cell takes up lactose from the environment and breaks it down. But if there is already glucose in the environment, what is the point of taking up lactose when it requires an extra step to make it useful? The cell conserves energy and protein by not making the enzymes required for lactose metabolism.


Splicing


RNA splicing is the removal of introns, or intervening sequences (parts that do not code for protein and lie between coding regions). The coding regions are known as exons (expressed sequences). At each end of an intron there is a short sequence that "small nuclear ribonucleoproteins" (snRNPs) recognize. Together with other proteins, the snRNPs form an assembly known as a spliceosome, which cuts out the introns and then joins the exons.


Note: Only eukaryotes contain spliceosomal introns in the precursor of messenger RNA (mRNA). Prokaryotes, such as bacteria, do not.


Polyadenylation

Polyadenylation occurs in eukaryotes to protect the RNA transcript from degradation. A number of enzymes are involved, which add a long tail of adenines (typically tens to a few hundred) to the 3' end of an mRNA transcript. Many of these adenines will be lost before translation, but enough are added to prevent degradation of the transcript prior to translation.


Recombinant DNA Cloning Technology

DNA CLONING

This term refers to the process of transferring a DNA fragment of interest from one organism to a self-replicating genetic element, e.g. a bacterial plasmid. The DNA of interest can then be propagated in a foreign host cell. This technology was introduced in the 1970s; since then it has become widely used and remains common practice in molecular biology labs.


Transposition

Transposition is the integration of transposable elements into the genome. Transposable elements are DNA segments that jump around the genome and integrate themselves into different regions.

Discovery

The first description of mobile genetic elements in a genome was made by Barbara McClintock working at Cold Spring Harbor in the 1950s. While attempting to explain the odd phenotypic behavior of mosaic color striations on corn kernels, she came to the conclusion that there were genetic elements in corn that could move among the chromosomes. Although her experimental support was strong, her conclusions were so far from the mainstream understanding of the nature of chromosomes that she was politely ignored. In the late 1970s the discovery of bacterial transposons directed renewed attention to her pioneering work, and her efforts were resoundingly accepted when she was awarded an unshared Nobel Prize in 1983.

Uses in Genetics

Transposable elements are very useful in studying the genome. They allow researchers to search for genes and enhancers and find interesting relationships between phenotypes and genotypes. By using p-elements and transposase, DNA constructs can be made to jump randomly around the genome. P-elements flank the DNA sequence you want to jump around, and transposase is used to cut the sequence out and reinsert it elsewhere.

One method for using transposable elements to find various enhancer sites is to construct a transposable element that contains the Gal-4 gene with a weak promoter, flanked by a p-element upstream and downstream. A second construct is also created, with a UAS site (when the gal-4 protein binds to the UAS site, anything downstream is expressed) and a downstream marker gene that can, for example, encode a fluorescent protein. By inserting these constructs into 2 strains of flies, you will have 1 strain with the transposable Gal-4 gene and another strain with the stationary UAS construct. A third fly strain containing the transposase gene is needed.

Strain 1: Transposable Gal-4 strain

Strain 2: UAS marker Strain

Strain 3: Transposase Strain

First, strains 1 and 3 are crossed to produce flies with both the p-elements and the transposase. This allows the construct with gal-4 to jump around to random locations in the genome, and depending on the location of gal-4, the amount of gal-4 expression will be changed. Afterwards, the gal-4 construct is stabilized by crossing these flies with normal flies, and the offspring are genotyped to find the flies with only the gal-4 gene. These flies are then crossed brother to sister to produce specific lines of homozygous gal-4 mutants. These new lines are then crossed with the UAS marker strain 2 and the effects on phenotype are observed. If, for example, a gal-4 construct lands next to a tissue-specific enhancer for the eye, the gal-4 protein will bind to the UAS site and the marker gene will be expressed, causing, for example, the tissue color to be green. Using probes for the known gal-4 sequence, the region of DNA is isolated and sequenced to find the enhancer site. This method can be used to find various enhancers that can be used in other experiments.


Developmental Genetics

Every cell in a multicellular organism contains the same genome, yet in a human being there are eye cells, bone cells, and brain cells, which all serve different purposes while being genetically identical. How can this be?

There are two key differences between cells that we call differentiated.

  1. Cells differentiate through a complex system that involves the establishment of positional information and cell differentiation through paracrine signaling and other processes.
  2. The genome interacts with proteins which enable or disable certain genes. Due to the initial developmental processes, a different set of proteins is maintained in a given cell, allowing for an identical genome but with entirely different functions, and proteins.

When development begins, positional information must be established.

Once positional information is established, cells begin to change due to differences in gene transcription across the developing embryo. Some cells become intermediates: cells that would never be found in an adult organism but nonetheless act as a starting point for a number of other cells. These are known as pluripotent cells.

There are roughly 200 different cell types in human beings, so not every cell can be "unique". When development occurs, pluripotent cells divide and, through signalling, prescribe a cell fate to a set of cells known as a developmental field. The end result is a large number of grouped cells that all develop identically, into a muscle for example, while a set somewhere else will develop into something else altogether.


Population Genetics

Population Genetics is the field of genetics which studies allele distributions and genetic variation in populations. Population geneticists study the processes of mutation, migration, natural selection and genetic drift on populations, and in doing so are studying evolution as it occurs.

Editor's note
Chapter or book approach: The approach of this chapter / book is to work through a series of models. The first model will be the Hardy-Weinberg Model, and then progressively, the models will move away from the premises of Hardy-Weinberg.

Foundations of Population Genetics

Templeton states three premises of population genetics:

  1. Templates/DNA can replicate
  2. Templates/DNA can mutate and recombine
  3. Phenotypes emerge from the interaction of templates/DNA and environment.

Replication of Populations

There are three major properties that a population must maintain with replication:

  1. They are composed of reproducing individuals
  2. They are distributed over space and time
  3. They host a population of genes

The first property indicates that individuals of the population must reproduce to keep the population stable. This is necessary because individuals break down over time due to the introduction of entropy and the inability of an individual to continuously remove the entropy added by the environment. Thus, to maintain the population, individuals must pass down their DNA, or organizational encoding, to the next generation. Through continuous reproduction, a population can be maintained over a much longer time than the individuals that comprise it. In addition, continuous reproduction over time enables the population to have properties and components of its own.

The second property is that a population is distributed over space. Populations can exist as:

  • small isolated groups
  • a collection of groups with a varied amount of genetic exchange
  • a large interbreeding population that exists over a vast space

In general though the population can be divided into a primary group that can be considered as interbreeding and a secondary group that mates occasionally with the primary group. It is this primary group that population geneticists generally study, as it is generally stable. They define the group as a group of interbreeding individuals that share a common system of mating. The secondary group is generally ignored, and treated as noise in the system, unless it is having a major effect on the primary group.

The third property that must be maintained with reproduction is the population's gene pool. The gene pool is the collection of all the genes, or organizational templates, in the population that can be used to create new individuals. By studying this gene pool, geneticists can determine the frequency of alleles, or of groups of alleles, in the population and how they are changing over time. From the patterns in the results, geneticists can then start to understand what forces are acting on the population.

Template Mutations

Change is a requirement of evolution, and one method of introducing change is through modification of the templates used to sustain the population. In the case of living organisms, these templates are genes.

Sources of Mutation:

  • Insertions
  • Deletions
  • Single Nucleotide Substitutions (sometimes changing the protein sequences and sometimes not)
  • Transpositions
  • Duplications

An allele is an alternate form of a template. In the case of biological systems, an allele is a form of a gene. Zooming further out, a version of a region of templates is called a haplotype. In a biological system this would be a sequence of nucleotides, while in a computer system it would be a sequence of linked objects.

Modeling Evolution

Initially we are going to consider populations with simple genetic architectures: first one locus with two alleles, then two loci with two alleles each.

Hardy-Weinberg Model

The genetic architecture of the Hardy-Weinberg model is a one-locus, two-allele model (Templeton, p. 35).

Hardy-Weinberg Equilibrium

If a population has no forces of evolution acting upon it, it is in Hardy-Weinberg equilibrium. Quantitatively, this says that if the proportions of two alleles A and a are denoted p and q, then the genotype proportions will be such that the homozygote AA has proportion p^2, the heterozygote Aa has proportion 2pq and the homozygote aa has proportion q^2.
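As a quick numerical check of these proportions, the sketch below computes the three expected genotype proportions from the frequency p of allele A. The function name `hardy_weinberg` is only illustrative.

```python
def hardy_weinberg(p):
    """Expected genotype proportions for allele frequencies p (A) and q = 1 - p (a)."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

proportions = hardy_weinberg(0.3)
print(proportions)                # roughly AA 0.09, Aa 0.42, aa 0.49
print(sum(proportions.values()))  # the three proportions add up to 1 (up to rounding)
```

The proportions always sum to one because p^2 + 2pq + q^2 = (p + q)^2 and p + q = 1.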

Testing for Hardy-Weinberg

To test whether a population is in Hardy-Weinberg equilibrium, use a chi-square (\chi^2) test, where:

 \chi ^2 = \sum \frac{(observed - expected)^2}{expected}

The degrees of freedom are the number of genotypes minus the number of alleles. If \chi^2 is less than the table value, we accept that our population is in Hardy-Weinberg equilibrium.

Note that the statistic itself is not squared after summing; the "square" in its name is there out of tradition.

Concise Chisquare Table

Degrees of Freedom 1 2 3
χ2 Value 3.84 5.99 7.81


Example

If there is a population with genotype proportions AA: 0.1 Aa: 0.4 aa: 0.5

So the allele proportions are  p = 0.1 + \frac{1}{2} 0.4 = 0.3

 q = 0.5 + \frac{1}{2} 0.4 = 0.7

So our expected values are AA: 0.09 Aa: 0.42 aa: 0.49.

 \chi ^2 = \sum \frac{(observed - expected)^2}{expected}
 \chi ^2 = \frac{(0.1-0.09)^2}{0.09} + \frac{(0.4 - 0.42)^2}{0.42}+ \frac{(0.5 - 0.49)^2}{0.49}
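 \chi ^2 \approx 0.002 < 3.84

Therefore we can accept the hypothesis that the population is in Hardy-Weinberg equilibrium and that no forces of evolution are acting on these alleles. Note that the calculation above uses proportions; in practice the statistic is computed from genotype counts, which is equivalent to multiplying the value above by the number of individuals sampled.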
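The same test can be sketched in code. A chi-square statistic is normally computed on counts rather than proportions, so this example assumes a hypothetical sample of 100 individuals with the genotype proportions used above (10 AA, 40 Aa, 50 aa). The function names are illustrative, and 3.84 is the table value for one degree of freedom.

```python
def allele_freq_p(n_AA, n_Aa, n_aa):
    """Frequency of allele A: each AA carries two copies, each Aa carries one."""
    total = n_AA + n_Aa + n_aa
    return (2 * n_AA + n_Aa) / (2 * total)

def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed counts with Hardy-Weinberg expectations."""
    n = n_AA + n_Aa + n_aa
    p = allele_freq_p(n_AA, n_Aa, n_aa)
    q = 1.0 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_1_DF = 3.84  # table value for 1 degree of freedom

chi2 = hw_chi_square(10, 40, 50)  # hypothetical sample of 100 individuals
print(chi2)
print("consistent with Hardy-Weinberg" if chi2 < CRITICAL_1_DF else "reject Hardy-Weinberg")
```

With this sample size the statistic is about 0.23, still well below 3.84. The same proportions in a much larger sample would give a proportionally larger statistic.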

Two Autosomal Loci, Two Allele Model

This section extends the H.W. model to two autosomal loci, each with two alleles.

For the purpose of this discussion, the first locus will have alleles A and a and the second locus will have B and b. From this we can get the following gamete types and their frequencies through recombination:

Gamete Frequency
AB FreqAB
Ab FreqAb
aB FreqaB
ab Freqab
Sum 1

A population producing the above four gametes can produce the following genotypes:

Gametes AB Ab aB ab
AB FreqAB . FreqAB FreqAB . FreqAb FreqAB . FreqaB FreqAB . Freqab
Ab FreqAb . FreqAB FreqAb . FreqAb FreqAb . FreqaB FreqAb . Freqab
aB FreqaB . FreqAB FreqaB . FreqAb FreqaB . FreqaB FreqaB . Freqab
ab Freqab . FreqAB Freqab . FreqAb Freqab . FreqaB Freqab . Freqab

If you analyze the table above, it can be noticed that there are only ten unique combinations. Four combinations correspond to homozygous zygotes and the remaining six are the heterozygous zygotes.
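The count of ten can be checked by enumerating unordered pairs of gametes. In this sketch the gamete frequencies are made-up numbers chosen only so that they sum to one; each heterozygote appears twice in the 4 x 4 table, so its frequency is doubled.

```python
from itertools import combinations_with_replacement

def genotype_frequencies(freq):
    """Frequencies of the unique two-locus genotypes under random union of gametes."""
    result = {}
    for g1, g2 in combinations_with_replacement(sorted(freq), 2):
        # Off-diagonal cells occur twice in the table (g1/g2 and g2/g1).
        factor = 1 if g1 == g2 else 2
        result[(g1, g2)] = factor * freq[g1] * freq[g2]
    return result

# Hypothetical gamete frequencies (assumed for illustration).
gamete_freq = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}
genotypes = genotype_frequencies(gamete_freq)

homozygous = [g for g in genotypes if g[0] == g[1]]
heterozygous = [g for g in genotypes if g[0] != g[1]]
print(len(genotypes), len(homozygous), len(heterozygous))  # 10 4 6
print(sum(genotypes.values()))  # genotype frequencies add up to 1 (up to rounding)
```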

Notice that the sum of FreqAB, FreqAb, FreqaB, and Freqab is one. This follows from the earlier model where the sum of p and q equaled one.

Recombination

(this section is used to discuss how we get term r)

Homozygous Case

AB/AB or ab/ab or aB/aB or Ab/Ab

Single Heterozygous Case

AB/Ab or AB/aB or aB/ab or Ab/ab

Double Heterozygous Case

AB/ab or Ab/aB

Note that recombination is only noticeable in double heterozygotes.

Linkage Disequilibrium

(this section is used to discuss how we get term D)

Linkage disequilibrium measure, \delta

Formally, if we define pairwise LD, we consider indicator variables on alleles at two loci, say I_1, I_2. We define the LD parameter \delta as:

\delta := \operatorname{cov}(I_1, I_2) = h_{12} - p_1 p_2 = h_{AB}h_{ab}-h_{Ab}h_{aB}

Here p_1, p_2 denote the marginal allele frequencies at the two loci and h_{12} denotes the haplotype frequency in the joint distribution of both alleles. Various derivatives of this parameter have been developed. In the genetic literature the wording "two alleles are in LD" usually means \delta \ne 0. Conversely, linkage equilibrium denotes the case \delta = 0. The same quantity is often written D.
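The two ways of writing the measure can be compared numerically: as the haplotype frequency minus the product of the allele frequencies (the covariance form), and as the difference of products of haplotype frequencies. The haplotype frequencies below are made-up values that sum to one, and the function names are illustrative.

```python
def ld_from_haplotypes(h_AB, h_Ab, h_aB, h_ab):
    """LD as h_AB * h_ab - h_Ab * h_aB."""
    return h_AB * h_ab - h_Ab * h_aB

def ld_from_covariance(h_AB, h_Ab, h_aB, h_ab):
    """LD as the covariance form: haplotype frequency minus product of allele frequencies."""
    p_A = h_AB + h_Ab  # marginal frequency of allele A
    p_B = h_AB + h_aB  # marginal frequency of allele B
    return h_AB - p_A * p_B

# Hypothetical haplotype frequencies (assumed for illustration).
haplotypes = (0.4, 0.1, 0.1, 0.4)
print(ld_from_haplotypes(*haplotypes))  # about 0.15
print(ld_from_covariance(*haplotypes))  # same value, computed the other way

# At linkage equilibrium every haplotype frequency is a product of allele frequencies.
print(ld_from_haplotypes(0.25, 0.25, 0.25, 0.25))  # 0.0
```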

Pulling Information from the Model

Is Evolution Occurring

If r > 0 and D != 0 then evolution is occurring

If r = 0 or if D equals 0, then no evolution is occurring
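One way to see why both conditions matter is the standard random-mating result that disequilibrium shrinks by a factor (1 - r) each generation, D_t = D_0 (1 - r)^t. That formula is not derived in this text, so treat the sketch below as an assumption-based illustration with made-up starting values.

```python
def ld_after(d0, r, generations):
    """Disequilibrium after some generations, assuming D_t = D_0 * (1 - r)**t."""
    return d0 * (1.0 - r) ** generations

# Hypothetical starting disequilibrium and recombination rates.
for r in (0.0, 0.1, 0.5):
    trajectory = [round(ld_after(0.15, r, t), 4) for t in range(5)]
    print(r, trajectory)

# With D = 0 there is nothing to decay, whatever the value of r.
print(ld_after(0.0, 0.5, 10))  # 0.0
```

With r = 0 the value of D never changes, and with D = 0 it stays at zero, matching the two "no evolution" cases above.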

Mutation and Disequilibrium - Normalized Linkage Disequilibrium

Mating Systems

To do:
To be created.

Selection

To do:
To be created.

Absolute Fitness

Relative Fitness

Frequency before selection

Frequency after selection

Genetic Drift

Editor's note
This section is in a total flux right now and is being outlined.

First experiments were done in the early 30's. German was a major scientific language before the second world war.

Genetic drift is a function of population size N.

The effect of genetic drift is inversely proportional to population size. This means that as the population size increases, the deviation from expected allele frequencies decreases (Templeton, p. 84).

Genetic drift is non-directional.

There is no attraction to return to ancestral allele frequencies.

Genetic drift is a cumulative function. Changes in allele frequencies from the previous generations are added to the changes that occur in the current generation.

Genetic drift only occurs when there is variability.

To study genetic drift this section will create a simplified model and expand upon it. The model will just have two states, a) genetic drift is occurring and b) genetic drift is not occurring.

a->b: fixation
b->a: mutations causing differentiable patterns
b->a: reintroduction of differentiable patterns

Genetic drift can be broken down into two simplified states to start. The f

Example

In each population, 2N=32 and p_0=q_0=0.5 initially, and data were collected for 19 generations. (Buri, P. 1956. "Gene frequency in small populations of mutant Drosophila." Evolution 10: 367-402.)

(See Figure 6.3 from Hedrick, P.W. 2005, Genetics of populations, 3rd edition. Jones and Bartlett, Sudbury, MA)

Note that in this figure, variance is increasing, but mean allele frequency over populations is staying relatively the same.

Simulation of Genetic Drift

Genetic Drift can be simulated using a Monte Carlo Simulation. (See Figure 6.2 in Hedrick)

The proportion of populations expected to go to fixation for a given allele is equal to the initial frequency of that allele

Only the allele frequencies within each population, and hence the distribution of allele frequencies across populations, are changing. The mean allele frequency over multiple replicate populations does not change due to genetic drift.
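A Monte Carlo simulation of this kind can be sketched in a few lines. The version below assumes a simple Wright-Fisher style model, in which each generation is formed by drawing 2N allele copies at random from the previous one, and it borrows 2N = 32, p = 0.5 and 19 generations from the example above. The function name, the seed and the number of replicates are arbitrary choices.

```python
import random

def simulate_drift(two_n, p0, generations, rng):
    """Follow one population; return the final frequency of the allele."""
    count = round(p0 * two_n)
    for _ in range(generations):
        p = count / two_n
        # Draw 2N allele copies; each is the focal allele with probability p.
        count = sum(1 for _ in range(two_n) if rng.random() < p)
        if count in (0, two_n):  # allele lost or fixed: no variability left
            break
    return count / two_n

rng = random.Random(1)  # fixed seed so the run is repeatable
finals = [simulate_drift(32, 0.5, 19, rng) for _ in range(1000)]

mean_p = sum(finals) / len(finals)
fixed_or_lost = sum(1 for p in finals if p in (0.0, 1.0))
print("mean allele frequency over replicates:", mean_p)
print("replicates that reached fixation or loss:", fixed_or_lost)
```

Individual populations wander away from 0.5, and some lose or fix the allele, while the mean over the replicates stays close to the starting frequency.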

Heterozygosity, or the variance of the allele frequency over the replicate populations, can be used to understand genetic drift.

We can calculate the number of generations necessary to reduce the heterozygosity to a given fraction of its initial value:

t = -2 * N * ln(x), where x is the fraction of the heterozygosity that is left, and N is the population size.
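The formula follows from the approximate decay of heterozygosity under drift, H_t = H_0 (1 - 1/(2N))^t ≈ H_0 e^(-t/(2N)). A small sketch of the calculation is given below; the function name and the choice of N = 16 (matching 2N = 32 in the Buri example) are assumptions for illustration.

```python
import math

def generations_to_reach(x, n):
    """t = -2N ln(x): generations until a fraction x of the
    initial heterozygosity is left in a population of size n."""
    return -2 * n * math.log(x)

# Generations needed to halve heterozygosity when N = 16 (assumed value)
print(round(generations_to_reach(0.5, 16), 2))
```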

Coalescence Theory

To do:
To be created.

Further Reading in Wikipedia

Glossary

Gene Pool - p. 37 and p. 38 of Templeton's book

Formulas

Hardy Weinberg

Example 1

Homozygotes Heterozygotes Sum
Observed 85 15 100
Expected 60 40 100

The probability of observing these numbers is:


P=\frac{N!}{(N_{homo,~observed}!)\cdot(N_{hetero,~observed}!)}\cdot freq(Homo_{expected})^{N_{homo,~observed}}\cdot freq(Hetero_{expected})^{N_{hetero,~observed}}


P=\frac{100!}{(85!)\cdot(15!)}\cdot (0.6)^{85}\cdot (0.4)^{15}
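The expression above is a binomial probability, so it can be evaluated directly. The sketch below uses Python's standard `math.comb`; the function name is hypothetical.

```python
from math import comb

def genotype_count_probability(n_homo, n_hetero, freq_homo, freq_hetero):
    """Binomial probability of observing exactly these genotype counts
    given the expected genotype frequencies."""
    n = n_homo + n_hetero
    return comb(n, n_hetero) * freq_homo ** n_homo * freq_hetero ** n_hetero

p_obs = genotype_count_probability(85, 15, 0.6, 0.4)
print(f"P = {p_obs:.3e}")
```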

Example 2

Genotype Count
Homodom 20
Hetero 50
Homorec 30

What are the observed frequencies of p and q?

p = \frac{2\cdot Homo_{dom} + Hetero}{2\cdot N}

q = \frac{2\cdot Homo_{rec} + Hetero}{2\cdot N}


p = \frac{2\cdot 20 + 50}{2\cdot 100} = 0.45

q = \frac{2\cdot 30 + 50}{2\cdot 100} = 0.55

What are the expected numbers of each genotype?

freq_{homo,~dom}=p^2 \qquad N_{homo,~dom}=freq_{homo,~dom}\cdot N
freq_{hetero}=2pq \qquad N_{hetero}=freq_{hetero}\cdot N
freq_{homo,~rec}=q^2 \qquad N_{homo,~rec}=freq_{homo,~rec}\cdot N

Do the observed genotypes fit H-W expectation?

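\chi^2=\sum\limits^{n}_{i=1}\frac{(observed_i-expected_i)^2}{expected_i}

G=2\sum\limits^{n}_{i=1}observed_i \cdot \ln{\frac{observed_i}{expected_i}}

If χ² is greater than the critical value for the test's degrees of freedom (one here: three genotype classes, minus one, minus one for the allele frequency estimated from the data), then the null hypothesis, in this case that the genotypes fit Hardy-Weinberg expectations, can be rejected.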
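Example 2 can be worked through end to end as follows. This is a sketch under stated assumptions: the function name is hypothetical, allele frequencies are obtained by allele counting (each homozygote carries two copies of its allele, each heterozygote one), and the comparison uses 3.841, the 5% critical value of the chi-squared distribution with one degree of freedom.

```python
import math

def hardy_weinberg_test(n_dom, n_het, n_rec):
    """Return p, q, expected genotype counts, chi-squared and G."""
    n = n_dom + n_het + n_rec
    p = (2 * n_dom + n_het) / (2 * n)
    q = (2 * n_rec + n_het) / (2 * n)
    observed = [n_dom, n_het, n_rec]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Classes with zero observations contribute nothing to G
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    return p, q, expected, chi2, g

p, q, expected, chi2, g = hardy_weinberg_test(20, 50, 30)
CRITICAL_5PCT_1DF = 3.841  # assumed significance level of 5%, 1 degree of freedom
print(p, q, expected)
print(f"chi2 = {chi2:.4f}, G = {g:.4f}, reject H-W: {chi2 > CRITICAL_5PCT_1DF}")
```

For these counts both statistics are far below the critical value, so the fit to Hardy-Weinberg expectations is not rejected.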

Example 3 - Next Generation

For an autosomal locus

p_{next~generation}=homo_{dom}+ \frac{1}{2} \cdot hetero

q_{next~generation}=1-p_{next~generation}

Example 4 - Sex Linked (1 locus, 2 allele)

p_{next,~male}=p_{female}

q_{next,~male}=q_{female}

p_{next,~female}=\frac{p_{female}+p_{male}}{2}

q_{next,~female}=\frac{q_{female}+q_{male}}{2}

The difference between the female and male allele frequencies halves and changes sign every generation. It is possible to solve for the number of generations it will take until this deviation is less than or equal to a particular threshold by using the following formula:

\frac{deviation_{threshold}}{deviation_{initial}}=(-0.5)^{t}


t=\left\lceil \frac{\log{}\left|\frac{deviation_{threshold}}{deviation_{initial}}\right|}{\log0.5}\right\rceil
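The sex-linked recursion can be iterated directly to watch the male and female frequencies converge. In this sketch the starting frequencies (0.8 and 0.2), the threshold and the function names are assumptions for illustration; the number of generations is taken as the smallest whole t for which the absolute deviation, which shrinks as 0.5^t, is at or below the threshold.

```python
import math

def next_generation(p_female, p_male):
    """Males receive their X from their mother; females receive one X
    from each parent. Returns (next p_female, next p_male)."""
    return (p_female + p_male) / 2, p_female

def generations_until(threshold, initial_deviation):
    """Smallest whole t with |initial_deviation| * 0.5**t <= threshold."""
    ratio = threshold / abs(initial_deviation)
    return max(0, math.ceil(math.log(ratio) / math.log(0.5)))

p_f, p_m = 0.8, 0.2  # hypothetical starting frequencies
for t in range(4):
    print(t, round(p_f, 4), round(p_m, 4), round(p_f - p_m, 4))
    p_f, p_m = next_generation(p_f, p_m)

print(generations_until(0.01, 0.6))
```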


Example 5 - Inbreeding - Self Pollination

Genotype Count
Homodom 20
Hetero 50
Homorec 30

Calculate F Value

p = \frac{2\cdot homo_{dom} + hetero}{2N}

q = \frac{2\cdot homo_{rec} + hetero}{2N}

Heterozygosity_{observed}=H_{o}=\frac{Hetero_{observed}}{N}


Heterozygosity_{ expected } =H_{e} = 2pq\,


F=1-\left( \frac{H_{o}}{H_{e}}\right)

Chi-squared then can be calculated by:

\chi^{2}=F^{2}N\,
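Applied to the counts in the table for this example, the calculation might look like the sketch below (the function name is hypothetical). A negative F indicates a slight excess of heterozygotes over the Hardy-Weinberg expectation.

```python
def inbreeding_from_counts(n_dom, n_het, n_rec):
    """Return F = 1 - Ho/He and the chi-squared value F^2 * N."""
    n = n_dom + n_het + n_rec
    p = (2 * n_dom + n_het) / (2 * n)
    q = (2 * n_rec + n_het) / (2 * n)
    h_obs = n_het / n      # observed heterozygosity
    h_exp = 2 * p * q      # expected heterozygosity
    f = 1 - h_obs / h_exp
    return f, f * f * n

f, chi2 = inbreeding_from_counts(20, 50, 30)
print(f"F = {f:.4f}, chi2 = {chi2:.4f}")
```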

Example 6 - Inbreeding (1 Locus / 2 alleles)

Given the frequency of the dominant allele and an F value:

Allele Frequency
p 0.6
q 0.4


Heterozygosity_{expected}=H_{e}=2pq\,


Heterozygosity_{observed}=H_{o}=(1-F)\cdot H_{e}

What sample size would be necessary to detect an effect of F=0.01 at the S% significance level?

  1. Look up the critical χ² value (1 degree of freedom) for the S% significance level (usually S = 5%)
  2. Use the following formula:

N=\frac{\chi^2 }{F^2}
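A short sketch of both calculations in this example follows. The function names are hypothetical, and the default critical value of 3.841 assumes a 5% significance level with one degree of freedom.

```python
def observed_heterozygosity(p, f):
    """Ho = (1 - F) * He, with He = 2pq and q = 1 - p."""
    return (1 - f) * 2 * p * (1 - p)

def sample_size_to_detect(f, chi2_critical=3.841):
    """N = chi-squared critical value / F^2."""
    return chi2_critical / f ** 2

print(round(observed_heterozygosity(0.6, 0.01), 4))
print(round(sample_size_to_detect(0.01)))
```

Because N grows with 1/F², halving the F value to be detected quadruples the required sample size.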

Example 7 - Null Allele

Genotype Count
Homodom 20
Hetero 50
Homorec 30

p = \frac{2\cdot homo_{dom} + hetero}{2N}

q = \frac{2\cdot homo_{rec} + hetero}{2N}

Heterozygosity_{observed}=H_{o}=\frac{Hetero_{observed}}{N}

Heterozygosity_{ expected } =H_{e} = 2pq\,

p_{null}=\frac{H_{e}-H_{o}}{1+H_{e}}


p_{adjusted}=p \cdot (1-p_{null})

q_{adjusted}=q \cdot (1-p_{null})

Selection Part 1

Example 1

Genotypes Homodom Heterozygotes Homorec
Viability 0.7 1 0.6
Fertility 11 6 3


Genotypes Homodom Heterozygotes Homorec
Viability (v) 0.7 1 0.6
Fertility (f) 11 6 3
Absolute Fitness (W) v_{homo,~dom}\cdot f_{homo,~dom} v_{hetero}\cdot f_{hetero} v_{homo,~rec}\cdot f_{homo,~rec}
Relative Fitness (w) \frac{W_{abs,~homo,~dom}}{W_{abs,~max}} \frac{W_{abs,~hetero}}{W_{abs,~max}} \frac{W_{abs,~homo,~rec}}{W_{abs,~max}}

What are the selection coefficients?

sel_{homo,~dom}=1-w_{rel,~homo,~dom}

sel_{hetero}=1-w_{rel,~hetero}

sel_{homo,~rec}=1-w_{rel,~homo,~rec}
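The table and the selection coefficients can be computed as in the sketch below; the dictionary keys are hypothetical labels for the three genotypes.

```python
viability = {"homo_dom": 0.7, "hetero": 1.0, "homo_rec": 0.6}
fertility = {"homo_dom": 11, "hetero": 6, "homo_rec": 3}

# Absolute fitness W = viability * fertility
absolute = {g: viability[g] * fertility[g] for g in viability}

# Relative fitness w = W / largest W
w_max = max(absolute.values())
relative = {g: w / w_max for g, w in absolute.items()}

# Selection coefficient s = 1 - w
selection = {g: 1 - w for g, w in relative.items()}

for g in viability:
    print(g, round(absolute[g], 2), round(relative[g], 4), round(selection[g], 4))
```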


Example 2 - Mean Fitness Calculation

Genotypes Homodom Heterozygotes Homorec
Freq 0.3 0.6 0.1
Rel. Fitness 1 0.6 0.1

What is the mean fitness of this population?

w_{mean}=w_{rel,~homo,~dom}\cdot Homo_{dom} +w_{rel,~hetero}\cdot Hetero + w_{rel,~homo,~rec} \cdot Homo_{rec}

Example 3 - Natural Selection Analysis

Genotypes Homodom Heterozygotes Homorec
Freq 0.3 0.6 0.1
Rel. Fitness 1 0.6 0.1

What is the mean fitness of this population?

w_{mean}=w_{rel,~homo,~dom}\cdot Homo_{dom} +w_{rel,~hetero}\cdot Hetero + w_{rel,~homo,~rec} \cdot Homo_{rec}

p = Homo_{dom} + \frac{Hetero}{2}

q = Homo_{rec} + \frac{Hetero}{2}

Homo_{dom,~next}=\frac{w_{rel,~homo,~dom}\cdot p^2}{w_{mean}}

Hetero_{next}=\frac{w_{rel,~hetero}\cdot 2pq}{w_{mean}}

Homo_{rec,~next}=\frac{w_{rel,~homo,~rec}\cdot q^2}{w_{mean}}

p_{next} = Homo_{dom,~next} + \frac{Hetero_{next}}{2}

q_{next} = Homo_{rec,~next} + \frac{Hetero_{next}}{2}

\Delta p = p_{next} - p\,
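One generation of selection can be sketched as below. Note the assumption, flagged in the code as well: zygotes are taken to be in Hardy-Weinberg proportions (p², 2pq, q²), and the mean fitness is computed from those proportions so that the genotype frequencies after selection sum to one. This gives a slightly different mean fitness from the value obtained with the tabulated genotype frequencies. The function name is hypothetical.

```python
def one_generation_of_selection(p, w_dom, w_het, w_rec):
    """Assumes Hardy-Weinberg proportions before selection (an assumption
    of this sketch). Returns mean fitness, genotype frequencies after
    selection, the new p, and the change in p."""
    q = 1 - p
    w_mean = w_dom * p * p + w_het * 2 * p * q + w_rec * q * q
    dom_next = w_dom * p * p / w_mean
    het_next = w_het * 2 * p * q / w_mean
    rec_next = w_rec * q * q / w_mean
    p_next = dom_next + het_next / 2
    return w_mean, (dom_next, het_next, rec_next), p_next, p_next - p

p = 0.3 + 0.6 / 2  # allele frequency from the tabulated genotype frequencies
w_mean, genotypes_next, p_next, delta_p = one_generation_of_selection(p, 1.0, 0.6, 0.1)
print(round(w_mean, 4), [round(x, 4) for x in genotypes_next])
print(round(p_next, 4), round(delta_p, 4))
```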

Selection Part 2

Example 1

Homodom Heterozygotes Homorec
Relative Fitness (females) 0.6 0.8 1
Relative Fitness (males) 1 0.8 0.6

What are the equilibrium allele frequencies p_f and q_f in females, given that heterozygotes have intermediate fitness?

selection_{female}=s_{f}=1-fitness_{homo,~dom,~female}=1-w_{11,~f}

In males the recessive homozygote is the least fit genotype, so the male selection coefficient is measured from it:

selection_{male}=s_{m}=1-fitness_{homo,~rec,~male}=1-w_{22,~m}

q_{equilibrium,~female}=\frac{s_f - 1}{s_f}+\left(\frac{s_m \cdot s_f - s_f - s_m + 2}{2s_f \cdot s_m}\right)^{(1/2)}

p_{equilibrium,~female}=1-q_{equilibrium, female}\,


What are the equilibrium allele frequencies p_m and q_m in males, given that heterozygotes have intermediate fitness?

q_{equilibrium,~male}=\frac{1}{s_m}-\left(\frac{s_m \cdot s_f - s_f - s_m + 2}{2s_f \cdot s_m}\right)^{(1/2)}

p_{equilibrium,~male}=1-q_{equilibrium, male}\,

What range of s_f values will give a stable polymorphism?

\frac{s_m}{1-s_m}>s_f>\frac{s_m}{1+s_m}

What range of s_m values will give a stable polymorphism?

\frac{s_f}{1-s_f}>s_m>\frac{s_f}{1+s_f}
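The equilibrium formulas and the stability condition can be evaluated as in the following sketch. The function names are hypothetical. One assumption is made loudly here: the male coefficient is taken as one minus the fitness of the least fit male genotype (0.6 in the table), because a coefficient of zero would make the formulas undefined.

```python
import math

def equilibrium_q(s_f, s_m):
    """Equilibrium q in females and in males, using the formulas of this example."""
    root = math.sqrt((s_m * s_f - s_f - s_m + 2) / (2 * s_f * s_m))
    q_female = (s_f - 1) / s_f + root
    q_male = 1 / s_m - root
    return q_female, q_male

def stable_polymorphism(s_f, s_m):
    """True when s_m/(1+s_m) < s_f < s_m/(1-s_m); requires s_m < 1."""
    return s_m / (1 + s_m) < s_f < s_m / (1 - s_m)

s_f = 1 - 0.6  # females: least fit genotype has relative fitness 0.6
s_m = 1 - 0.6  # assumption: males' coefficient from their least fit genotype

q_female, q_male = equilibrium_q(s_f, s_m)
print(round(q_female, 4), round(1 - q_female, 4))
print(round(q_male, 4), round(1 - q_male, 4))
print(stable_polymorphism(s_f, s_m))
```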

Example 2 - Complete Dominance w/ Frequency Dependent Selection

Homodom Heterozygotes Homorec
Relative Fitness (w) 1.3 1.3 1.7

What are the selection coefficients s_{homo,~dom} and s_{hetero} with q = 0.02?

w_{homo,~dom}=w_{hetero}=1+\frac{s_{homo,~dom}}{1-q^2}

(w_{homo,~dom}-1) \cdot (1-q^2) = s_{homo,~dom}

s_{homo,~dom}=s_{hetero} = (w_{homo,~dom}-1) \cdot (1-q^2)


w_{homo,~rec}=1+\frac{s_{homo,~rec}}{q^2}

(w_{homo,~rec}-1) \cdot (q^2) = s_{homo,~rec}

s_{homo,~rec}= (w_{homo,~rec}-1) \cdot (q^2)

Calculate the stable equilibrium frequencies q_e and p_e, where s_1 = s_{homo,~dom} and s_2 = s_{homo,~rec}.

q_e = (\frac{s_2}{s_1+s_2})^{(1/2)}

p_e = 1-q_e\,

Calculate the mean fitness assuming HW equilibrium and using:

(a) current allele frequencies

w_{mean}=w_{homo,~dom}\cdot homo_{dom}+w_{hetero}\cdot hetero + w_{homo,~rec}\cdot homo_{rec}

w_{mean}=w_{homo,~dom}\cdot (p)^2+w_{hetero}\cdot 2pq+ w_{homo,~rec}\cdot (q)^2

(b) stable equilibrium allele frequencies

w_{mean}=w_{homo,~dom}\cdot homo_{dom}+w_{hetero}\cdot hetero + w_{homo,~rec}\cdot homo_{rec}

w_{mean}=w_{homo,~dom}\cdot (p_e)^2+w_{hetero}\cdot 2 p_e q_e+ w_{homo,~rec}\cdot (q_e)^2

(c) general formula

w_{mean} = 1 + s_1 + s_2
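The steps of this example can be checked numerically. In the sketch below the function names are hypothetical, s1 stands for the coefficient shared by the dominant homozygote and the heterozygote, and s2 for that of the recessive homozygote. Fitnesses are recomputed from q each time, since they are frequency dependent.

```python
import math

def coefficients(w_dom, w_rec, q):
    """Solve the two fitness equations for s1 and s2 at frequency q."""
    s1 = (w_dom - 1) * (1 - q * q)
    s2 = (w_rec - 1) * q * q
    return s1, s2

def fitnesses(s1, s2, q):
    """Frequency-dependent fitnesses of the dominant and recessive phenotypes."""
    return 1 + s1 / (1 - q * q), 1 + s2 / (q * q)

def mean_fitness(s1, s2, q):
    """Mean fitness under Hardy-Weinberg proportions at frequency q."""
    w_dom, w_rec = fitnesses(s1, s2, q)
    p = 1 - q
    return w_dom * p * p + w_dom * 2 * p * q + w_rec * q * q

s1, s2 = coefficients(1.3, 1.7, 0.02)
q_e = math.sqrt(s2 / (s1 + s2))
print(round(s1, 5), round(s2, 5), round(q_e, 4), round(1 - q_e, 4))
print(round(mean_fitness(s1, s2, 0.02), 5), round(mean_fitness(s1, s2, q_e), 5))
```

Both mean fitness values agree with the general formula 1 + s1 + s2, because the frequency terms cancel when the fitnesses are written as functions of q.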

Example 3 - Self Incompatibility Locus

Given initial frequencies:

(a) What are the frequencies of the S1, S2, and S3 alleles initially, and after one and two generations?

p_1=(1/2)\cdot (P_{S_{1}S_{2}}+P_{S_{1}S_{3}})

p_2=(1/2)\cdot (P_{S_{1}S_{2}}+P_{S_{2}S_{3}})

p_3=(1/2)\cdot (P_{S_{1}S_{3}}+P_{S_{2}S_{3}})

(b) After one generation?

P_{next,~S_{1}S_{2}}=(1/2)\cdot (1-P_{S_{1}S_{2}})

P_{next,~S_{1}S_{3}}=(1/2)\cdot (1-P_{S_{1}S_{3}})

P_{next,~S_{2}S_{3}}=(1/2)\cdot (1-P_{S_{2}S_{3}})


p_{next, ~1}=(1/2)\cdot (P_{next,~S_{1}S_{2}}+P_{next,~S_{1}S_{3}})

p_{next, ~2}=(1/2)\cdot (P_{next,~S_{1}S_{2}}+P_{next,~S_{2}S_{3}})

p_{next, ~3}=(1/2)\cdot (P_{next,~S_{1}S_{3}}+P_{next,~S_{2}S_{3}})

(c) After two generations?

P_{next,~S_{1}S_{2}}=(1/2)\cdot (1-P_{prev,~S_{1}S_{2}})

P_{next,~S_{1}S_{3}}=(1/2)\cdot (1-P_{prev,~S_{1}S_{3}})

P_{next,~S_{2}S_{3}}=(1/2)\cdot (1-P_{prev,~S_{2}S_{3}})


p_{next, ~1}=(1/2)\cdot (P_{next,~S_{1}S_{2}}+P_{next,~S_{1}S_{3}})

p_{next, ~2}=(1/2)\cdot (P_{next,~S_{1}S_{2}}+P_{next,~S_{2}S_{3}})

p_{next, ~3}=(1/2)\cdot (P_{next,~S_{1}S_{3}}+P_{next,~S_{2}S_{3}})
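The recursion can be iterated for any number of generations. Because the initial frequencies are not reproduced in this section, the starting genotype frequencies in the sketch below (0.5, 0.3, 0.2) are purely hypothetical, as are the function names.

```python
def allele_frequencies(p12, p13, p23):
    """Allele frequencies of S1, S2 and S3 from the genotype frequencies."""
    return 0.5 * (p12 + p13), 0.5 * (p12 + p23), 0.5 * (p13 + p23)

def next_genotypes(p12, p13, p23):
    """Each genotype frequency becomes half of one minus its previous value."""
    return 0.5 * (1 - p12), 0.5 * (1 - p13), 0.5 * (1 - p23)

genotypes = (0.5, 0.3, 0.2)  # hypothetical S1S2, S1S3, S2S3 frequencies
for generation in range(3):
    alleles = allele_frequencies(*genotypes)
    print(generation, [round(x, 4) for x in genotypes], [round(x, 4) for x in alleles])
    genotypes = next_genotypes(*genotypes)
```

Under this recursion each genotype's deviation from 1/3 halves and changes sign every generation, so the three alleles move toward equal frequencies.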


Example 4 - Fitness / Overdominance

Homodom Heterozygotes Homorec
Relative Fitness 0.4 1.0 0.8

To find the equilibrium of p:

s_{homo,~dom}=1-w_{homo,~dom}

s_{hetero}=1-w_{hetero}

s_{homo,~rec}=1-w_{homo,~rec}


p_{eq}=\frac{s_{homo,~rec}}{s_{homo,~dom}+s_{homo,~rec}}

q_{eq}=1-p_{eq}\,
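With the fitness values from the table, the equilibrium can be computed as in this sketch (the function name is hypothetical, and the heterozygote is assumed to have relative fitness 1, as in the table).

```python
def overdominant_equilibrium(w_dom, w_rec):
    """Equilibrium p and q when the heterozygote has relative fitness 1."""
    s_dom = 1 - w_dom
    s_rec = 1 - w_rec
    p_eq = s_rec / (s_dom + s_rec)
    return p_eq, 1 - p_eq

p_eq, q_eq = overdominant_equilibrium(0.4, 0.8)
print(round(p_eq, 4), round(q_eq, 4))
```

The allele whose homozygote suffers the stronger selection ends up at the lower equilibrium frequency.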

Example 5 - Selection and Viability

Genotypes Homodom Heterozygotes Homorec
Zygotic Frequencies 0.05 0.33 0.62
Viabilities 1 0.6 0.59


Genotypes Homodom Heterozygotes Homorec
Zygotic Frequencies (Z) 0.05 0.33 0.62
Viabilities (V) 1 0.6 0.59
Adult Freq After Selection \frac{Z_{homo,~dom}\cdot V_{homo,~dom}}{w_{mean}} \frac{Z_{hetero}\cdot V_{hetero}}{w_{mean}} \frac{Z_{homo,~rec}\cdot V_{homo,~rec}}{w_{mean}}

w_{mean}=Z_{homo,~dom}\cdot V_{homo,~dom}+Z_{hetero}\cdot V_{hetero}+Z_{homo,~rec}\cdot V_{homo,~rec}

p_{sel}=Homo_{dom,~sel}+ \frac{Hetero_{sel}}{2}

q_{sel}=Homo_{rec,~sel}+ \frac{Hetero_{sel}}{2}

or

q_{sel}=1-p_{sel}\,

Predicted Genotype Frequencies

Homo_{dom, exp}=p_{sel}^2\,

Hetero_{exp}=2p_{sel}q_{sel}\,

Homo_{rec, exp}=q_{sel}^2\,
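The whole example can be followed numerically with the sketch below. Variable names are hypothetical; allele frequencies among adults are obtained by allele counting, that is, the homozygote frequency plus half the heterozygote frequency.

```python
zygotes = {"homo_dom": 0.05, "hetero": 0.33, "homo_rec": 0.62}
viability = {"homo_dom": 1.0, "hetero": 0.6, "homo_rec": 0.59}

# Mean fitness: zygotic frequencies weighted by viability
w_mean = sum(zygotes[g] * viability[g] for g in zygotes)

# Adult genotype frequencies after viability selection
adults = {g: zygotes[g] * viability[g] / w_mean for g in zygotes}

# Allele frequencies among the surviving adults
p_sel = adults["homo_dom"] + adults["hetero"] / 2
q_sel = 1 - p_sel

# Predicted genotype frequencies in the next generation of zygotes
predicted = {
    "homo_dom": p_sel ** 2,
    "hetero": 2 * p_sel * q_sel,
    "homo_rec": q_sel ** 2,
}

print(round(w_mean, 4), {g: round(x, 4) for g, x in adults.items()})
print(round(p_sel, 4), round(q_sel, 4), {g: round(x, 4) for g, x in predicted.items()})
```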

Example 6

Homodom Heterozygotes Homorec
Observed Numbers 750 1000 250

What are the absolute and relative fitness for each?


Homodom Heterozygotes Homorec
Observed Numbers 700 1000 200
Predicted Numbers 500 1000 500

w_{abs,~homo,~dom}=\frac{Homo_{dom,~obs}}{Homo_{dom,~exp}}

w_{abs,~hetero}=\frac{Hetero_{obs}}{Hetero_{exp}}

w_{abs,~homo,~rec}=\frac{Homo_{rec,~obs}}{Homo_{rec,~exp}}


w_{abs,~max} = \max(w_{abs,~homo,~dom}, w_{abs,~hetero}, w_{abs,~homo,~rec})


w_{rel,~homo,~dom}=\frac{w_{abs,~homo,~dom}}{w_{abs,~max}}

w_{rel,~hetero}=\frac{w_{abs,~hetero}}{w_{abs,~max}}

w_{rel,~homo,~rec}=\frac{w_{abs,~homo,~rec}}{w_{abs,~max}}
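A sketch of the calculation follows. It uses the observed and predicted numbers from the second table of this example (700, 1000, 200 against 500, 1000, 500); the dictionary keys are hypothetical labels.

```python
observed = {"homo_dom": 700, "hetero": 1000, "homo_rec": 200}
predicted = {"homo_dom": 500, "hetero": 1000, "homo_rec": 500}

# Absolute fitness: observed number divided by predicted number
w_abs = {g: observed[g] / predicted[g] for g in observed}

# Relative fitness: absolute fitness divided by the largest absolute fitness
w_abs_max = max(w_abs.values())
w_rel = {g: w / w_abs_max for g, w in w_abs.items()}

for g in observed:
    print(g, round(w_abs[g], 4), round(w_rel[g], 4))
```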


Behavioral Genetics

TODO

Editor's note
This chapter is empty


Quantitative Genetics

General Genetics/Quantitative Genetics


Mutations and Evolution

It is often not the grand, overarching theories, such as the Neutral Theory of Evolution or the Modern Synthesis, that biologists work with in their research on a day-to-day basis. Instead, the beautiful basics of genetics are the cornerstone of experiments. The basics are then used to reflect on bigger ideas, such as population ecology and evolution. It is with these basics that we concern ourselves in this section.

Every organism has a set of genes, and half of the genes of that organism come from each parent. The combinations of these genes cause the variation among individuals within a species. The genes of a butterfly, an ape or a fowl carry the code that determines the appearance and the character of the butterfly, the ape and the fowl. The genetic code allows an overwhelming variety within each species. A mutation is a change in a gene relative to its normal configuration. The change can then be passed to successive offspring, thereby producing a marked difference. There are many different types of mutations. The smallest possible genetic mutation is a 'point mutation'. This occurs in the DNA when a base pairs with the 'wrong' partner. Multiple point mutations are common, and their number increases substantially under the effect of mutagens. Mutations can be heritable, and they can extend to part or all of a chromosome. Because many genes are affected by a chromosomal mutation, these mutations often have drastic ramifications for the offspring.


Ethical Issues

There are many ethical issues with cloning, and with genetics in general.

In the United States, many supermarkets are already selling genetically modified (GM) foods. Much of this produce makes its own pesticide, which kills the insects that feed on it, a trait borrowed from a group of bacteria that, to our knowledge, has no adverse effects on humans. The ethical issue comes into play here: most supermarkets do not advertise these foods as genetically modified. Even though no adverse effects are known to date, not allowing customers to know that they are purchasing GM foods is controversial. There are two main sides of the argument:

1) GM foods won't harm you, so why tell the people? If they have adverse effects once you tell them, it could be a self-fulfilling prophecy.

2) Even if it won't hurt us, you should still tell us so we have a choice in the matter. That's like giving us a certain drug without telling us: it's not ethical.


If you go into your local supermarket and ask the General Manager if they know whether GM foods are being sold, the answer will probably be a resounding "No", since there is no separate labeling for these foods. This reflects an underlying problem with the genetically modified food discussion. Which foods are genetically modified? Many seedless fruits exist as a result of primitive genetic manipulation in the form of accelerated artificial selection. Products such as corn and apples could not exist without human intervention in the form of breeding and cross-species grafting. If we place the boundary at the laboratory door, we demonize certain foods that have saved billions of lives, specifically golden rice and others with which we have become perfectly comfortable.

The main issue with genetically modified foods is that their long-term effect on the Earth's ecosystem is unknown. Improving a plant by natural selection took a long time, which gave the ecosystem time to adapt.


Cloning endangered species may be a good idea to keep the ecosystem in balance.

Last modified on 2 December 2012, at 13:59