BIOL 4160

Evolution

Phil Ganter

301 Harned Hall

963-5782


Genetic (and some Phenotypic) Variation


Types of Gene Variation

Structure of Genetic Material

  • Gene Structure
    • Alleles are varieties of Genes found at a particular Locus in the Genome
      • But the term gene is used loosely by many
      • concept of gene has expanded and altered well beyond the one-gene, one protein hypothesis
    • Types of Genes
      • Protein Coding Genes
        • Coding Sequence and Exons
          • Alternative Splicing - multiple exons can be combined in various ways to produce multiple proteins from one gene
          • Exon Shuffling - constructing new genes by combining exons from different loci
        • Introns (INTRagenic regiON - Walter Gilbert)
          • present in eukaryotic pre-mRNA (also called hnRNA - heterogeneous nuclear RNA)
          • Four classes
            • Self-Splicing Group I and Self-Splicing Group II Introns
              • both are important in organelle genes and Group I is important in rRNA (ribosomal RNA) processing; both splice spontaneously (no spliceosome), and Group I requires a guanine nucleoside as a cofactor
            • Spliceosomal Introns and Enzymatically-spliced Introns
              • Spliceosome is a complex of proteins and RNAs, mechanism related to Group II self-splicing
              • Enzymatically spliced mechanism not like other splicing mechanisms
        • Enhancer Region(s) where Transcription Factors (both enhancers and repressors) bind to regulate RNA polymerase binding
          • can be a long way from rest of gene as intervening DNA can loop and bring enhancer site close to promoter
        • Promoter Site where RNA polymerase binds to DNA
        • Poly-Adenylation Addition Site
          • binds the cleavage complex (that cleaves the RNA) and polyadenylate polymerase (that adds up to 250 or so A's)
        • Mature mRNA is stabilized by the 5' cap (a modified guanine nucleotide) and the 3' poly-A tail
      • Non-Coding RNA Genes (only a partial listing)
        • Ribosomal RNA (rRNA) Genes for RNAs found in the ribosome
          • Cleaved from a single RNA molecule and the gene for this large molecule is Tandemly Repeated
        • Transfer RNA (tRNA) Genes for binding to Amino Acids
        • Telomerase RNA - part of complex responsible for building telomeres
        • RNA Genes for RNAs involved in gene regulation
          • Antisense RNA (aRNA) Genes that bind to mRNAs
          • MicroRNA (miRNA) Genes
            • average only 19-25 bp, in eukaryotic cells only, and are post-transcriptional regulators that bind to complementary sequences on mRNA
            • less than 100% complementarity so they can bind to multiple transcripts (gene silencing)
            • human genome has about 1000 miRNAs
          • Small Interfering RNA (siRNA) Genes that help regulate protein production in most eukaryotes through the RNA interference (RNAi) pathway (21-22 bp)
            • 100 % complementarity and so target specific genes
            • usually act by cleaving mRNA (gene silencing) and may be important immunity genes in organisms without cell-mediated immunity (plants, non-mammal animals)
          • Long NC RNA Genes are regulatory but act in multiple ways
          • Several RNA species that seem to bind to invasive nucleic acids (piRNA against transposon activity is an example)
        • untranslated small RNA's
  • Genome Structure
    • Genomes are still being explored and much data is only preliminary
      • Bioinformatics is a series of mathematical tools and programs used in the exploration
      • Annotation is the process of identifying gene sequences
    • Genome size varies greatly
      • Prokaryote genomes usually less than 10 Mbp (mega base pairs or 1 million base pairs) but eukaryotic genome sizes can be over 100 billion base pairs (100 Gbp)
    • Synteny is the order of the loci on chromosomes (actually means that the same loci should be found on homologous chromosomes)
      • Synteny is altered when translocations and duplications alter genes
    • Single Copy sequences
      • Single copy sequences include both Protein-Coding Genes and Non Protein-Coding Genes (some small amount of repetition may occur here)
    • Repetitive DNA sequences
      • rDNA Tandem Repeats (ITS = Internal Transcribed Spacer), multiple copies of tRNA genes
      • Microsatellites (also called simple sequence repeats or SSRs) - short sequences of 2-8 base pairs repeated in tandem (the number of repeats is variable), used as markers to map an allele
      • Autonomously Replicating Sequences (large diverse group, only some discussed here)
        • Transposons
          • lots of them; some move as DNA and some move through an RNA intermediate
          • Can cause mutations by inserting into a gene or a gene's control regions
          • Retrotransposons are most common and code for several proteins (including Reverse Transcriptase or RNA-dependent DNA Polymerase)
        • LINES (long interspersed nuclear elements) - repeated sequences roughly 1000 to 5000 bp in length, present in up to millions of copies
          • Some have both Reverse Transcriptase and Integrase genes and can replicate like any other transposon and some have lost that function and no longer replicate themselves
          • Differ from transposons in that they do not move from place to place in the genome but make copies of themselves and, when the copies integrate, the genome size increases
          • Most copies lose functionality, so they do not copy themselves
          • Evidence for evolution of a balance between LINE expansion and the negative effect of so much unproductive DNA is scarce
          • There is evidence that some genes resulted from integration of a LINE (or a SINE) into a pre-existing gene, so LINES and SINES may have a role in producing genes with novel functions
        • SINES (shorter than LINES) are less than 500 bp long and can occur in millions of copies in some genomes
          • SINES never code for RTase and so only spread in the genome when other transposons or LINES provide the means to do so
          • ALU elements are the most common repeat in humans - they contain an AluI restriction enzyme site (hence the name) and were derived from an RNA that functions in the signal-recognition particle (part of the mechanism for targeting mRNAs to the endoplasmic reticulum)
          • ALU insertions are implicated in some human diseases
        • Latent Viruses
          • It's a continuum from viruses that never integrate and always have protein coats to latent viruses and virus-like sequences that replicate
    • Gene families (based on protein families)
      • A set of related loci formed by local gene duplications
      • Usually have similar functions (LDH family, Hemoglobin family)
    • Pseudogenes  are genes that have lost function through mutation
      • may be important in gains of novel functions as they can accumulate mutations while not functional and eventually mutate back to functionality
      • many gene families have related pseudogenes (Hemoglobin family)

Mutations

Point / Indel Mutations and Third Codon Degeneracy

  • (Transversions, Transitions, Synonymous and Non-Synonymous, Frame Shift, Sequence Termination by creation of Stop Codon)
    • INDEL  (insertion or deletion) mutations often mean loss or alteration of function
    • Single-Nucleotide Polymorphisms (SNPs) are useful mapping markers and can label individuals
  • Neutral Mutations have no effect on the fitness of the organism
    • Synonymous nucleotide substitutions (usually 3rd position of the codon) are neutral because there is no alteration of the phenotype
    • If a mutation alters a protein, it may do so in such a way as to not alter the protein's function, so this would also be a neutral mutation
    • Neutrality is probably most common outcome
  • Deleterious Mutations decrease fitness
  • Pleiotropic effects - a gene that affects more than one phenotypic trait (eye color mutations of Drosophila exhibit pleiotropy)
  • Beneficial mutations occur at a low rate but this is expected in Darwin's gradualist view of evolution
  • Epistasis is when two or more loci interact in their effects on a phenotype
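
The categories of point mutation listed above can be made concrete with a small classifier. This is only a sketch: the function names are hypothetical, and the codon table is a deliberately partial excerpt of the standard genetic code (DNA coding-strand codons), covering just the codons used in the examples.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Partial codon table (assumption: standard genetic code, coding-strand DNA).
CODON_TABLE = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",  # four-fold degenerate
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
    "TAT": "Tyr", "TAC": "Tyr", "TAA": "Stop", "TAG": "Stop",
}


def substitution_type(old_base, new_base):
    """Transition = purine<->purine or pyrimidine<->pyrimidine; otherwise transversion."""
    if old_base == new_base:
        raise ValueError("no substitution: bases are identical")
    pair = {old_base, new_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def classify_point_mutation(codon, position, new_base):
    """Return (mutant codon, substitution type, effect on the protein).

    position is 0, 1, or 2 (the third codon position is index 2).
    """
    mutant = codon[:position] + new_base + codon[position + 1:]
    kind = substitution_type(codon[position], new_base)
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        effect = "synonymous"
    elif after == "Stop":
        effect = "nonsense (new stop codon)"
    else:
        effect = "non-synonymous"
    return mutant, kind, effect


print(classify_point_mutation("GGT", 2, "C"))  # third-position change in a Gly codon
print(classify_point_mutation("GAT", 2, "A"))  # Asp codon changed at the third position
print(classify_point_mutation("TAT", 2, "A"))  # Tyr codon changed to a stop codon
```

Note that third-position changes are not always synonymous: the Asp and Tyr examples both alter the protein, which is why the notes say "usually" rather than "always."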

Recombination Mutations

  • Gene conversion using one allele to alter the other allele so that it is identical to the first (the second has been converted to the first)
    • happens during crossing-over and is caused by mismatch repair of incorrectly paired DNA strands
  • Intragenic Recombination - crossing-over of short sequences within a larger gene sequence
  • Unequal Crossing Over
    • causes one DNA strand to lose a section and the other to gain a section
    • the DNA strand that gained can then have a Gene Duplication
    • Most common where there already is a tandem repeat (misalignment of repeats)
    • Probably responsible for much of the amplification of LINES and SINES
    • Transposons and other repeated sequences can cause Rearrangements if paralogous copies align during synapsis

Forward and Back Mutations

  • These terms are "pre-sequencing" when all gene mutations were detected by changes in phenotype
  • Forward mutations are more common than back (many things can change an allele but only a very few specific changes will restore it)

Variation in gene structure vs variation in gene expression

  • variation in both leads to phenotypic variation

Mutation rate (usually point mutations)

Estimated in lab experiment or by comparison of orthologs in different lineages (usually different species)

  • Need to reduce the chance that natural selection has altered the rate and measure just the changes caused by random error
    • Concentrate on pseudogenes, other untranscribed sequences, and four-fold degenerate sites
  • Key to this is to understand that:
    • Each substitution produces two alleles at that site
    • One is very common (wild type) and the other is represented by one copy in the population or species
    • The rate at which new neutral mutations go on to replace the original allele is, if only random chance is involved, equal to the rate at which mutations arise (more on this when we cover genetic drift)
  • So, if we count the number of replacements, we can estimate the mutation rate
    • There is a need to correct for the chance that, at any site that mutated, a second mutation occurred that changed it to another base or restored the original base
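
The reasoning in this list can be written out as a short derivation. It is a sketch under the usual neutral-model assumptions. The symbols are introduced here for illustration and do not appear in the notes: N is the diploid population size, μ the mutation rate per site per generation, k the rate at which new alleles replace old ones, p the observed proportion of sites that differ between two sequences, and d the estimated number of substitutions per site. The last line is the Jukes-Cantor correction, one common way (assumed here, not named in the notes) to account for a second change at a site that has already mutated.

```latex
\begin{align*}
\text{new mutations per site per generation} &= 2N\mu \\
\Pr(\text{a given new neutral mutation eventually replaces the original allele}) &= \frac{1}{2N} \\
k &= 2N\mu \cdot \frac{1}{2N} = \mu \\
d &= -\frac{3}{4}\ln\!\left(1 - \frac{4}{3}\,p\right)
\end{align*}
```

Because the population size cancels, counting replacements at sites free of selection estimates the mutation rate directly.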

Empirical measurements indicate that mutation rates vary among lineages, loci, and even within loci

  • Remember that time here is difficult to compare between different organisms
    • Different generation times mean that, for a given number of years, some lineages will have more opportunities to mutate (which happens when the genomes are replicated)
  • Only want to consider germ line cells, not somatic cells

That said, the variation in rate (per base pair per replication) is not so large (probably due to the similarity of the replication process in all organisms)

  • chance of mutating is about $0.3 \times 10^{-10}$ to $6 \times 10^{-10}$ per base pair per replication
  • If we assume a rate of $0.3 \times 10^{-10}$ (the low end of that range) for our genome, then each time we replicate it ($7 \times 10^{9}$ base pairs in our diploid genome) we expect only about 0.2 new mutations, or roughly a 1 in 5 chance of getting any changes at all
  • But, from the zygote to the egg or sperm, there are at least 100 replications, and so we can expect at least 20 mutations in each gamete
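
The arithmetic behind these figures can be checked in a few lines. This sketch uses the low end of the quoted range (0.3 x 10^-10 per base pair per replication), which is the value that gives roughly 0.2 new mutations per replication and about 20 per gamete. The Poisson step for "at least one mutation" is an added assumption, not something stated in the notes.

```python
import math


def expected_mutations(rate_per_bp, genome_bp, replications=1):
    """Expected number of new mutations over a number of genome replications."""
    return rate_per_bp * genome_bp * replications


rate = 0.3e-10      # per base pair per replication (low end of the quoted range)
genome = 7e9        # base pairs in the diploid genome, as given in the notes
germline_reps = 100 # replications from zygote to gamete (the notes say "at least 100")

per_replication = expected_mutations(rate, genome)
# Assumption: mutations are rare, independent events, so their count is roughly Poisson.
p_at_least_one = 1 - math.exp(-per_replication)
per_gamete = expected_mutations(rate, genome, germline_reps)

print(f"expected new mutations per replication: {per_replication:.2f}")
print(f"chance of at least one per replication: {p_at_least_one:.2f}")
print(f"expected new mutations per gamete:      {per_gamete:.0f}")
```

Rerunning with a rate of 3 x 10^-10 gives about 2 mutations per replication rather than 0.2, so the "1 in 5" figure only follows from the lower rate.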

Mutagens are chemicals that alter the chance of mutations (always increase the chance), which is why we try to restrict releasing some chemicals into our environment

Chromosomal Variation

  • Ploidy Changes
    • Aneuploidy, Polyploidy (Allopolyploidy, Autopolyploidy)
  • Inversions
  • Translocations
  • Fission and Fusion
  • Karyotype Variation

Figure 1 - Karyotypes of 36 strains of an asexual yeast, Candida sonorensis, showing the sorts of extreme karyotype variation found with asexual "species."   It is often hard to be sure that a yeast is truly asexual but it is hard to see how synapsis during meiosis could be achieved between some of the strains of this yeast.  There are no known phenotypic differences between the strains listed below.  This raises two related questions.  First, are we missing important parts of the phenotype?  If not, where did the extra DNA in the larger genomes come from and what does it do?

Mutation, Variation, and Randomness

Variation is ultimately the outcome of mutation

Mutation is a random process

  • Thus, mutations do not happen when they will help an organism - there is no "directed mutation"
    • However, development (and other constraints) may mean that not all conceivable phenotypes are possible
  • Recent challenges to this assertion have been shown to be wrong and have reinforced the assumption of randomness in mutations

Mutation rate is not a random effect

  • Different lineages have different rates of synonymous mutation
  • Different regions of the genome have different mutation rates
  • Environmental conditions can alter mutation rate

Although mutations are random, variation is the outcome of many mutations and is predictable!

Natural selection (in cases where it favors a single allele or combination of alleles), genetic drift, and inbreeding all work to reduce variation in a population, which can only be replenished by mutation or by migration bringing in new alleles

Problem:  When lab populations of animals are subject to artificial selection, the most common response is that the character being selected changes in the direction encouraged by selection.  So, one can reduce the number of facets (individual visual cells) in the eye of a Drosophila by selecting for this by only allowing flies with the fewest facets to contribute eggs to the next generation.  Wim Scharloo did a laboratory selection experiment with a population of 1000 flies.  After several generations of selection during which the character selected changed, change stopped.  Prof. Scharloo then made one change in the experiment.  He increased the population size from 1,000 to 10,000.  Selection almost immediately became effective again and the character continued to change.  Why did the expansion of the population restart the evolutionary process?

Mutation and Fitness

  • Neutral mutations are those that do not affect fitness (no matter why)
  • Pleiotropy
    • when the output from a particular locus affects more than one character, the gene's effects are said to be pleiotropic
    • a mutation's effect must be assessed for all characters affected by that locus as the mutation's effects may differ by character affected
  • For those mutations that increase or decrease fitness, remember that fitness is not the property of an allele, but the outcome of an allele in a particular environment (both physical and biological environment)
    • A mutation that alters coloration of prey is only as important as the risk of being eaten
  • It is assumed that it is easier to harm a complex machine by randomly changing its parts than to improve it by random change (mutation), so harmful mutations are expected to be more common than beneficial mutations
    • Text example of bacterial evolution in which 1 in 150 mutations were beneficial (and the average fitness increase was 3%) is surprising for how many beneficial mutations occurred and for how beneficial they were

Phenotypic Variation

This is an extensive subject and all we will do here is to point out some basic relationships between genetic variation and phenotypic variation

Phenotypic variation is the degree of differences between the physical characteristics of related organisms

Sources of Phenotypic Variation

  • Genetic Variation - discussed previously
  • Environmental Variation - differences among individuals due to the influence of their environment (including their biotic environment)
    • Usually measured by measuring phenotypic differences when the genotype is held constant
  • Developmental Noise - the differences in individuals of the same genotype raised under identical environmental conditions
  • Maternal Effects - these are differences caused by non-genetic influence of the mother on her offspring
    • Variation among ova (not DNA, but differences in the stocking of the egg with energy and food resources, specific proteins and RNAs)
    • Variation in mother's condition when producing eggs or carrying offspring
    • Variation in maternal care (this can be due to father as well)
  • Epigenetic Inheritance - differences in genetic expression of a locus not based on sequence differences among alleles
    • Liver cells, in culture, undergo mitosis but produce only liver cells - they do not revert to zygote or stem cell status
    • Genetic imprinting - dealt with in Evo-Devo chapter

Describing Phenotypic Variation - the measure used is Variance (from statistics) in a character, which is a measure of the deviation of individuals from the mean character value (assumes one can use numbers to measure the character)

  • At the simplest level, the variance in a trait within a population or species can be divided into two additive portions:

Phenotypic Variation ($V_P$) = Genetic Variation ($V_G$) + Environmental Variation ($V_E$)

  • Phenotypic plasticity - a single genotype often can produce more than one phenotype if the environment in which the organism develops changes - this is the Reaction Norm of that genotype (all possible phenotypes from a single genotype)
    • In this case (which may be the usual case), then we must alter the partitioning of phenotypic variance:

Phenotypic Variation ($V_P$) = ($V_G$) + ($V_E$) + Genotype x Environment Interaction ($V_{G \times E}$)

Recombination and Variation

Parasexual recombination (conjugation, transduction, transformation)

  • Horizontal versus Vertical Transmission
  • Recombination at the molecular level
  • Homologous and Non-homologous

Sexual recombination

  • combinations of genes are not preserved unless the genes are closely linked (no linkage is ever tight enough to completely prevent recombination)
  • Recombination can be intergenic
  • Recombination produces new combinations of genes each generation
  • To preserve favorable combination of genes, some other process must operate (positive assortative mating is one possibility)

Linkage

  • Physical linkage means that the loci are close enough on a chromosome that they are likely to be inherited together
  • If two loci each have two alleles in a population and the proportion of each allele is 0.50, then unlinked genes should be in Linkage Equilibrium
    • in this case, 25% AB, 25% Ab, 25% aB, and 25% ab,
  • Linkage Disequilibrium is a significant departure from the proportions expected from linkage equilibrium
    • In the case above, if Ab is one chromosomal type in the population and aB is the other (and no recombination occurs because the linkage is so tight) you get 50% Ab, and 50% aB (no recombinant allele pairings [AB or ab] are formed)
    • Linkage disequilibrium, then, is a measure of the inhibition of recombination and indicates some evolutionary process may be affecting the outcome of recombination (assortative mating, selection, etc.) in addition to simple physical linkage
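
The two cases above (25% of each type versus 50% Ab and 50% aB) can be compared with the standard disequilibrium coefficient D, the difference between the observed frequency of AB and the frequency expected if the loci were independent. The notes do not name D, so treat this as an illustrative sketch. The decay function assumes random mating and a recombination fraction r between the loci.

```python
def linkage_disequilibrium(hap):
    """D = freq(AB) - freq(A) * freq(B); zero at linkage equilibrium.

    hap: dict of chromosome-type frequencies with keys 'AB', 'Ab', 'aB', 'ab'.
    """
    p_A = hap["AB"] + hap["Ab"]
    p_B = hap["AB"] + hap["aB"]
    return hap["AB"] - p_A * p_B


def d_after(d0, r, generations):
    """Expected D after some generations of random mating with recombination fraction r."""
    return d0 * (1 - r) ** generations


equilibrium = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
tight_linkage = {"AB": 0.0, "Ab": 0.5, "aB": 0.5, "ab": 0.0}

print(linkage_disequilibrium(equilibrium))    # equilibrium case from the notes
print(linkage_disequilibrium(tight_linkage))  # 50% Ab, 50% aB case from the notes
print(d_after(-0.25, 0.5, 1))                 # unlinked loci: D halves each generation
```

With no recombination (r = 0) the disequilibrium never decays, which is the situation described in the tight-linkage example.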

Hitchhiking - when one allele is changing frequencies due to selection (for or against), neighboring alleles may also change if closely linked

Hardy-Weinberg

Variation is a population-level phenomenon (emergent property of populations) and a necessary condition for evolution

What should we expect to happen over time when variation exists in a population?

Hardy-Weinberg expectations are predictions of future population variation when that variation is not altered by ecological or statistical processes

$p^2 + 2pq + q^2 = 1$
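
For a locus with two alleles at frequencies p and q = 1 - p, the three terms of the Hardy-Weinberg expression are the expected frequencies of the AA, Aa, and aa genotypes. The sketch below (function names are hypothetical) computes them and confirms that the allele frequency recovered from those genotypes is unchanged, which is the "no change" prediction.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for allele frequency p (with q = 1 - p)."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def allele_frequency(genotypes):
    """Frequency of allele A: all of AA plus half of the heterozygotes."""
    return genotypes["AA"] + 0.5 * genotypes["Aa"]


expected = hardy_weinberg(0.7)
print(expected)                    # genotype frequencies for p = 0.7
print(sum(expected.values()))      # the three terms sum to 1 (up to rounding)
print(allele_frequency(expected))  # p in the next generation is still 0.7 (up to rounding)
```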

H-W Assumptions - Hardy-Weinberg predicts no change but is only accurate if its 5 assumptions are met.  Below we list the assumptions and discuss what happens when the assumption is violated.

No mutation,

Mutations generate differences between generations and upset H-W prediction

No migration,

If populations differ in their genetic composition (maybe A is 90% of the genes at a locus in one population and only 10% in another population), migration between the populations can change their genetic composition

Random Mating,

Assortative mating (also called Non-Random Mating)

  • Positive Assortative Mating - if like mates with like (due to choice or to small population size not allowing much choice) then intermediates and heterozygotes are lost - a decrease in genetic variation
  • Negative Assortative Mating - like mating with unlike will increase the proportion of heterozygous individuals and preserve genetic variation

Inbreeding

  • has the same effect as positive assortative mating - loss of genetic variation
    • two related individuals are more likely to have a rare allele, given that one of them does, than two individuals chosen at random from the entire population, thus rare recessive alleles are more likely to become homozygous in inbred offspring
  • can (not must, but can) lead to lower viability of inbred individuals or to lower fecundity
    • Heterosis - condition where the heterozygous individuals show greater fitness (viability, fecundity) than do individuals homozygous for either of the alleles
    • Inbreeding Depression - loss of fitness due to inbreeding as more and more recessive, less fit alleles are expressed due to inbreeding
  • more likely in small populations
  • often there are physical or behavioral barriers to inbreeding

Large Populations,

Genetic Drift

  • loss of genetic variation due to chance events
  • more likely in small populations than in large
    • Neighborhoods can enhance the effect of drift
    • if populations are subdivided into small neighborhoods, then drift will be more important for the entire population
  • Bottleneck - a sudden low point in populations numbers, followed by expansion of the population
    • Bottlenecks can reduce genetic variation in a generation through genetic drift, even though population numbers are generally high
    • If you come along when the population has recovered its large size, you would think that genetic drift was not important in that population, but a recent bottleneck event might have greatly reduced genetic variation in your study population.

    Founder Effect

    • if new populations are formed by the migration of just a very few individuals, the population can be said to have gone through a bottleneck at its founding
    • founder effect can mean that new populations are different from parent populations through chance alone
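
    Drift, bottlenecks, and founder events can all be explored with a small simulation. This is a minimal sketch assuming Wright-Fisher sampling (each gene copy in the next generation is drawn at random from the current one), a model the notes do not name. The population sizes and the seed are hypothetical values chosen only for illustration.

```python
import random


def simulate_drift(pop_sizes, p0, seed=0):
    """Follow one allele's frequency under chance alone.

    pop_sizes: diploid population size in each successive generation
    p0: starting frequency of the allele
    Returns the list of frequencies, starting with p0.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    p = p0
    trajectory = [p]
    for n in pop_sizes:
        n_copies = 2 * n  # diploid: two gene copies per individual
        # Each copy in the new generation is sampled from the current gene pool.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
    return trajectory


def heterozygosity(p):
    """Expected proportion of heterozygotes, a simple measure of variation."""
    return 2 * p * (1 - p)


# Hypothetical bottleneck: large population, three generations of 5 individuals, then recovery.
sizes = [500] * 5 + [5] * 3 + [500] * 5
trajectory = simulate_drift(sizes, p0=0.5, seed=1)
for generation, p in enumerate(trajectory):
    print(generation, round(p, 3), round(heterozygosity(p), 3))
```

    Running this with different seeds shows the point made above: frequencies barely move while the population is large, jump during the bottleneck, and then stay near wherever the bottleneck left them.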
No Selection,

    Natural selection is the outcome of fitness differences between individuals

    • Natural selection requires that there is heritable genetic variation in a population
    • if some of those genetic variants are more fit (better able to survive and reproduce) than others, the fit genetic variants will leave more offspring that also have their "fit" genotypes
    • as time goes on, more of the population are descended from the more fit individuals
      • An example - Peppered Moth melanic forms favored when trees are darkened, light form when trees are lighter
        • selective factor is mortality due to bird predation
        • melanic gene has other effects, but none are strong enough to explain the population changes seen in England
        • in the US, melanic form has declined even though trees are not becoming lichen covered, so NS by bird predation may not work for all cases of Industrial Melanism
      • The prevalence of resistance to herbicides, insecticides, rat poisons, and antibiotics is also an example of natural selection

    Natural Selection can enhance, reduce, or maintain variability

    • Natural selection can, under the right conditions, favor polymorphism (two or more alleles or phenotypes in a population); this can result in a Balanced Polymorphism if each phenotype has an environment in which it is the most fit form
      • Shell banding in Cepaea (a large land snail) varies with the background and can hide the snail from bird predation
      • populations are made up of different forms, each form with an environment in which it is the fittest
      • natural selection favors more than one phenotype within a single population here
    • Natural selection can have different effects on a population, which we have divided into three "modes of selection."
      • Disruptive (Diversifying)
        • when the extremes are fittest and intermediates are less fit
        • Can split a population into two phenotypes with few intermediate forms
      • Stabilizing
        • when the fittest individuals are the average, then those with more extreme (larger and smaller) phenotypes are less fit and NS will act to reduce the number of individuals with extreme phenotypes
      • Directional
        • when a new, fitter type originates, the population will move from the older type to the newer type over time

    Natural Selection produces Adaptations

    • Adaptations are those characteristics of organisms that allow one organism to be more fit than another
    • Populations adapt to environments as natural selection increases the proportion of individuals that have the most fit adaptation
    • All three modes of selection (disruptive, directional, and stabilizing) will produce adaptation (in the case of disruptive, more than one adaptation).

Variation Within Populations

Are all differences among individuals in a population heritable genetic variation?

Phenotypic Variation ($V_P$) = Environmental ($V_E$) + Genetic ($V_G$)

Genes may have different effects when in different environments

  • many genes are expressed differently when temperature differs
  • expression of many genes depends on genetic environment - what alleles are present at other loci - dominance is a good example of this effect
  • Therefore, we must add a term for gene-by-environment interactions ($V_{G \times E}$)

Phenotypic Variation ($V_P$) = Environmental ($V_E$) + Genetic ($V_G$) + Interaction ($V_{G \times E}$)

Heritability

  • Proportion of phenotypic variation that is due to genetic variation

$h^2 = V_G / (V_G + V_E)$

  • Note that the interaction term is not used and, if significant, makes heritability harder to measure and discuss
  • Often estimated through the slope of the line describing the relationship between the measure of a character in offspring versus the mean of the parents (Midparent Mean)
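
Both routes to heritability mentioned above can be sketched in a few lines: the ratio of variance components, and the slope of offspring values regressed on midparent means. The function names and the data are hypothetical, made up so that the slope is easy to verify by hand. Strictly, the regression slope estimates the additive (narrow-sense) part of the genetic variance, which can be smaller than the total genetic variance in the ratio.

```python
def heritability(v_g, v_e):
    """Proportion of phenotypic variance that is genetic: V_G / (V_G + V_E)."""
    return v_g / (v_g + v_e)


def midparent_slope(midparents, offspring):
    """Least-squares slope of offspring value on midparent mean."""
    n = len(midparents)
    mean_x = sum(midparents) / n
    mean_y = sum(offspring) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(midparents, offspring))
    variance_x = sum((x - mean_x) ** 2 for x in midparents)
    return covariance / variance_x


print(heritability(v_g=3.0, v_e=1.0))

# Hypothetical measurements: four families, one offspring value per family.
midparents = [10.0, 12.0, 14.0, 16.0]
offspring = [10.0, 11.0, 12.0, 13.0]
print(midparent_slope(midparents, offspring))
```

A slope near 1 means offspring closely resemble their parents; a slope near 0 means parental values say little about offspring values.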

Reaction Norms

  • A reaction norm is the change in phenotype produced by a single genotype in different environments
  • This is a way to quantify Gene x Environment interaction
  • often a scattergram with the phenotypic measure as the y-axis and the different environments (or range if the differences are continuous, like temperature differences) on the x-axis
  • each genotype gets its own line and interactions are revealed when lines are not parallel
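
The "not parallel" test can be expressed numerically for the simplest case of two genotypes measured in two environments. This is only a sketch with hypothetical phenotype values: each reaction norm is reduced to its change between environments, and the interaction is the difference between those changes.

```python
def norm_change(phenotype_env1, phenotype_env2):
    """Change in phenotype for one genotype between environment 1 and environment 2."""
    return phenotype_env2 - phenotype_env1


def gxe_interaction(norm_a, norm_b):
    """Difference between two genotypes' reaction norms.

    norm_a, norm_b: (phenotype in environment 1, phenotype in environment 2).
    Zero means the lines are parallel (no genotype x environment interaction).
    """
    return norm_change(*norm_a) - norm_change(*norm_b)


# Hypothetical phenotypes (arbitrary units) in a cool and a warm environment.
print(gxe_interaction((10, 14), (12, 16)))  # both genotypes respond the same way
print(gxe_interaction((10, 14), (12, 13)))  # genotypes respond differently
```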

Variation Between Populations

We have already discussed the geographic relationship among populations (allopatry, peripatry, parapatry - no sympatry for populations of the same species!!) when discussing speciation

  • Subspecies = Geographic Race
  • Clines form between extremes of populations or between parapatric populations

Adaptive Geographic Variation and Gene Flow

  1. AGV adapts a local population to its specific, local habitat
  2. Gene Flow counteracts AGV by homogenizing gene frequencies in a population or between local populations

Countergradient Variation

  • a plant found in both harsh and benign environments grows slowly in the harsh environment and quickly in the benign environment
  • experiment - grow seeds from both populations in the benign environment
    • seeds from the population in the harsh environment grow faster than seeds from the population in the benign environment
    • Environmental variation causes a gradient in growth rate
    • Genetic variation produces a counter-gradient in growth rates due to natural selection for faster growing plants in the harsh environment
    • But, since the environmental effect is larger, one observes that plants grow more slowly in the harsh environment (the difference would be even greater without the genetic countergradient)

Character Displacement

Variation among populations of a species as a result of some populations being sympatric with a related competitor species (or within populations in which gene flow is limited by distance and part of the population is sympatric and part is not)

Character is displaced (=altered) by the effect of competition with the related species for resources, not by a change in overall resource availability

(see book for examples)

F-statistics

  • Variation among individuals in a species can be subdivided into within-population and between-population components
  • FST is a measure of the proportion of variation among individuals at a locus due to differences between populations and it ranges from 0 (no difference in allele frequencies) to 1 (different alleles fixed in each population)
  • There are several ways to calculate and/or estimate this and we will examine one here based on a locus with two alleles only (in all populations)
  • To calculate this, it is necessary to know the frequency of the alleles in each population, from which you can calculate the mean ($\bar{q}$) and the variance in q ($\mathrm{Var}(q)$).

$F_{ST} = \mathrm{Var}(q) / (\bar{q}(1 - \bar{q}))$

  • This equation will be 0 if there is no variation among populations (numerator = 0) and 1 when that variation is as large as the product of the average frequencies of the two alleles ($1 - \bar{q}$ is the frequency of the other allele when only two are present)
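
The calculation described above can be sketched as follows, assuming the usual two-allele form in which FST is the variance in q among populations divided by the product of the mean allele frequencies, q-bar times (1 - q-bar). The function name and the example frequencies are hypothetical, and the populations are weighted equally.

```python
def fst_two_alleles(q_values):
    """F_ST for one locus with two alleles, from the frequency q in each population."""
    n = len(q_values)
    q_bar = sum(q_values) / n
    if q_bar in (0.0, 1.0):
        raise ValueError("F_ST is undefined when one allele is fixed in every population")
    # Variance of q among populations (divide by n: the populations are the whole set).
    var_q = sum((q - q_bar) ** 2 for q in q_values) / n
    return var_q / (q_bar * (1 - q_bar))


print(fst_two_alleles([0.5, 0.5, 0.5]))  # identical populations
print(fst_two_alleles([0.0, 1.0]))       # different alleles fixed in each population
print(fst_two_alleles([0.2, 0.8]))       # intermediate differentiation
```

The first two calls reproduce the limits given in the notes (0 and 1); the third shows an intermediate value.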
Last updated March 1, 2011