KEY TERMS:
- A neutral mutation has no significant effect on evolutionary fitness and usually has no effect on the phenotype.
- Random drift describes the chance fluctuation (without selective pressure) of the levels of two alleles in a population.
- Fixation is the process by which a new allele replaces the allele that was previously predominant in a population.
- Divergence is the percent difference in nucleotide sequence between two related DNA sequences or in amino acid sequences between two proteins.
- Replacement sites in a gene are those at which mutations alter the amino acid that is coded.
- A silent site in a coding region is one where mutation does not change the sequence of the protein.
- The evolutionary clock is defined by the rate at which mutations accumulate in a given gene.
- The sequences of homologous genes in different species vary at replacement sites (where mutation causes amino acid substitutions) and silent sites (where mutation does not affect the protein sequence).
- Mutations accumulate at silent sites ~10× faster than at replacement sites.
- The evolutionary divergence between two proteins is measured by the percent of positions at which the corresponding amino acids are different.
- Mutations accumulate at a more or less even speed after genes separate, so that the divergence between any pair of globin sequences is proportional to the time since their genes separated.
Most changes in protein sequences occur by small mutations
that accumulate slowly with time. Point mutations and small insertions and
deletions occur by chance, probably with more or less equal probability in all
regions of the genome, except for hotspots at which mutations occur much more
frequently. Most mutations that change the amino acid sequence are deleterious
and will be eliminated by natural selection.
Few mutations are advantageous, but when a rare one occurs,
it is likely to spread through the population, eventually replacing the former
sequence. When a new variant replaces the previous version of the gene, it is
said to have become fixed in the population.
A contentious issue is what proportion of mutational changes
in an amino acid sequence are neutral, that is,
without any effect on the function of the protein, and able therefore to accrue
as the result of random drift and fixation.
The rate at which mutational changes accumulate is a
characteristic of each protein, presumably depending at least in part on its
flexibility with regard to change. Within a species, a protein evolves by
mutational substitution, followed by elimination or fixation within the single
breeding pool. Remember that when we scrutinize the gene pool of a species, we
see only the variants that have survived. When multiple variants are present,
they may be stable (because none has a selective advantage) or one may in
fact be transient because it is in the process of being displaced.
When a species separates into two new species, each now
constitutes an independent pool for evolution. By comparing the corresponding
proteins in two species, we see the differences that have accumulated between
them since the time when their ancestors ceased to interbreed. Some
proteins are highly conserved, showing little or no change from species to
species. This indicates that almost any change is deleterious and therefore
selected against.
The difference between two proteins is expressed as their
divergence, the percent of positions at which the
amino acids are different. The divergence between proteins can be different from
the divergence between the corresponding nucleic acid sequences. The source of
this difference is the representation of each amino acid in a three-base codon,
in which often the third base has no effect on the meaning.
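The definition of divergence can be made concrete with a minimal sketch. It assumes the two sequences are already aligned to the same length, with no gaps; the function name `percent_divergence` and the toy sequences are hypothetical, chosen only for illustration.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * differences / len(seq_a)


# Toy example (not real globin data): ATG GGT and ATG GGC both code Met-Gly,
# so the DNA differs at 1 of 6 positions while the protein does not differ.
dna_divergence = percent_divergence("ATGGGT", "ATGGGC")
protein_divergence = percent_divergence("MG", "MG")
print(round(dna_divergence, 1), protein_divergence)
```

The toy case shows why the two measures can disagree: a third-base change registers in the nucleic acid comparison but not in the protein comparison.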
We may divide the nucleotide sequence of a coding region
into potential replacement sites and silent sites:
- At replacement sites, a mutation alters the amino acid that is coded. The effect of the mutation (deleterious, neutral, or advantageous) depends on the result of the amino acid replacement.
- At silent sites, mutation only substitutes one synonym codon for another, so there is no change in the protein.
Usually the replacement sites account for 75% of a coding sequence and the silent sites provide 25%.
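One way to picture the division into replacement and silent sites is to ask, for each position of a codon, what fraction of the three possible single-base changes leaves the amino acid unchanged. The sketch below does this with the standard genetic code. It is an illustrative counting scheme under stated assumptions (changes to or from a stop codon are counted as replacements), not necessarily the method used to obtain the figures quoted in this section; the names `silent_sites` and `replacement_sites` are hypothetical.

```python
# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def silent_sites(codon: str) -> float:
    """Number of silent sites in a codon (between 0 and 3).

    Each position contributes the fraction of its three possible
    single-base changes that leave the coded amino acid unchanged.
    """
    total = 0.0
    for pos in range(3):
        synonymous = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                synonymous += 1
        total += synonymous / 3
    return total


def replacement_sites(codon: str) -> float:
    """Number of replacement sites: whatever is not silent."""
    return 3.0 - silent_sites(codon)


for codon in ("GGT", "ATG", "TTA"):
    print(codon, CODON_TABLE[codon], silent_sites(codon), replacement_sites(codon))
```

In this scheme a glycine codon such as GGT has one fully silent site (the third base), the methionine codon ATG has none, and a leucine codon such as TTA has fractional silent sites at both its first and third positions.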
In addition to the coding sequence, a gene contains
nontranslated regions. Here again, mutations are potentially neutral, apart from
their effects on either secondary structure or (usually rather short) regulatory
signals.
Although silent mutations are neutral with regard to the
protein, they could affect gene expression via the sequence change in RNA. For
example, a change in secondary structure might influence transcription,
processing, or translation. Another possibility is that a change in synonym
codons calls for a different tRNA to respond, influencing the efficiency of
translation.
The mutations in replacement sites should correspond with
the amino acid divergence (determined by the percent of changes in the protein
sequence). A nucleic acid divergence of 0.45% at replacement sites corresponds
to an amino acid divergence of 1% (assuming that the average number of
replacement sites per codon is 2.25). Actually, the measured divergence
underestimates the differences that have occurred during evolution, because of
the occurrence of multiple events at one codon. Usually a correction is made for
this.
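The conversion between the two measures is a simple scaling, sketched below. It uses the average of 2.25 replacement sites per codon given in the text and holds only while divergence is low enough that a codon rarely carries more than one change; it does not apply any correction for multiple events. The function name is hypothetical.

```python
REPLACEMENT_SITES_PER_CODON = 2.25  # average assumed in the text


def amino_acid_divergence(replacement_site_divergence_pct: float,
                          sites_per_codon: float = REPLACEMENT_SITES_PER_CODON) -> float:
    """Approximate amino acid divergence (%) implied by a given
    nucleotide divergence (%) at replacement sites.

    Assumes at most one change per codon, so each changed replacement
    site alters a different amino acid.
    """
    return replacement_site_divergence_pct * sites_per_codon


# 0.45% at replacement sites corresponds to roughly 1% in the protein.
print(amino_acid_divergence(0.45))
```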
To take the example of the human β- and δ-globin chains,
there are 10 differences in 146 residues, a divergence of 6.8%. The DNA sequence
has 31 changes in 441 nucleotides. However, these changes are distributed very
differently in the replacement and silent sites. There are 11 changes in the 330
replacement sites, but 20 changes in only 111 silent sites. This gives
(corrected) rates of divergence of 3.7% in the replacement sites and 32% in the
silent sites, a difference of almost an order of magnitude.
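The counts for the β- and δ-globin comparison can be turned into uncorrected divergences with a few lines of arithmetic. This sketch computes only the raw percentages; the corrected values of 3.7% and 32% quoted in the text are higher because they allow for multiple events at one site, and the correction method is not specified here, so it is not reproduced. The function name `raw_divergence` is hypothetical.

```python
def raw_divergence(changes: int, sites: int) -> float:
    """Uncorrected divergence: changed sites as a percent of all sites."""
    return 100.0 * changes / sites


protein = raw_divergence(10, 146)      # amino acid differences
replacement = raw_divergence(11, 330)  # changes at replacement sites
silent = raw_divergence(20, 111)       # changes at silent sites

print(f"protein:     {protein:.2f}%")
print(f"replacement: {replacement:.2f}%")
print(f"silent:      {silent:.2f}%")
print(f"silent / replacement: {silent / replacement:.1f}")
```

Even before correction, the silent sites are several times more divergent than the replacement sites; the correction widens the gap because it adds more hidden changes to the more heavily mutated silent sites.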
The striking difference in the divergence of replacement and
silent sites demonstrates the existence of much greater constraints on
nucleotide positions that influence protein constitution relative to those that
do not. So probably very few of the amino acid changes are neutral.
Suppose we take the rate of mutation at silent sites to
indicate the underlying rate of mutational fixation (this assumes that there is
no selection at all at the silent sites). Then over the period since the β and δ genes diverged,
there should have been changes at 32% of the 330 replacement sites, a total of
105. All but 11 of them have been eliminated, which means that ~90% of the
mutations did not survive.
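The reasoning in this paragraph can be checked with a short calculation. The sketch assumes, as the text does, that silent sites are free of selection, so their corrected divergence (32%) stands in for the underlying rate of mutational fixation; the function name is hypothetical.

```python
def expected_and_eliminated(silent_divergence_pct: float,
                            replacement_sites: int,
                            observed_changes: int) -> tuple[float, float]:
    """Return (expected changes at replacement sites, fraction eliminated).

    Expected changes assume replacement sites mutate at the same
    underlying rate as silent sites.
    """
    expected = silent_divergence_pct / 100.0 * replacement_sites
    eliminated = (expected - observed_changes) / expected
    return expected, eliminated


expected, eliminated = expected_and_eliminated(32.0, 330, 11)
print(f"expected changes: {expected:.1f}")       # about 105
print(f"fraction eliminated: {eliminated:.0%}")  # about 90%
```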
The divergence between any pair of globin sequences is (more
or less) proportional to the time since they separated. This provides an evolutionary clock that measures the accumulation of
mutations at an apparently even rate during the evolution of a given
protein.
The rate of divergence can be measured as the percent
difference per million years, or as its reciprocal, the unit evolutionary period
(UEP), the time in millions of years that it takes for 1% divergence to develop.
Once the clock has been established by pairwise comparisons between species
(remembering the practical difficulties in establishing the actual time of
speciation), it can be applied to related genes within a species. From
their divergence, we can calculate how much time has passed since the
duplication that generated them.
By comparing the sequences of homologous genes in different
species, the rate of divergence at both replacement and silent sites can be
determined, as plotted in Figure 4.7.
In pairwise comparisons, there is an average divergence of
10% in the replacement sites of either the α- or
β-globin genes of mammals that have been separated
since the mammalian radiation occurred ~85 million years ago. This corresponds
to a replacement divergence rate of 0.12% per million years.
The rate is steady when the comparison is extended to genes
that diverged in the more distant past. For example, the average replacement
divergence between corresponding mammalian and chicken globin genes is 23%.
Relative to a separation ~270 million years ago, this gives a rate of 0.09% per
million years.
Going further back, we can compare the α- with the β-globin genes
within a species. They have been diverging since the individual gene types
separated 500 million years
ago (see Figure 4.6). They have an average replacement
divergence of ~50%, which gives a rate of 0.1% per million years.
The summary of these data in Figure
4.7 shows that replacement divergence in the globin genes has an average
rate of ~0.096% per million years (or a UEP of 10.4). Considering the
uncertainties in estimating the times at which the species diverged, the results
lend good support to the idea that there is a linear clock.
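The three pairwise comparisons above can be tabulated as a rough check on the clock. The sketch computes each rate directly from the divergence and separation time given in the text, and then the UEP as the reciprocal of the summary rate of 0.096% per million years. The summary rate is taken from the text rather than recomputed, and the dictionary labels and function names are hypothetical.

```python
# (replacement divergence in %, separation time in millions of years)
comparisons = {
    "mammal vs mammal": (10.0, 85.0),
    "mammal vs chicken": (23.0, 270.0),
    "alpha vs beta globin": (50.0, 500.0),
}


def divergence_rate(divergence_pct: float, million_years: float) -> float:
    """Rate of divergence in percent per million years."""
    return divergence_pct / million_years


def unit_evolutionary_period(rate_pct_per_my: float) -> float:
    """UEP: millions of years needed for 1% divergence to develop."""
    return 1.0 / rate_pct_per_my


for label, (divergence, time) in comparisons.items():
    print(f"{label}: {divergence_rate(divergence, time):.2f}% per million years")

# Summary rate quoted in the text, not derived from the three values above.
print(f"UEP: {unit_evolutionary_period(0.096):.1f} million years")
```

The three rates fall within a narrow band around 0.1% per million years, which is the sense in which the clock is said to be linear.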
The data on silent site divergence are much less clear. In
every case, it is evident that the silent site divergence is much greater than
the replacement site divergence, by a factor that varies from 2 to 10. But the
spread of silent site divergences in pairwise comparisons is too great to show
whether a clock is applicable (so we must base temporal comparisons on the
replacement sites).
From Figure 4.7, it is clear that
divergence at silent sites does not increase linearly with time. If we assume that
there must be zero divergence at zero years of separation, we see that the
rate of silent site divergence is much greater for the first ~100 million years
of separation. One interpretation is that roughly half of the
silent sites are rapidly (within 100 million years) saturated by mutations; this
fraction behaves as neutral sites. The other fraction accumulates mutations more
slowly, at a rate approximately the same as that of the replacement sites; this
fraction identifies sites that are silent with regard to the protein, but that
come under selective pressure for some other reason.
Now we can reverse the calculation of divergence rates to
estimate the times since genes within a species have been apart. The difference
between the human β and δ genes is 3.7% for replacement sites. At a UEP of 10.4,
these genes must have diverged 10.4 × 3.7 ≈ 38.5, or roughly 40,
million years ago, about the time of the separation
of the lines leading to New World monkeys, Old World monkeys, great apes, and
man. All of these higher primates have both β and
δ genes, which suggests that the gene divergence
commenced just before this point in evolution.
Proceeding further back, the divergence between the
replacement sites of γ and ε genes is 10%, which corresponds to a time of
separation ~100 million years ago. The separation between embryonic and fetal
globin genes therefore may have just preceded or accompanied the mammalian
radiation.
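Reversing the clock is a single multiplication, sketched here for the two gene pairs discussed above. It assumes the UEP of 10.4 from the text and a strictly linear clock; the function name is hypothetical, and the results are rough estimates given the uncertainties already noted.

```python
UEP_MILLION_YEARS = 10.4  # time for 1% replacement divergence, from the text


def time_since_divergence(replacement_divergence_pct: float,
                          uep: float = UEP_MILLION_YEARS) -> float:
    """Estimated time (millions of years) since two genes separated,
    assuming divergence accumulates linearly at one UEP per 1%."""
    return replacement_divergence_pct * uep


beta_delta = time_since_divergence(3.7)        # about 38.5, roughly 40
gamma_epsilon = time_since_divergence(10.0)    # about 104, roughly 100

print(f"beta/delta:    ~{beta_delta:.1f} million years")
print(f"gamma/epsilon: ~{gamma_epsilon:.0f} million years")
```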
An evolutionary tree for the human globin genes is
constructed in Figure 4.8. Features that evolved before
the mammalian radiation—such as the separation of
β/δ from γ—should be found in all
mammals. Features that evolved afterward—such as
the separation of β- and δ-globin genes—should be
found in individual lines of mammals.
In each species, there have been comparatively recent
changes in the structures of the clusters, since we see differences in gene
number (one adult β-globin gene in man, two in
mouse) or in type (most often concerning whether there are separate embryonic
and fetal genes).
When sufficient data have been collected on the sequences of
a particular gene, the arguments can be reversed, and comparisons between genes
in different species can be used to assess taxonomic relationships.