KEY TERMS:
- Concerted evolution describes the ability of two related genes to evolve together as though constituting a single locus.
- Coincidental evolution (Coevolution) describes a situation in which two genes evolve together as a single unit.
- Gene conversion is the alteration of one strand of a heteroduplex DNA to make it complementary with the other strand at any position(s) where there were mispaired bases.
- Crossover fixation refers to a possible consequence of unequal crossing-over that allows a mutation in one member of a tandem cluster to spread through the whole cluster (or to be eliminated).
- Unequal crossing-over changes the size of a cluster of tandem repeats.
- Individual repeating units can be eliminated or can spread through the cluster.
The same problem is encountered whenever a gene has been
duplicated. How can selection be imposed to prevent the accumulation of
deleterious mutations?
The duplication of a gene is likely to result in an
immediate relaxation of the evolutionary pressure on its sequence. Now that
there are two identical copies, a change in the sequence of either one will not
deprive the organism of a functional protein, since the original amino acid
sequence continues to be coded by the other copy. Then the selective pressure on
the two genes is diffused, until one of them mutates sufficiently away from its
original function to refocus all the selective pressure on the other.
Immediately following a gene duplication, changes might
accumulate more rapidly in one of the copies, leading eventually to a new
function (or to its disuse in the form of a pseudogene). If a new function
develops, the gene then evolves at the same, slower rate characteristic of the
original function. Probably this is the sort of mechanism responsible for the
separation of functions between embryonic and adult globin genes.
Yet there are instances where duplicated genes retain the
same function, coding for identical or nearly identical proteins. Identical
proteins are coded by the two human α-globin genes, and there is only a single
amino acid difference between the two γ-globin proteins. How is selective
pressure exerted to maintain their sequence identity?
The most obvious possibility is that the two genes do not
actually have identical functions, but differ in some (undetected) property,
such as time or place of expression. Another possibility is that the need for
two copies is quantitative, because neither by itself produces a sufficient
amount of protein.
In more extreme cases of repetition, however, it is
impossible to avoid the conclusion that no single copy of the gene is essential.
When there are many copies of a gene, the immediate effects of mutation in any
one copy must be very slight. The consequences of an individual mutation are
diluted by the large number of copies of the gene that retain the wild-type
sequence. Many mutant copies could accumulate before a lethal effect is
generated.
Lethality becomes quantitative, a conclusion reinforced by
the observation that half of the units of the rDNA cluster of X. laevis
or D. melanogaster can be deleted without ill effect. So how are these
units prevented from gradually accumulating deleterious mutations? And what
chance is there for the rare favorable mutation to display its advantages in the
cluster?
The basic principle of models to explain the maintenance of
identity among repeated copies is to suppose that nonallelic genes are not
independently inherited, but must be continually regenerated from one
of the copies of a preceding generation. In the simplest case of two identical
genes, when a mutation occurs in one copy, either it is by chance eliminated
(because the sequence of the other copy takes over), or it is spread to both
duplicates (because the mutant copy becomes the dominant version). Spreading
exposes a mutation to selection. The result is that the two genes evolve
together as though only a single locus existed. This is called coincidental
evolution or concerted evolution (occasionally coevolution). It can be applied
to a pair of identical genes or (with further assumptions) to a cluster
containing many genes.
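The two-gene case can be made concrete with a minimal sketch. It assumes, purely for illustration, that each homogenization event picks one of the two copies as donor with equal probability and overwrites the other; the function names and the 50:50 choice are assumptions, not part of the model as stated.

```python
import random

def homogenize_pair(copy_a, copy_b, rng):
    """Overwrite one copy with the other; the donor is chosen at random.

    Assumption: both copies are equally likely to act as donor.
    """
    donor = copy_a if rng.random() < 0.5 else copy_b
    return donor, donor

def fate_of_new_mutation(wild_type, mutant, trials, seed=0):
    """Count how often a mutation in one copy is fixed in both or lost."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        first, second = homogenize_pair(mutant, wild_type, rng)
        if first == mutant:
            fixed += 1
    return fixed, trials - fixed

# Hypothetical sequences: the mutant copy differs at the last base.
fixed, lost = fate_of_new_mutation("ATGC", "ATGA", trials=1000)
print(f"fixed in both copies: {fixed}, eliminated: {lost}")
```

Under this assumption the mutation is eliminated in roughly half of the events and spread to both copies in the rest, after which the pair behaves as a single locus.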
One mechanism supposes that the sequences of the nonallelic
genes are directly compared with one another and homogenized by enzymes that
recognize any differences. This can be done by exchanging single strands between
them, to form heteroduplex genes in which one strand derives from one copy and
the other strand from the other copy. Any differences show as improperly paired
bases, which attract attention from enzymes able to excise and replace a base,
so that only A·T and G·C pairs survive.
This type of event is called gene conversion and is
associated with genetic recombination, as described in Chapter 15, Recombination and repair.
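The correction step can be sketched as follows. This is a minimal illustration, assuming the two strands are written so that position i of one faces position i of the other (top strand 5'→3', bottom strand 3'→5') and that repair always retains the base on one chosen strand; `mispaired_positions`, `repair_heteroduplex`, and the `keep` parameter are hypothetical names, and the sequences are invented.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mispaired_positions(top, bottom):
    """Positions at which the facing bases are not an A-T or G-C pair."""
    return [i for i, (x, y) in enumerate(zip(top, bottom))
            if COMPLEMENT[x] != y]

def repair_heteroduplex(top, bottom, keep="top"):
    """Make the two strands fully complementary.

    At each mispaired position the base on the strand named by `keep`
    is retained and the facing base is replaced by its complement.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    top, bottom = list(top), list(bottom)
    for i in mispaired_positions(top, bottom):
        if keep == "top":
            bottom[i] = COMPLEMENT[top[i]]
        else:
            top[i] = COMPLEMENT[bottom[i]]
    return "".join(top), "".join(bottom)

# Top strand from copy 1 (ATGCCA); bottom strand from copy 2, whose
# coding strand ATGTCA differs from copy 1 at position 3.
top_strand = "ATGCCA"
bottom_strand = "TACAGT"
print(mispaired_positions(top_strand, bottom_strand))        # [3]
print(repair_heteroduplex(top_strand, bottom_strand, "top"))
print(repair_heteroduplex(top_strand, bottom_strand, "bottom"))
```

Whichever strand is kept, the repaired duplex carries the sequence of only one of the two original copies at the corrected position, which is the sense in which one gene is converted to the other.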
We should be able to ascertain the scope of such events by
comparing the sequences of duplicate genes. If they are subject to concerted
evolution, we should not see the accumulation of silent site substitutions
between them (because the homogenization process applies to these as well as to
the replacement sites). We know that the extent of the maintenance mechanism
need not extend beyond the gene itself, since there are cases of duplicate genes
whose flanking sequences are entirely different. Indeed, we may see abrupt
boundaries that mark the ends of the sequences that were homogenized.
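One way to look for such a boundary is to scan two aligned copies for the longest stretch in which they are identical. The sketch below assumes the sequences are already aligned without gaps and have the same length; `longest_identical_run` is a hypothetical helper, and the sequences are invented for the example.

```python
def longest_identical_run(seq1, seq2):
    """Return (start, end) of the longest stretch where the aligned
    sequences are identical; end is exclusive."""
    best = (0, 0)
    start = None
    for i, (x, y) in enumerate(zip(seq1, seq2)):
        if x == y:
            if start is None:
                start = i
            if i + 1 - start > best[1] - best[0]:
                best = (start, i + 1)
        else:
            start = None
    return best

# Two hypothetical duplicate genes: identical cores, unrelated flanks.
core = "GATTACAGATTACA"
copy1 = "ACGT" + core + "TTGC"
copy2 = "TGCA" + core + "AACG"
start, end = longest_identical_run(copy1, copy2)
print(start, end)            # 4 18
print(copy1[start:end])      # the shared core
```

A real comparison would need a proper alignment and would have to score silent and replacement sites separately; the point here is only that a homogenized segment shows up as a block of identity with sharp edges.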
We must remember that the existence of such mechanisms can
invalidate the determination of the history of such genes via their divergence,
because the divergence reflects only the time since the last
homogenization/regeneration event, not the original duplication.
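The effect on dating can be illustrated with a simple clock calculation. The sketch assumes that each copy accumulates substitutions at a constant rate r per site per year, so that the divergence between the copies is d ≈ 2rt, and that a homogenization event resets d to zero; the rate and the dates are invented for illustration.

```python
def observed_divergence(years_since_last_homogenization, rate):
    """Divergence between two copies, assuming d = 2 * rate * t."""
    return 2 * rate * years_since_last_homogenization

def apparent_age(divergence, rate):
    """Time of separation implied by a measured divergence."""
    return divergence / (2 * rate)

rate = 1e-9                   # hypothetical substitutions per site per year
duplication_age = 100e6       # hypothetical: duplication 100 million years ago
last_homogenization = 10e6    # hypothetical: last conversion 10 million years ago

d = observed_divergence(last_homogenization, rate)
estimate = apparent_age(d, rate)
print(f"divergence {d:.3f} implies {estimate / 1e6:.0f} million years")
```

The estimate recovers the time since the last homogenization, not the much earlier duplication.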
The crossover fixation model
supposes that an entire cluster is subject to continual rearrangement by the
mechanism of unequal crossing-over. Such events can explain the concerted
evolution of multiple genes if unequal crossing-over causes all the copies to be
regenerated physically from one copy.
Following an unequal crossing-over event of this sort, for example, the
chromosome carrying a triple locus could suffer deletion of one of the genes. Of
the two remaining genes, 1½ represent the sequence
of one of the original copies; only ½ of the
sequence of the other original copy has survived. Any mutation in the first
region now exists in both genes and is subject to selective pressure.
Tandem clustering provides frequent opportunities for
"mispairing" of genes whose sequences are the same, but that lie in different
positions in their clusters. By continually expanding and contracting the number
of units via unequal crossing-over, it is possible for all the units in one
cluster to be derived from a rather small proportion of those in an ancestral
cluster. The variable lengths of the spacers are consistent with the idea that
unequal crossing-over events take place in spacers that are internally
mispaired. This can explain the homogeneity of the genes compared with the
variability of the spacers. The genes are exposed to selection when individual
repeating units are amplified within the cluster; but the spacers are irrelevant
and can accumulate changes.
In a region of nonrepetitive DNA, recombination occurs
between precisely matching points on the two homologous chromosomes, generating
reciprocal recombinants. The basis for this precision is the ability of two
duplex DNA sequences to align exactly. We know that unequal recombination can
occur when there are multiple copies of genes whose exons are related, even
though their flanking and intervening sequences may differ. This happens because
of the mispairing between corresponding exons in nonallelic
genes.
Imagine how much more frequently misalignment must occur in
a tandem cluster of identical or nearly identical repeats. Except at the very
ends of the cluster, the close relationship between successive repeats makes it
impossible even to define the exactly corresponding repeats! This has two
consequences: there is continual adjustment of the size of the cluster; and
there is homogenization of the repeating unit.
Consider a sequence consisting of a repeating unit "ab" with
ends "x" and "y." If we write one chromosome above the other, the exact
alignment between "allelic" sequences would be:

    xababababababababy
    xababababababababy

But probably any sequence ab in one chromosome could pair with
any sequence ab in the other chromosome. In a misalignment
such as:

    xababababababababy
        xababababababababy

the region of pairing is no less stable than in the
perfectly aligned pair, although it is shorter. We do not know very much about
how pairing is initiated prior to recombination, but very likely it starts
between short corresponding regions and then spreads. If it starts within
satellite DNA, it is more likely than not to involve repeating units that do not
have exactly corresponding locations in their clusters.
Now suppose that a recombination event occurs within the
unevenly paired region:

    xabababab×ababababy
        xabab×ababababababy

where "×" indicates the site of the crossover. The recombinants will have
different numbers of repeating units. In one case, the cluster has become
longer; in the other, it has become shorter:

    xababababababababababy
    xababababababy
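The exchange can be sketched in code. In this minimal illustration each cluster is a list of repeating units (the flanking "x" and "y" are omitted), `offset` is the number of repeats by which the second chromosome is displaced, and `site` is the crossover position in the first; these names, and the function `unequal_crossover`, are assumptions made for the example.

```python
def unequal_crossover(top, bottom, offset, site):
    """Reciprocal exchange between two misaligned clusters.

    bottom[j] is paired with top[j + offset]. The crossover falls just
    before top[site], that is, just before bottom[site - offset].
    Returns the two recombinant clusters.
    """
    j = site - offset
    if not (0 <= site <= len(top) and 0 <= j <= len(bottom)):
        raise ValueError("crossover must lie within the paired region")
    recombinant_1 = top[:site] + bottom[j:]
    recombinant_2 = bottom[:j] + top[site:]
    return recombinant_1, recombinant_2

# Eight "ab" repeats on each chromosome, displaced by two repeats.
cluster = ["ab"] * 8
longer, shorter = unequal_crossover(cluster, cluster, offset=2, site=4)
print(len(longer), len(shorter))     # 10 6

# Upper and lower case mark the chromosome of origin of each unit.
r1, r2 = unequal_crossover(list("ABCDE"), list("abcde"), offset=1, site=3)
print("".join(r1), "".join(r2))      # ABCcde abDE
```

One product gains exactly as many repeats as the other loses, and the gain or loss equals the displacement; with no displacement the exchange leaves both clusters at their original size.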
If this type of event is common, clusters of tandem repeats
will undergo continual expansion and contraction. This can cause a particular
repeating unit to spread through the cluster, as illustrated in Figure 4.18. Suppose that the cluster consists initially of a
sequence abcde, where each letter represents a repeating unit. The
different repeating units are closely enough related to one another to mispair
for recombination. Then by a series of unequal recombination events, the size of
the repetitive region increases or decreases, and also one unit spreads to
replace all the others.
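A small simulation shows how repeated unequal exchanges can let one unit take over. It is a sketch under stated assumptions, not a model taken from the text: each event is an exchange between two misaligned copies of the same cluster, one of the two products is followed at random, and arbitrary size limits keep the cluster between 3 and 8 units. The function names, parameters, and limits are all hypothetical.

```python
import random

def unequal_exchange(cluster, rng, max_offset=2, min_size=3, max_size=8):
    """Return one product of an unequal exchange between two copies
    of `cluster` (a list of repeating units)."""
    n = len(cluster)
    offset = rng.randint(1, min(max_offset, n))  # misalignment, in units
    site = rng.randint(offset, n)                # crossover in paired region
    # The product that gains units carries cluster[site - offset:site]
    # twice; the product that loses units lacks that stretch altogether.
    longer = cluster[:site] + cluster[site - offset:]
    shorter = cluster[:site - offset] + cluster[site:]
    product = longer if rng.random() < 0.5 else shorter
    # Assumed size limits stand in for whatever constrains cluster length.
    if min_size <= len(product) <= max_size:
        return product
    return cluster

def simulate(start="abcde", max_steps=10000, seed=1):
    """Apply exchanges until a single unit type remains or the step
    budget runs out; return the final cluster and the steps used."""
    rng = random.Random(seed)
    cluster = list(start)
    for step in range(max_steps):
        if len(set(cluster)) == 1:
            return "".join(cluster), step
        cluster = unequal_exchange(cluster, rng)
    return "".join(cluster), max_steps

final_cluster, steps_taken = simulate()
print(final_cluster, steps_taken)
```

No exchange can create a unit type that was not already present, so the variety of units in the cluster can only stay the same or decrease, while the size of the cluster fluctuates.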
The crossover fixation model predicts that any sequence
of DNA that is not under selective pressure will be taken over by a series of
identical tandem repeats generated in this way (for review see Charlesworth, Sniegowski, and Stephan, 1994). The
critical assumption is that the process of crossover fixation is fairly rapid
relative to mutation, so that new mutations either are eliminated (their repeats
are lost) or come to take over the entire cluster. In the case of the rDNA
cluster, of course, a further factor is imposed by selection for an effective
transcribed sequence.