KEY TERMS:
- Nonallelic genes are two (or more) copies of the same gene that are present at different locations in the genome (contrasted with alleles which are copies of the same gene derived from different parents and present at the same location on the homologous chromosomes).
- Pseudogenes are inactive but stable components of the genome derived by mutation of an ancestral active gene. Usually they are inactive because of mutations that block transcription or translation or both.
KEY CONCEPTS:
- All globin genes are descended by duplication and mutation from an ancestral gene that had three exons.
- The ancestral gene gave rise to myoglobin, leghemoglobin, and α- and β-globins.
- The α- and β-globin genes separated in the period of early vertebrate evolution, after which duplications generated the individual clusters of separate α-like and β-like genes.
- Once a gene has been inactivated by mutation, it may accumulate further mutations and become a pseudogene, which is homologous to the active gene(s) but has no functional role.
The most common type of duplication generates a second copy
of the gene close to the first copy. In some cases, the copies remain
associated, and further duplication may generate a cluster of related genes. The
best characterized example of a gene cluster is presented by the globin genes,
which constitute an ancient gene family, concerned with a function that is
central to the animal kingdom: the transport of oxygen through the
bloodstream.
The major constituent of the red blood cell is the globin
tetramer, associated with its heme (iron-binding) group in the form of
hemoglobin. Functional globin genes in all species have the same general
structure, divided into three exons as shown previously in Figure 2.7. We conclude that all globin genes are derived
from a single ancestral gene; so by tracing the development of individual globin
genes within and between species, we may learn about the mechanisms involved in
the evolution of gene families.
In adult cells, the globin tetramer consists of two
identical α chains and two identical β chains. Embryonic blood cells contain hemoglobin
tetramers that are different from the adult form. Each tetramer contains two
identical α-like chains and two identical β-like chains, each of which is related to the adult
polypeptide and is later replaced by it. This is an example of developmental
control, in which different genes are successively switched on and off to
provide alternative products that fulfill the same function at different
times.
The division of globin chains into α-like and β-like reflects
the organization of the genes. Each type of globin is coded by genes organized
into a single cluster. The structures of the two clusters in the higher primate
genome are illustrated in Figure 4.3.
Stretching over 50 kb, the β
cluster contains five functional genes (ε, two
γ, δ, and β) and one nonfunctional gene (ψβ). The two γ genes differ in their coding sequence in only one
amino acid; the G variant has glycine at position 136, where the A variant has
alanine.
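The comparison between the two γ variants can be illustrated with a minimal sketch that reports the positions at which two aligned protein sequences differ. The function name `differing_positions` is hypothetical, and the two sequences are placeholders built to differ only at position 136; they are not the real γ-globin sequences.

```python
def differing_positions(seq_a, seq_b):
    """Return (1-based position, residue_a, residue_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Placeholder sequences (assumption): "X" stands for any shared residue.
# Only position 136 differs, glycine (G) versus alanine (A), as in the text.
g_gamma = "X" * 135 + "G" + "X" * 10
a_gamma = "X" * 135 + "A" + "X" * 10

print(differing_positions(g_gamma, a_gamma))  # [(136, 'G', 'A')]
```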
The more compact α cluster
extends over 28 kb and includes one active ζ gene,
one nonfunctional ζ gene, two α genes, two nonfunctional
α genes, and the θ gene of unknown
function. The two α genes code for the same
protein. Two (or more) identical genes present on the same chromosome are
described as nonallelic copies.
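The gene content of the two clusters, as described above, can be recorded in a simple data structure and tallied by status. This is only an illustrative sketch: the labels (such as "psi-beta") and the helper `summarize` are assumptions introduced here, and the list order is not meant to represent the order of the genes on the chromosome.

```python
from collections import Counter

# Gene inventories taken from the description in the text.
# Labels are illustrative; list order does NOT imply chromosomal order.
BETA_CLUSTER = [
    ("epsilon", "functional"),
    ("G-gamma", "functional"),
    ("A-gamma", "functional"),
    ("delta", "functional"),
    ("beta", "functional"),
    ("psi-beta", "pseudogene"),
]

ALPHA_CLUSTER = [
    ("zeta", "functional"),
    ("psi-zeta", "pseudogene"),
    ("alpha", "functional"),
    ("alpha", "functional"),
    ("psi-alpha", "pseudogene"),
    ("psi-alpha", "pseudogene"),
    ("theta", "unknown"),  # function unknown, per the text
]


def summarize(cluster):
    """Count the genes in a cluster by status."""
    return Counter(status for _, status in cluster)


print(dict(summarize(BETA_CLUSTER)))   # {'functional': 5, 'pseudogene': 1}
print(dict(summarize(ALPHA_CLUSTER)))  # {'functional': 3, 'pseudogene': 3, 'unknown': 1}
```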
The details of the relationship between embryonic and adult
hemoglobins vary with the organism. The human pathway has three stages:
embryonic, fetal, and adult. The distinction between embryonic and adult is
common to mammals, but the number of pre-adult stages varies. In humans, zeta and
alpha are the two α-like chains. Epsilon, gamma,
delta, and beta are the β-like chains. Figure 4.4 shows how the chains are expressed at different
stages of development.
In the human pathway, ζ is the
first α-like chain to be expressed, but is soon
replaced by α. In the β-pathway, ε and γ are expressed first, with δ and β replacing them
later. In adults, the α₂β₂ form provides 97% of the hemoglobin, α₂δ₂
is ~2%, and ~1% is provided by persistence of the fetal form α₂γ₂.
What is the significance of the differences between
embryonic and adult globins? The embryonic and fetal forms have a higher
affinity for oxygen. This is necessary in order to obtain oxygen from the
mother's blood. This explains why there is no equivalent in, for example, the
chicken, where the embryonic stages occur outside the mother's body (that is, within the
egg).
Functional genes are defined by their expression in RNA, and
ultimately by the proteins for which they code. Nonfunctional genes are defined
as such by their inability to code for proteins; the reasons for inactivity
vary, and the deficiencies may be in transcription or translation (or both).
They are called pseudogenes and given the symbol
ψ.
A similar general organization is found in other vertebrate
globin gene clusters, but details of the types, numbers, and order of genes all
vary, as illustrated in Figure 4.5. Each cluster contains
both embryonic and adult genes. The total lengths of the clusters vary widely.
The longest is found in the goat, where a basic cluster of 4 genes has been
duplicated twice. The distribution of active genes and pseudogenes differs in
each case, illustrating the random nature of the conversion of one copy of a
duplicated gene into the inactive state.
The characterization of these gene clusters makes an
important general point. There may be more members of a gene family, both
functional and nonfunctional, than we would suspect on the basis of protein
analysis. The extra functional genes may represent duplicates that code for
identical polypeptides; or they may be related to known proteins, although
different from them (and presumably expressed only briefly or in low
amounts).
With regard to the question of how much DNA is needed to
code for a particular function, we see that coding for the β-like globins requires a range of 20-120 kb in
different mammals. This is much greater than we would expect just from
scrutinizing the known β-globin proteins or even
considering the individual genes. However, clusters of this type are not common;
most genes are found as individual loci.
From the organization of globin genes in a variety of
species, we should be able to trace the evolution of present globin gene
clusters from a single ancestral globin gene. Our present view of the
evolutionary descent is pictured in Figure 4.6 (for review
see Hardison, 1998).
The leghemoglobin gene of plants, which is related to the
globin genes, may represent the ancestral form. The earliest point to which we can
trace a globin gene in modern form is marked by the single
chain of mammalian myoglobin, which diverged from the globin line of descent
~800 million years ago. The myoglobin gene has the same organization as globin
genes, so we may take the three-exon structure to represent their common
ancestor.
Some "primitive fish" have only a single type of globin
chain, so they must have diverged from the line of evolution before the
ancestral globin gene was duplicated to give rise to the α and β variants. This
appears to have occurred ~500 million years ago, during the evolution of the
bony fish.
The next stage of evolution is represented by the state of
the globin genes in the frog X. laevis, which has two globin clusters.
However, each cluster contains both α and
β genes, of both larval and adult types. The
cluster must therefore have evolved by duplication of a linked α-β pair, followed by
divergence between the individual copies. Later the entire cluster was
duplicated.
The amphibians separated from the mammalian/avian line ~350
million years ago, so the separation of the α- and
β-globin genes must have resulted from a
transposition in the mammalian/avian forerunner after this time. This probably
occurred in the period of early vertebrate evolution. Since there are separate
clusters for α and β
globins in both birds and mammals, the α and β genes must have been physically separated before the
mammals and birds diverged from their common ancestor, an event that
probably occurred ~270 million years ago.
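The dating argument can be laid out as a short sketch. It lists the approximate dates quoted in this section and derives the interval within which the α and β genes must have become physically separated: after the amphibian lineage split off, and before mammals and birds diverged. The dictionary keys and the function `separation_window` are names assumed here for illustration.

```python
# Approximate dates in millions of years ago (Mya), as quoted in the text.
EVENTS_MYA = {
    "myoglobin diverges from the globin line": 800,
    "duplication gives rise to alpha and beta variants": 500,
    "amphibians separate from the mammalian/avian line": 350,
    "mammals and birds diverge": 270,
}


def separation_window(events):
    """Return (older, younger) bounds in Mya for the alpha/beta separation.

    The separation must postdate the amphibian split (frogs still have
    linked alpha and beta genes) and predate the mammal/bird divergence
    (both groups have separate clusters).
    """
    older = events["amphibians separate from the mammalian/avian line"]
    younger = events["mammals and birds diverge"]
    return older, younger


# Print the events from oldest to most recent.
for name, mya in sorted(EVENTS_MYA.items(), key=lambda kv: kv[1], reverse=True):
    print(f"~{mya} Mya: {name}")

print(separation_window(EVENTS_MYA))  # (350, 270)
```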
Changes have occurred within the separate α and β clusters in more
recent times, as we see from the description of the divergence of the individual
genes in 4.4, "Sequence divergence is
the basis for the evolutionary clock".