KEY CONCEPTS:
- Pseudogenes have no coding function, but they can be recognized by sequence similarities with existing functional genes. They arise by the accumulation of mutations in (formerly) functional genes.
Pseudogenes (Ψ) are defined by
their possession of sequences that are related to those of the functional genes,
but that cannot be translated into a functional protein.
Some pseudogenes have the same general structure as
functional genes, with sequences corresponding to exons and introns in the usual
locations. They may have been rendered inactive by mutations that prevent any or
all of the stages of gene expression. The changes can take the form of
abolishing the signals for initiating transcription, preventing splicing at the
exon-intron junctions, or prematurely terminating translation.
Usually a pseudogene has several deleterious mutations.
Presumably once it ceased to be active, there was no impediment to the
accumulation of further mutations. Pseudogenes that represent inactive versions
of currently active genes have been found in many systems, including globin,
immunoglobulins, and histocompatibility antigens, where they are located in the
vicinity of the gene cluster, often interspersed with the active genes.
A typical example is the rabbit pseudogene, Ψβ2, which has the usual
organization of exons and introns, and is related most closely to the functional
globin gene β1. But it is not functional. Figure 4.10 summarizes the many changes that have occurred
in the pseudogene. The deletion of a base pair at codon 20 of Ψβ2 has caused a
frameshift that would lead to termination shortly after. Several point mutations
have changed later codons representing amino acids that are highly conserved in
the β globins. Neither of the two introns any
longer possesses recognizable boundaries with the exons, so probably the introns
could not be spliced out even if the gene were transcribed. However, there are
no transcripts corresponding to the gene, possibly because there have been
changes in the 5′ flanking
region.
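The effect of a single-base deletion on the reading frame can be illustrated with a minimal sketch. The sequence below is a made-up toy, not the rabbit β1 or Ψβ2 sequence, and the helper names first_stop_codon and delete_base are hypothetical. The deletion is placed in codon 4 of the toy rather than codon 20, simply to keep the example short.

```python
# Toy illustration of a frameshift: deleting one base changes every
# downstream codon and can expose a premature stop codon.
# The sequence is invented for this example; it is NOT a real globin gene.

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical 12-codon open reading frame, written codon by codon.
TOY_CODONS = "ATG GTG CAT GCT GTT ACT GCC CTG AGG GGC AAG TAA"
original = TOY_CODONS.replace(" ", "")


def first_stop_codon(seq):
    """Return the 1-based number of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


def delete_base(seq, codon_number, offset=0):
    """Delete one base at the given offset (0, 1, or 2) of a 1-based codon."""
    pos = (codon_number - 1) * 3 + offset
    return seq[:pos] + seq[pos + 1:]


# Remove the first base of codon 4 and compare where translation would stop.
mutant = delete_base(original, codon_number=4)

print("original stops at codon", first_stop_codon(original))
print("mutant stops at codon  ", first_stop_codon(mutant))
```

In the toy sequence the normal stop codon is the last codon, whereas the shifted frame runs into a stop a few codons after the deletion. This is the same kind of outcome the text describes for the deletion at codon 20 of Ψβ2.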
Since this list of defects includes mutations potentially
preventing each stage of gene expression, we have no means of telling which
event originally inactivated this gene. However, from the divergence between the
pseudogene and the functional gene, we can estimate when the pseudogene
originated and when its mutations started to accumulate.
If the pseudogene had become inactive as soon as it was
generated by duplication from β1, we should expect
both replacement site and silent site divergence rates to be the same. (They
will be different only if the gene is translated to create selective pressure on
the replacement sites.) But actually there are fewer replacement site
substitutions than silent site substitutions. This suggests that at first (while
the gene was expressed) there was selection against replacement site
substitution. From the relative extents of substitution in the two types of
site, we can calculate that Ψβ2 diverged from β1 ~55
million years ago, remained a functional gene for 22 million years, but has been
a pseudogene for the last 33 million years.
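The logic of this calculation can be sketched under a deliberately simple model. The model is an assumption made for illustration and is not necessarily the method used to obtain the figures above. It counts substitutions along the pseudogene's own lineage, lets silent sites change at a constant neutral rate, lets replacement sites change at a lower rate while the gene is functional, and lets them change at the neutral rate once it is inactive. It applies no correction for multiple substitutions at a site. The rates and substitution fractions in the example are hypothetical, chosen only so that the result lands on the 55 / 22 / 33 million year timeline. They are not measured values for Ψβ2.

```python
# Sketch of dating a pseudogene's inactivation from silent-site and
# replacement-site substitutions. All numeric inputs are HYPOTHETICAL.


def pseudogene_timeline(k_silent, k_replacement, silent_rate, replacement_rate):
    """Return (years since duplication, years since inactivation).

    Assumed model (an illustration, not taken from the text):
      k_silent      = silent_rate * T
      k_replacement = replacement_rate * (T - t) + silent_rate * t
    where T is the time since duplication and t the time since inactivation.
    Rates are substitutions per site per year; k values are per site.
    """
    if silent_rate <= replacement_rate:
        raise ValueError("model needs silent_rate > replacement_rate")
    t_dup = k_silent / silent_rate
    # Replacement changes beyond what selection would have allowed over T
    # are attributed to the period of neutral (pseudogene) evolution.
    excess = k_replacement - replacement_rate * t_dup
    t_inactive = excess / (silent_rate - replacement_rate)
    return t_dup, t_inactive


# Hypothetical inputs, picked to reproduce the timeline quoted in the text.
t_dup, t_inactive = pseudogene_timeline(
    k_silent=0.275,
    k_replacement=0.187,
    silent_rate=5e-9,
    replacement_rate=1e-9,
)
t_functional = t_dup - t_inactive

print(f"duplicated  ~{t_dup / 1e6:.0f} million years ago")
print(f"functional for ~{t_functional / 1e6:.0f} million years")
print(f"pseudogene for ~{t_inactive / 1e6:.0f} million years")
```

Under this model, a gene that was inactive from the moment of duplication gives equal silent and replacement values, so the estimated time since inactivation equals the time since duplication. Any shortfall in replacement substitutions is read as a period of selection before inactivation.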
Similar calculations can be made for other pseudogenes. Some
appear to have been active for some time before becoming pseudogenes, but others
appear to have been inactive from the very time of their original generation.
The general point made by the structures of these pseudogenes is that each has
evolved independently during the development of the globin gene cluster in each
species. This reinforces the conclusion that the creation of new genes, followed
by their acceptance as functional duplicates, variation to become new functional
genes, or inactivation as pseudogenes, is a continuing process in the gene
cluster. Most gene families have members that are pseudogenes. Usually the
pseudogenes represent a small minority of the total gene number.
The mouse Ψα3 globin gene has an interesting property: it precisely
lacks both introns. Its sequence can be aligned (allowing for accumulated
mutations) with the α-globin mRNA. The apparent
time of inactivation coincides with the original duplication, which suggests
that the original inactivating event was associated with the loss of
introns.
Inactive genomic sequences that resemble the RNA transcript
are called processed pseudogenes. They originate by insertion at some random
site of a product derived from the RNA, following a retrotransposition event, as
discussed in Retroviruses and retroposons. Their characteristic features are
summarized in Figure 17.19.
If pseudogenes are evolutionary dead ends, simply an
unwanted accompaniment to the rearrangement of functional genes, why are they
still present in the genome? Do they fulfill any function or are they entirely
without purpose, in which case there should be no selective pressure for their
retention?
We should remember that we see only those pseudogenes that have
survived in present populations. In past times, any number of other pseudogenes
may have been eliminated. This elimination could occur by deletion of the
sequence as a sudden event or by the accretion of mutations to the point where
the pseudogene can no longer be recognized as a member of its original sequence
family (probably the ultimate fate of any pseudogene that is not suddenly
eliminated).
Even relics of evolution can be duplicated. In the β-globin genes of the goat, there are three adult
species, βA, βB,
and βC (see Figure 4.5).
Each of these has a pseudogene a few kb upstream of it. The pseudogenes are
more closely related to each other than to the adult β-globin genes; in particular, they share several
inactivating mutations. Also, the adult β-globin
genes are more closely related to each other than to the pseudogenes. This implies
that an original Ψβ-β structure was itself
duplicated, giving functional β genes (which
diverged further) and two nonfunctional genes (which diverged into the current
pseudogenes).
The mechanisms responsible for gene duplication,
deletion, and rearrangement act on all sequences that are recognized as members
of the cluster, whether or not they are functional. It is left to selection
to discriminate among the products.
By definition, pseudogenes do not code for proteins, and
usually they have no function at all, but in at least one exceptional case, a
pseudogene has a regulatory function. Transcription of a pseudogene inhibits
degradation of the mRNA produced by its homologous active gene (Hirotsune et al., 2003). Probably there is a protein
responsible for this degradation that binds a specific sequence in the mRNA. If
this sequence is also present in the RNA transcribed from the pseudogene, the
effect of the protein will be diluted when the pseudogene is transcribed. It is not clear how common such effects may be, but as
a general rule, we might expect dilution effects of this type to be possible
whenever pseudogenes are transcribed.