KEY CONCEPTS:
- Most genes are uninterrupted in yeasts, but are interrupted in higher eukaryotes.
- Exons are usually short, typically coding for <100 amino acids.
- Introns are short in lower eukaryotes, but range up to several 10s of kb in length in higher eukaryotes.
- The overall length of a gene is determined largely by its introns.
Figure 2.13 shows the overall
organization of genes in yeasts, insects, and mammals. In S.
cerevisiae, the great majority of genes (>96%) are not interrupted, and those that have introns
usually remain reasonably compact. There are virtually no S. cerevisiae
genes with more than 4 exons.
In insects and mammals, the situation is reversed. Only a
few genes have uninterrupted coding sequences (6% in mammals). Insect genes tend
to have a fairly small number of exons, typically fewer than 10. Mammalian genes
are split into more pieces, and some have several 10s of exons. About 50% of
mammalian genes have >10 introns.
Examining the consequences of this type of organization for
the overall size of the gene, we see in Figure 2.14 that
there is a striking difference between yeast and the higher eukaryotes. The
average yeast gene is 1.4 kb long, and very few are longer than 5 kb. The
predominance of interrupted genes in higher eukaryotes, however, means that the
gene can be much larger than the unit that codes for protein. Relatively few
genes in flies or mammals are shorter than 2 kb, and many have lengths between 5
kb and 100 kb. The average human gene is 27 kb long (see Figure 3.22).
The switch from largely uninterrupted to largely interrupted
genes occurs in the lower eukaryotes. In fungi (excepting the yeasts), the
majority of genes are interrupted, but they have a relatively small number of
exons (<6) and are fairly short (<5 kb). The switch to long genes occurs within the
higher eukaryotes, and genes become significantly larger in the insects. With
this increase in the length of the gene, the relationship between genome
complexity and organism complexity is lost (see Figure
3.5).
As genome size increases, the tendency is for introns to
become rather large, while exons remain quite small.
Figure 2.15 shows that the exons
coding for stretches of protein tend to be fairly small. In higher eukaryotes,
the average exon codes for ~50 amino acids, and the general distribution fits
well with the idea that genes have evolved by the slow addition of units that
code for small, individual domains of proteins (see 2.9 How did interrupted genes
evolve?). There is no very significant difference in the sizes of exons in
different types of higher eukaryotes, although the distribution is more compact
in vertebrates where there are few exons longer than 200 bp. In yeast, there are
some longer exons that represent uninterrupted genes where the coding sequence
is intact. There is a tendency for exons coding for untranslated 5′ and 3′ regions to be longer than those that code for
proteins.
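As a rough check on these figures, each amino acid is specified by a 3-nucleotide codon, so an exon coding for ~50 amino acids spans ~150 bp, which sits below the 200 bp upper range noted for vertebrates. A minimal sketch of that arithmetic follows; the function name is illustrative only, and it assumes the exon begins and ends on codon boundaries, which real exons often do not.

```python
# Illustrative helper (not from the text): coding length implied by a
# number of amino acids, assuming whole codons only.
CODON_LENGTH_BP = 3


def coding_exon_length_bp(amino_acids: int) -> int:
    """Return the number of base pairs needed to code for `amino_acids` residues."""
    if amino_acids < 0:
        raise ValueError("amino_acids must be non-negative")
    return amino_acids * CODON_LENGTH_BP


print(coding_exon_length_bp(50))  # 150
```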
Figure 2.16 shows that introns vary
widely in size. In worms and flies, the average intron is not much longer than
the exons. There are no very long introns in worms, but a significant
proportion of fly introns are very long. In vertebrates, the size distribution is much wider,
extending from approximately the same length as the exons (<200 bp) to lengths measured in 10s of kb, and
reaching 50-60 kb in extreme cases.
Very long genes are the result of very long introns, not the
result of coding for longer products. There is no correlation between gene size
and mRNA size in higher eukaryotes; nor is there a good correlation between gene
size and the number of exons. The size of a gene therefore depends primarily on
the lengths of its individual introns. In mammals, insects, and birds, the
"average" gene is approximately 5× the length of
its mRNA.
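The relationship between gene length and mRNA length can be sketched as simple arithmetic: the gene spans all exons plus all introns, while the spliced mRNA contains only the exons. The Python sketch below uses made-up exon and intron lengths chosen to give a 5× ratio; the function names and the numbers are assumptions for illustration, not data from the figures, and the mRNA length ignores the poly(A) tail.

```python
from typing import Sequence


def mrna_length_bp(exons: Sequence[int]) -> int:
    """Length of the spliced transcript: the sum of the exon lengths."""
    return sum(exons)


def gene_length_bp(exons: Sequence[int], introns: Sequence[int]) -> int:
    """Length of the gene: exons plus the introns that separate them."""
    if len(introns) != len(exons) - 1:
        raise ValueError("a gene with n exons has n - 1 introns")
    return sum(exons) + sum(introns)


# Hypothetical gene: five short exons separated by four longer introns.
exons = [150, 150, 150, 150, 400]      # bp, invented values
introns = [1000, 1000, 1000, 1000]     # bp, invented values

gene = gene_length_bp(exons, introns)  # 5000
mrna = mrna_length_bp(exons)           # 1000
print(gene / mrna)                     # 5.0
```

Lengthening any one intron in this example increases the gene length without changing the mRNA length, which is why gene size tracks intron length more closely than it tracks mRNA size or exon number.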