- Repeated sequences (present in more than one copy) account for >50% of the human genome.
- The great bulk of repeated sequences consist of copies of nonfunctional transposons.
- There are many duplications of large chromosome regions.
Repetitive sequences account for >50% of the human genome, as seen in Figure 3.24. The repetitive sequences fall into five classes:
- Transposons (either active or inactive) account for the vast majority of the repetitive sequences (45% of the genome). All transposons are found in multiple copies.
- Processed pseudogenes (~3000 in all) account for ~0.1% of total DNA. These are sequences that arise by insertion of a copy of an mRNA sequence into the genome (see 4.6 Pseudogenes are dead ends of evolution).
- Simple sequence repeats (highly repetitive DNA such as (CA)n) account for ~3%.
- Segmental duplications (blocks of 10-300 kb that have been duplicated into a new region) account for ~5%. Only a minority of these duplications are found on the same chromosome; in the other cases, the duplicates are on different chromosomes.
- Tandem repeats form blocks of one type of sequence (especially found at centromeres and telomeres).
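To make the idea of a simple sequence repeat such as (CA)n concrete, the following is a minimal Python sketch that scans a DNA string for runs of a short repeated unit. The function name `find_simple_repeats`, the `min_copies` threshold, and the example sequence are illustrative assumptions, not part of any genome annotation pipeline; real repeat annotation uses far more tolerant methods that allow mismatches and interruptions.

```python
import re

def find_simple_repeats(sequence, unit="CA", min_copies=4):
    """Return (start, end, copies) for each perfect run of `unit`
    repeated at least `min_copies` times. Coordinates are 0-based,
    end-exclusive. `min_copies` is an arbitrary illustrative cutoff."""
    # Non-capturing group repeated min_copies or more times, e.g. (?:CA){4,}
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(unit), min_copies))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), match.end(), copies))
    return hits

# Hypothetical sequence: one (CA)6 run and one (CA)2 run that is too short to report.
example = "GGT" + "CA" * 6 + "TTG" + "CA" * 2 + "GG"
print(find_simple_repeats(example))
```

Only perfect, uninterrupted runs are reported here, so the shorter (CA)2 stretch in the example falls below the cutoff and is ignored.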
The sequence of the human genome emphasizes the importance of transposons. Transposons have the capacity to replicate themselves and insert into new locations. They may function exclusively as DNA elements (see 16 Transposons) or may have an active form that is RNA (see 17 Retroviruses and retroposons). Their distribution in the human genome is summarized in Figure 17.18. Most of the transposons in the human genome are nonfunctional; very few are currently active. However, the high proportion of the genome occupied by these elements indicates that they have played an active role in shaping the genome. One interesting feature is that some present-day genes originated as transposons and evolved into their present condition after losing the ability to transpose. Almost 50 genes appear to have originated in this way.
Segmental duplication at its simplest involves the tandem duplication of some region within a chromosome (typically because of an aberrant recombination event at meiosis; see 4.7 Unequal crossing-over rearranges gene clusters). In many cases, however, the duplicated regions are on different chromosomes, implying either that an original tandem duplication was followed by translocation of one copy to a new site, or that the duplication arose by some different mechanism altogether. The extreme case of segmental duplication is the duplication of a whole genome, in which case the diploid genome initially becomes tetraploid. As the duplicated copies develop differences from one another, the genome may gradually become effectively diploid again, although homologies between the diverged copies leave evidence of the event. Whole genome duplication is especially common in plants. The present state of analysis of the human genome identifies many individual duplicated regions, but does not indicate whether there was a whole genome duplication in the vertebrate lineage.