- Nonrepetitive DNA shows reassociation kinetics expected of unique sequences.
- Repetitive DNA behaves in a reassociation reaction as though many (related or identical) sequences are present in a component, allowing any pair of complementary sequences to reassociate.
- A transposon (transposable element) is a DNA sequence able to insert itself (or a copy of itself) at a new location in the genome, without having any sequence relationship with the target locus.
- Selfish DNA describes sequences that do not contribute to the phenotype of the organism but have self-perpetuation within the genome as their sole function.
- The kinetics of DNA reassociation after a genome has been denatured distinguish sequences by their frequency of repetition in the genome.
- Genes are generally coded by sequences in nonrepetitive DNA.
- Larger genomes within a phylum do not contain more genes, but have large amounts of repetitive DNA.
- A large part of repetitive DNA may be made up of transposons.
The general nature of the eukaryotic genome can be assessed by the kinetics of reassociation of denatured DNA. This technique was used extensively before large-scale DNA sequencing became possible (for review see 32.1 DNA reassociation kinetics).
Reassociation kinetics identify two general types of genomic sequences (Britten and Davidson, 1971; Davidson and Britten, 1973):
- Nonrepetitive DNA consists of sequences that are unique: there is only one copy in a haploid genome.
- Repetitive DNA describes sequences that are present in more than one copy in each genome.
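How the kinetics separate these two classes can be illustrated with a minimal sketch. It assumes the ideal second-order reassociation equation, C/C0 = 1/(1 + k·C0t), which is the standard form used for Cot analysis but is not written out in this section. The function names, the rate constants, and the 70/30 split of the hypothetical genome are illustrative assumptions, not measured values.

```python
def fraction_single_stranded(cot, k):
    """Fraction of DNA still single-stranded at a given Cot value,
    assuming ideal second-order kinetics: C/C0 = 1 / (1 + k * Cot)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of a component has reassociated (1/k)."""
    return 1.0 / k


def genome_fraction_single_stranded(cot, components):
    """Sum the curves of independent components.

    components: list of (fraction_of_genome, rate_constant) pairs.
    """
    return sum(f * fraction_single_stranded(cot, k) for f, k in components)


# Hypothetical genome (assumed values): 70% nonrepetitive DNA and 30% of a
# repetitive component whose sequences are present ~1000 times, so that they
# find partners, and reassociate, ~1000 times faster.
K_UNIQUE = 1e-3
K_REPETITIVE = 1.0
HYPOTHETICAL_GENOME = [(0.7, K_UNIQUE), (0.3, K_REPETITIVE)]

if __name__ == "__main__":
    for cot in (1e-2, 1.0, 1e2, 1e3, 1e4):
        remaining = genome_fraction_single_stranded(cot, HYPOTHETICAL_GENOME)
        print(f"Cot = {cot:>8g}  single-stranded fraction = {remaining:.3f}")
```

In this sketch the repetitive component is almost fully reassociated by the time the nonrepetitive component begins to react, which is why the two appear as separate steps on a Cot curve.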
Repetitive DNA is often divided into two general types:
- Moderately repetitive DNA consists of relatively short sequences that are repeated typically 10-1000× in the genome. The sequences are dispersed throughout the genome, and are responsible for the high degree of secondary structure formation in pre-mRNA, when (inverted) repeats in the introns pair to form duplex regions.
- Highly repetitive DNA consists of very short sequences (typically <100 bp) that are present many thousands of times in the genome, often organized as long tandem repeats (see 4.11 Satellite DNAs often lie in heterochromatin).
Neither class codes for protein.
A significant part of the moderately repetitive DNA consists of transposons, short sequences of DNA (~1 kb) that have the ability to move to new locations in the genome and/or to make additional copies of themselves (see 16 Transposons and 17 Retroviruses and retroposons). In some higher eukaryotic genomes they may even occupy more than half of the genome (see 3.11 The human genome has fewer genes than expected).
Transposons are sometimes viewed as fitting the concept of selfish DNA, defined as sequences that propagate themselves within a genome, without contributing to the development of the organism. Transposons may sponsor genome rearrangements, and these could confer selective advantages, but it is fair to say that we do not really understand why selective forces do not act against transposons becoming such a large proportion of the genome. Another term that is sometimes used to describe the apparent excess of DNA is junk DNA, meaning genomic sequences without any apparent function. Of course, it is likely that there is a balance in the genome between the generation of new sequences and the elimination of unwanted sequences, and some proportion of DNA that apparently lacks function may be in the process of being eliminated.
The length of the nonrepetitive DNA component tends to increase with overall genome size, as we proceed up to a total genome size of ~3 × 10⁹ bp (characteristic of mammals). Further increase in genome size, however, generally reflects an increase in the amount and proportion of the repetitive components, so that it is rare for an organism to have a nonrepetitive DNA component >2 × 10⁹ bp. The nonrepetitive DNA content of genomes therefore accords better with our sense of the relative complexity of the organism. E. coli has 4.2 × 10⁶ bp, C. elegans increases an order of magnitude to 6.6 × 10⁷ bp, D. melanogaster increases further to ~10⁸ bp, and mammals increase another order of magnitude to ~2 × 10⁹ bp.
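The size of each step in this series can be checked with a few lines of arithmetic. The sketch below uses only the four nonrepetitive DNA figures quoted above; the dictionary and function names are illustrative.

```python
import math

# Nonrepetitive DNA content in bp, as quoted in the text
# (the D. melanogaster and mammalian values are approximate).
NONREPETITIVE_BP = {
    "E. coli": 4.2e6,
    "C. elegans": 6.6e7,
    "D. melanogaster": 1e8,
    "mammals": 2e9,
}


def fold_increases(sizes):
    """Return (from, to, fold change, log10 of fold change) for each
    successive pair of organisms, in dictionary order."""
    names = list(sizes)
    steps = []
    for prev, curr in zip(names, names[1:]):
        fold = sizes[curr] / sizes[prev]
        steps.append((prev, curr, fold, math.log10(fold)))
    return steps


if __name__ == "__main__":
    for prev, curr, fold, log_fold in fold_increases(NONREPETITIVE_BP):
        print(f"{prev} -> {curr}: {fold:.1f}-fold (10^{log_fold:.2f})")
```

On these figures the steps from E. coli to C. elegans and from D. melanogaster to mammals are each a little more than one order of magnitude, while the step from C. elegans to D. melanogaster is less than twofold.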
What type of DNA corresponds to protein-coding genes? Reassociation kinetics typically show that mRNA is derived from nonrepetitive DNA. The amount of nonrepetitive DNA is therefore a better indication than the total DNA of the coding potential. (However, more detailed analysis based on genomic sequences shows that many exons have related sequences in other exons [see 2.5 Exon sequences are conserved but introns vary]. Such exons evolve by a duplication to give copies that initially are identical, but which then diverge in sequence during evolution.)