October 14, 2012

Gene duplication is a major force in evolution

  • Duplicated genes may diverge to generate different genes or one copy may become inactive.

Exons behave like modules for building genes, and evolution tries them out in various combinations. At one extreme, an individual exon from one gene may be copied and used in another gene. At the other extreme, an entire gene, including both exons and introns, may be duplicated. In such a case, mutations can accumulate in one copy without attracting the adverse attention of natural selection. This copy may then evolve a new function; it may become expressed at a different time or in a different place from the first copy, or it may acquire different activities.

Figure 4.2 summarizes our present view of the rates at which these processes occur. There is a ~1% probability that a given gene will be included in a duplication in a period of 1 million years. After the gene has duplicated, differences develop as different mutations occur in each copy. These accumulate at a rate of ~0.1% per million years (see 4.4 Sequence divergence is the basis for the evolutionary clock).
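
To make these rates concrete, here is a minimal sketch of the arithmetic they imply. It rests on assumptions the text does not state: each million-year interval is treated as an independent trial for duplication, the ~0.1% figure is read as the divergence between the two copies, and divergence is taken to grow linearly. The function and constant names are illustrative only.

```python
# Rates quoted in the text, per million years ("My").
DUPLICATION_PROB_PER_MY = 0.01  # ~1% chance a gene is included in a duplication
DIVERGENCE_PER_MY = 0.001       # ~0.1% sequence difference accumulated per My


def prob_duplicated_within(my: float) -> float:
    """Chance of at least one duplication in `my` million years.

    Assumption: each million-year interval is an independent trial.
    """
    return 1.0 - (1.0 - DUPLICATION_PROB_PER_MY) ** my


def expected_divergence(my: float) -> float:
    """Fraction of positions expected to differ after `my` million years.

    Assumption: linear accumulation, ignoring repeated hits at one site.
    """
    return DIVERGENCE_PER_MY * my


for t in (1, 4, 10, 100):
    print(f"{t:>4} My: P(duplicated) = {prob_duplicated_within(t):.3f}, "
          f"divergence = {expected_divergence(t):.2%}")
```

Under these assumptions the chance of duplication compounds slowly over time, while two copies that have coexisted for a few million years still differ at well under 1% of positions.
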
The organism is not likely to need to retain two identical copies of the gene. As differences develop between the duplicated genes, one of two types of event is likely to occur.
  • Both of the genes become necessary. This can happen either because the differences between them generate proteins with different functions, or because they are expressed specifically at different times or in different places.
  • If this does not happen, one of the genes is likely to be eliminated: it will by chance gain a deleterious mutation, and because the other copy still supplies the function, there will be no selection against its loss. Typically this takes ~4 million years. In such a situation, it is purely a matter of chance which of the two copies becomes inactive. (This can contribute to incompatibility between different individuals, and ultimately to speciation, if different copies become inactive in different populations.)
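
The parenthetical point about populations can be illustrated with a short enumeration. This is only a sketch under simplifying assumptions that go beyond the text: two isolated populations each inactivate exactly one copy, independently, and each copy is equally likely to be the one lost. The labels "A" and "B" are hypothetical.

```python
from itertools import product

COPIES = ("A", "B")  # hypothetical labels for the two duplicated genes

# Every equally likely way two isolated populations can each lose one copy.
outcomes = list(product(COPIES, repeat=2))

# Populations that silenced different copies retain different active genes.
mismatched = [o for o in outcomes if o[0] != o[1]]

prob_mismatch = len(mismatched) / len(outcomes)
print(f"P(populations silence different copies) = {prob_mismatch}")
```

With equal and independent chances, half of the possible outcomes leave the two populations carrying different active copies, which is the situation the text links to incompatibility.
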
Analysis of the human genome sequence shows that ~5% of it consists of duplications of identifiable segments ranging in length from 10 to 300 kb (Bailey et al., 2002). These have arisen relatively recently; that is, there has not been sufficient time for divergence between them to obscure their relationship. They include a proportional share (~6%) of the expressed exons, which shows that the duplications are occurring more or less irrespective of genetic content. The genes in these duplications may be especially interesting because of the implication that they have evolved recently, and therefore could be important for recent evolutionary developments (such as the separation of humans from other primates).