Molecular Biology: How many genes are essential?

KEY TERMS:

Redundancy describes the concept that two or more genes may fulfill the same function, so that no single one of them is essential.
Synthetic lethality occurs when two mutations that are each viable by themselves cause lethality when combined.
Synthetic genetic array analysis (SGA) is an automated technique in budding yeast whereby a mutant is crossed to an array of approximately 5000 deletion mutants to determine if the mutants interact to cause a synthetic lethal phenotype.

KEY CONCEPTS:

Not all genes are essential. In yeast and fly, deletions of <50% of the genes have detectable effects.
When two or more genes are redundant, a mutation in any one of them may not have detectable effects.
We do not fully understand the survival in the genome of genes that are apparently dispensable.

Natural selection is the force that ensures that useful genes are retained in the genome. Mutations occur at random, and their most common effect in an open reading frame will be to damage the protein product. An organism with a damaging mutation will be at a disadvantage in evolution, and ultimately the mutation will be eliminated by the competitive failure of organisms carrying it. The frequency of a disadvantageous allele in the population is balanced between the generation of new mutations and the elimination of old mutations. Reversing this argument, whenever we see an intact open reading frame in the genome, we assume that its product plays a useful role in the organism. Natural selection must have prevented mutations from accumulating in the gene. The ultimate fate of a gene that ceases to be useful is to accumulate mutations until it is no longer recognizable.
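The balance between new mutations and their elimination can be illustrated with the standard population-genetics approximations for mutation-selection balance. These formulas are not given in the text above; the sketch below assumes a mutation rate `mu` and a selection coefficient `s`, both hypothetical values chosen only for illustration.

```python
import math

def equilibrium_frequency(mu, s, recessive=True, h=0.5):
    """Approximate equilibrium frequency of a deleterious allele.

    mu: mutation rate per generation toward the damaging allele (assumed)
    s:  selection coefficient against the mutant homozygote (assumed)
    h:  dominance coefficient, used only when recessive is False (assumed)
    """
    if mu <= 0 or s <= 0:
        raise ValueError("mu and s must be positive")
    if recessive:
        # Fully recessive allele: q is approximately sqrt(mu / s)
        return math.sqrt(mu / s)
    # Partially dominant allele: q is approximately mu / (h * s)
    return mu / (h * s)

# Hypothetical values: a recessive lethal (s = 1) and a mildly damaging allele
q_lethal = equilibrium_frequency(1e-6, 1.0)
q_mild = equilibrium_frequency(1e-6, 0.01)
print(f"recessive lethal: q ~ {q_lethal:.4f}")
print(f"mildly deleterious recessive: q ~ {q_mild:.4f}")
```

Under these assumptions, a weaker disadvantage lets the damaging allele persist at a higher frequency, which is consistent with the point that even a small relative advantage can be subject to selection.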

The maintenance of a gene implies that it confers a selective advantage on the organism. But in the course of evolution, even a small relative advantage may be the subject of natural selection, and a phenotypic defect may not necessarily be immediately detectable as the result of a mutation. However, we would like to know how many genes are actually essential, meaning that their absence is lethal to the organism. In the case of diploid organisms, this means that the homozygous null mutation is lethal.

We might assume that the proportion of essential genes will decline with increase in genome size, given that larger genomes may have multiple, related copies of particular gene functions. So far this expectation has not been borne out by the data (see Figure 3.9).

One approach to the issue of gene number is to determine the number of essential genes by mutational analysis. If we saturate some specified region of the chromosome with mutations that are lethal, the mutations should map into a number of complementation groups that corresponds to the number of lethal loci in that region. By extrapolating to the genome as a whole, we may calculate the total essential gene number.
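The extrapolation step can be sketched as a simple proportion. The function name and the input numbers below are hypothetical, and the calculation assumes that lethal loci are spread evenly across the genome, which the text does not claim.

```python
def extrapolate_essential_genes(lethal_groups_in_region, region_fraction_of_genome):
    """Scale the lethal complementation groups found in a saturated region
    up to the whole genome, assuming a uniform density of lethal loci."""
    if not 0 < region_fraction_of_genome <= 1:
        raise ValueError("region_fraction_of_genome must be in (0, 1]")
    return lethal_groups_in_region / region_fraction_of_genome

# Hypothetical example: 40 lethal complementation groups in a region
# that covers 2% of the genome.
estimate = extrapolate_essential_genes(40, 0.02)
print(f"Estimated essential genes genome-wide: {round(estimate)}")
```

The estimate is only as good as the saturation of the region and the uniformity assumption; a gene-rich or gene-poor region would bias it.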

In the organism with the smallest known genome (Mycoplasma genitalium), random insertions have detectable effects only in about two thirds of the genes (Hutchison et al., 1999). Similarly, fewer than half of the genes of E. coli appear to be essential. The proportion is even lower in the yeast S. cerevisiae. When insertions were introduced at random into the genome in one early analysis, only 12% were lethal, and another 14% impeded growth. The majority (70%) of the insertions had no effect (Goebl and Petes, 1986). A more systematic survey based on completely deleting each of 5,916 genes (>96% of the identified genes) shows that only 18.7% are essential for growth on a rich medium (that is, when nutrients are fully provided) (Giaever et al., 2002). Figure 3.29 shows that these include genes in all categories. The only notable concentration of defects is in genes coding for products involved in protein synthesis, where ~50% are essential. Of course, this approach underestimates the number of genes that are essential for the yeast to live in the wild, when it is not so well provided with nutrients.
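As a quick check on the yeast figures quoted above, the percentages can be turned into approximate counts. This is plain arithmetic on the numbers in the text; the variable names are illustrative.

```python
# Systematic deletion survey: numbers as quoted in the text
genes_deleted = 5916
essential_fraction = 0.187
essential_count = round(genes_deleted * essential_fraction)

# Early random-insertion analysis: percentages as quoted in the text
insertion_outcomes = {"lethal": 12, "impeded growth": 14, "no effect": 70}
accounted_for = sum(insertion_outcomes.values())
unaccounted = 100 - accounted_for

print(f"Essential genes on rich medium: about {essential_count}")
print(f"Insertion outcomes accounted for: {accounted_for}% "
      f"({unaccounted}% not assigned in the text)")
```

The counts show that the deletion survey implies roughly 1,100 essential genes, and that the three insertion categories as quoted do not sum to 100%.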

Figure 3.30 summarizes the results of a systematic analysis of the effects of loss of gene function in the worm C. elegans (Kamath et al., 2003). The sequences of individual genes were predicted from the genome sequence, and by targeting an inhibitory RNA against these sequences (see 11.22 RNA interference is related to gene silencing), a large set of worms were made in which one (predicted) gene was prevented from functioning in each worm. Detectable effects on the phenotype were only observed for 10% of these knockouts, suggesting that most genes do not play essential roles.

There is a greater proportion of essential genes (21%) among those worm genes that have counterparts in other eukaryotes, suggesting that widely conserved genes tend to play more basic functions. There is also an increased proportion of essential genes among those that are present in only one copy per haploid genome, compared with those where there are multiple copies of related or identical genes. This suggests that many of the multiple genes might be relatively recent duplications that can substitute for one another's functions.

Extensive analyses of essential gene number in a higher eukaryote have been made in Drosophila through attempts to correlate visible aspects of chromosome structure with the number of functional genetic units. The notion that this might be possible arose originally from the presence of bands in the polytene chromosomes of D. melanogaster. (These chromosomes are found at certain developmental stages and represent an unusually extended physical form, in which a series of bands [more formally called chromomeres] are evident; see 19.10 Polytene chromosomes form bands.) From the early concept that the bands might represent a linear order of genes, we have come to the attempt to correlate the organization of genes with the organization of bands. There are ~5000 bands in the D. melanogaster haploid set; they vary in size over an order of magnitude, but on average there is ~20 kb of DNA per band.

The basic approach is to saturate a chromosomal region with mutations. Usually the mutations are simply collected as lethals, without analyzing the cause of the lethality. Any mutation that is lethal is taken to identify a locus that is essential for the organism. Sometimes mutations cause visible deleterious effects short of lethality, in which case we also count them as identifying an essential locus. When the mutations are placed into complementation groups, the number can be compared with the number of bands in the region, or individual complementation groups may even be assigned to individual bands. The purpose of these experiments has been to determine whether there is a consistent relationship between bands and genes; for example, does every band contain a single gene?

Totaling the analyses that have been carried out over the past 30 years, the number of lethal complementation groups is ~70% of the number of bands. It is an open question whether there is any functional significance to this relationship. But irrespective of the cause, the equivalence gives us a reasonable estimate for the lethal gene number of ~3600. By any measure, the number of lethal loci in Drosophila is significantly less than the total number of genes.
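The Drosophila estimate follows from the band count and the ratio of lethal complementation groups to bands. The sketch below uses only the rounded figures quoted in the text; the variable names are illustrative.

```python
# Rounded figures quoted in the text
bands = 5000                   # bands in the haploid set
kb_per_band = 20               # average DNA per band, in kb
lethal_percent_of_bands = 70   # lethal complementation groups as % of bands

dna_in_bands_mb = bands * kb_per_band / 1000
lethal_loci = bands * lethal_percent_of_bands // 100

print(f"DNA represented by bands: about {dna_in_bands_mb:.0f} Mb")
print(f"Lethal loci from rounded inputs: about {lethal_loci}")
```

With these rounded inputs the product comes out slightly below the ~3600 quoted in the text; the difference is presumably due to rounding of the band count and the percentage, though the text does not say so.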

If the proportion of essential human genes is similar to that in other eukaryotes, we would predict a range of 4000-8000 genes in which mutations would be lethal or produce evidently damaging effects. At present, 1300 genes have been identified in which mutations cause evident defects. This is a substantial proportion of the expected total, especially in view of the fact that many lethal genes may act so early that we never see their effects. This sort of bias may also explain the results in Figure 3.31, which show that the majority of known genetic defects are due to point mutations (where there is more likely to be at least some residual function of the gene).
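To make "a substantial proportion" concrete, the identified genes can be compared with both ends of the predicted range. This is arithmetic on the figures in the text only; the variable names are illustrative.

```python
# Figures quoted in the text
predicted_low, predicted_high = 4000, 8000   # predicted essential human genes
identified = 1300                            # genes with known evident defects

# The identified share is largest if the true total is at the low end
share_if_low = identified / predicted_low
share_if_high = identified / predicted_high

print(f"Identified share of predicted total: "
      f"{share_if_high:.1%} to {share_if_low:.1%}")
```

So the known genes would represent somewhere between about one sixth and one third of the predicted total, depending on where in the range the true number lies.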

How do we explain the survival of genes whose deletion appears to have no effect? The most likely explanation is that the organism has alternative ways of fulfilling the same function. The simplest possibility is that there is redundancy, and that some genes are present in multiple copies. This is certainly true in some cases, in which multiple (related) genes must be knocked out in order to produce an effect. In a slightly more complex scenario, an organism might have two separate pathways capable of providing some activity. Inactivation of either pathway by itself would not be damaging, but the simultaneous occurrence of mutations in genes from both pathways would be deleterious.
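The two-pathway scenario can be expressed as a small logical model. This is a deliberately simplified sketch: it assumes that viability requires at least one of two pathways to be functional, and the function name is hypothetical.

```python
from itertools import product

def is_viable(pathway_a_works, pathway_b_works):
    """Toy model: the organism survives if either pathway supplies the activity."""
    return pathway_a_works or pathway_b_works

# Enumerate all four genotypes: wild type, each single mutant, double mutant
for a_works, b_works in product([True, False], repeat=2):
    status = "viable" if is_viable(a_works, b_works) else "lethal"
    print(f"pathway A {'ok' if a_works else 'inactive'}, "
          f"pathway B {'ok' if b_works else 'inactive'}: {status}")
```

Only the genotype with both pathways inactive is lethal in this model, which is why neither single deletion shows a detectable effect on its own.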

Such situations can be tested by combining mutations. In principle, deletions in two genes, neither of which is lethal by itself, are introduced into the same strain. If the double mutant dies, the strain is called a synthetic lethal. This technique has been used to great effect with yeast, where the isolation of double mutants can be automated. The procedure is called synthetic genetic array analysis (SGA). Figure 3.32 summarizes the results of an analysis in which an SGA screen was made for each of 132 viable deletions, by testing whether it could survive in combination with any one of 4,700 viable deletions. Every one of the test genes had at least one partner with which the combination was lethal, and most of the test genes had many such partners; the median is ~25 partners, and the greatest number is shown by one test gene that had 146 lethal partners (Tong et al., 2004). A small proportion (~10%) of the interacting mutant pairs code for proteins that interact physically.
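The scoring logic of such a screen can be sketched on toy data. The gene names and viability results below are invented for illustration and are not taken from the published screen; the sketch only shows how synthetic lethal pairs and the number of partners per test gene could be tallied.

```python
from collections import Counter
from statistics import median

def synthetic_lethal_pairs(single_viable, double_viable):
    """Return pairs in which each single deletion is viable
    but the double mutant is not."""
    pairs = []
    for (query, partner), alive in double_viable.items():
        if single_viable[query] and single_viable[partner] and not alive:
            pairs.append((query, partner))
    return sorted(pairs)

# Hypothetical data: two test (query) deletions crossed to three array deletions
single_viable = {"q1": True, "q2": True, "x": True, "y": True, "z": True}
double_viable = {
    ("q1", "x"): False, ("q1", "y"): True, ("q1", "z"): False,
    ("q2", "x"): True, ("q2", "y"): False, ("q2", "z"): True,
}

pairs = synthetic_lethal_pairs(single_viable, double_viable)
partners_per_query = Counter(query for query, _ in pairs)
median_partners = median(partners_per_query.values())

print("Synthetic lethal pairs:", pairs)
print("Partners per test gene:", dict(partners_per_query))
print("Median partners:", median_partners)
```

In a real screen the viability calls come from colony growth measurements rather than a lookup table, so this shows only the bookkeeping, not the experimental scoring.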

This result goes some way toward explaining the apparent lack of effect of so many deletions. Natural selection will act against these deletions when they find themselves in lethal pairwise combinations. To some degree, the organism has protected itself against the damaging effects of mutations by building in redundancy. However, it pays a price in the form of accumulating the "genetic load" of mutations that are not deleterious in themselves, but that may cause serious problems when combined with other such mutations in future generations. The theory of natural selection would suggest that the loss of the individual genes in such circumstances produces a sufficient disadvantage to maintain the active gene during the course of evolution.

October 14, 2012
