- Messenger RNA (mRNA) is the intermediate that represents one strand of a gene coding for protein. Its coding region is related to the protein sequence by the triplet genetic code.
- Transcription describes synthesis of RNA on a DNA template.
- Translation is synthesis of protein on the mRNA template.
- A coding region is a part of the gene that represents a protein sequence.
- The leader of a protein is a short N-terminal sequence responsible for initiating passage into or through a membrane.
- A trailer (3′ UTR) is a nontranslated sequence at the 3′ end of an mRNA following the termination codon.
- Pre-mRNA is used to describe the nuclear transcript that is processed by modification and splicing to give an mRNA.
- Processing of RNA describes changes that occur after its transcription, including modification of the 5′ and 3′ ends, internal methylation, splicing, or cleavage.
- RNA splicing is the process of excising the sequences in RNA that correspond to introns, so that the sequences corresponding to exons are connected into a continuous mRNA.
- A prokaryotic gene is expressed by transcription into mRNA and then by translation of the mRNA into protein.
- In eukaryotes, a gene may contain internal regions that are not represented in protein.
- Internal regions are removed from the RNA transcript by RNA splicing to give an mRNA that is colinear with the protein product.
- Each mRNA consists of a nontranslated 5′ leader, a coding region, and a nontranslated 3′ trailer.
In comparing gene and protein, we are restricted to dealing with the sequence of DNA stretching between the points corresponding to the ends of the protein. However, a gene is not directly translated into protein, but is expressed via the production of a messenger RNA (abbreviated to mRNA), a nucleic acid intermediate actually used to synthesize a protein (as we see in detail in 5 Messenger RNA).
Messenger RNA is synthesized by the same process of complementary base pairing used to replicate DNA, with the important difference that it corresponds to only one strand of the DNA double helix. Figure 1.36 shows that the sequence of messenger RNA is complementary to the sequence of one strand of DNA and is identical (apart from the replacement of T with U) to the other strand of DNA. The convention for writing DNA sequences is that the top strand runs 5′→3′ and has the same sequence as the RNA.
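The relationship between the two DNA strands and the mRNA can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are invented here, and the code models base pairing alone, not the enzymatic process.

```python
# Minimal sketch of the strand relationships described above.
# Function names and the example sequence are illustrative assumptions.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}  # template base -> RNA base


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def transcribe(top_strand: str) -> str:
    """Return the mRNA (5'->3') made by base pairing with the bottom strand."""
    template = reverse_complement(top_strand)  # the strand that is actually copied
    # Pair each template base with its RNA partner; reversing keeps the result 5'->3'.
    return "".join(RNA_PAIRING[base] for base in reversed(template))


top = "ATGGCCTAA"  # made-up top strand, written 5'->3'
mrna = transcribe(top)
# The mRNA matches the top strand except that U replaces T.
print(mrna == top.replace("T", "U"))
```

Writing the function in terms of the template strand, rather than simply replacing T with U in the top strand, makes explicit that the RNA is complementary to one strand and identical to the other.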
The process by which a gene gives rise to a protein is called gene expression. In bacteria, it consists of two stages. The first stage is transcription, when an mRNA copy of one strand of the DNA is produced. The second stage is translation of the mRNA into protein. This is the process by which the sequence of an mRNA is read in triplets to give the series of amino acids that make the corresponding protein.
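Reading an mRNA in triplets can be illustrated with a short Python sketch. It assumes the standard genetic code, uses one-letter amino acid abbreviations with `*` for a termination codon, and expects a sequence that already begins at the first codon of the coding region; the names are invented for this example.

```python
from itertools import product

# Standard genetic code, listed with each codon position cycling through U, C, A, G.
# "*" marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_region: str) -> str:
    """Read successive triplets and return the amino acid sequence.

    Reading stops at the first termination codon, or at the end of the
    sequence if none is found.
    """
    protein = []
    for i in range(0, len(coding_region) - 2, 3):
        amino_acid = CODON_TABLE[coding_region[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGUUUGGGUGA"))  # Met-Phe-Gly followed by a termination codon
```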
A messenger RNA includes a sequence of nucleotides that corresponds with the sequence of amino acids in the protein. This part of the nucleic acid is called the coding region. But the messenger RNA includes additional sequences on either end; these sequences do not directly represent protein. The 5′ nontranslated region is called the leader, and the 3′ nontranslated region is called the trailer.
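The three-part layout of an mRNA can be made concrete with a small Python sketch. It rests on two simplifying assumptions that are not claims of the text: the coding region is taken to start at the first AUG in the sequence, and the termination codon is counted as the last triplet of the coding region, so that the trailer begins immediately after it. The function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna: str):
    """Split an mRNA into (leader, coding_region, trailer).

    Simplifying assumptions: translation starts at the first AUG, and the
    termination codon is kept as the end of the coding region.
    Returns None if no start codon or no in-frame termination codon is found.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Step through the reading frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    return None


leader, coding_region, trailer = split_mrna("GGAGCAUGGCCUAAUUCG")
print(leader, coding_region, trailer)
```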
The gene includes the entire sequence represented in messenger RNA. Sometimes mutations impeding gene function are found in the additional, noncoding regions, confirming the view that these comprise a legitimate part of the genetic unit.
Figure 1.37 illustrates this situation, in which the gene is considered to comprise a continuous stretch of DNA needed to produce a particular protein. It includes the sequence coding for that protein, but also includes sequences on either side of the coding region.
A bacterium consists of only a single compartment, so transcription and translation occur in the same place, as illustrated in Figure 1.38.
In eukaryotes, transcription occurs in the nucleus, but the RNA product must be transported to the cytoplasm in order to be translated. This results in a spatial separation between transcription (in the nucleus) and translation (in the cytoplasm). For the simplest eukaryotic genes (just as in bacteria), the transcript RNA is in fact the mRNA. But for more complex genes, the immediate transcript of the gene is a pre-mRNA that requires processing to generate the mature mRNA. The basic stages of gene expression in a eukaryote are outlined in Figure 1.39.
The most important stage in processing is RNA splicing. Many genes in eukaryotes (and a majority in higher eukaryotes) contain internal regions that do not code for protein. The process of splicing removes these regions from the pre-mRNA to generate an RNA that has a continuous open reading frame (see Figure 2.1). Other processing events that occur at this stage involve the modification of the 5′ and 3′ ends of the pre-mRNA (see Figure 5.16).
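As a rough illustration of what splicing accomplishes at the level of sequence, the following Python sketch removes internal regions from a pre-mRNA and joins what remains. It assumes the positions of the internal regions are already known and are given as non-overlapping (start, end) index pairs; how the cell recognizes these boundaries is not modeled, and the function name and sequences are invented.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove the given internal regions and join the remaining segments.

    `introns` is a list of (start, end) index pairs, end-exclusive and
    assumed not to overlap.
    """
    kept_segments = []
    position = 0
    for start, end in sorted(introns):
        kept_segments.append(pre_mrna[position:start])  # sequence before this intron
        position = end  # resume just after the intron
    kept_segments.append(pre_mrna[position:])  # sequence after the last intron
    return "".join(kept_segments)


# Made-up example: two coding segments interrupted by one internal region.
first_segment, internal_region, second_segment = "AUGGCC", "GUAAGUCCAG", "UUUUAA"
pre_mrna = first_segment + internal_region + second_segment
intron_span = (len(first_segment), len(first_segment) + len(internal_region))
print(splice(pre_mrna, [intron_span]))  # the two coding segments, now continuous
```

In this example the spliced product reads as one uninterrupted series of triplets, which is the sense in which the mature mRNA has a continuous open reading frame.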
Translation is accomplished by a complex apparatus that includes both protein and RNA components. The actual "machine" that undertakes the process is the ribosome, a large complex that includes some large RNAs (ribosomal RNAs, abbreviated to rRNAs) and many small proteins. The process of recognizing which amino acid corresponds to a particular nucleotide triplet requires an intermediate transfer RNA (abbreviated to tRNA); there is at least one tRNA species for every amino acid. Many ancillary proteins are involved. We describe translation in 5 Messenger RNA, but note for now that the ribosomes are the large structures in Figure 1.38 that move along the mRNA.
The important point to note at this stage is that the process of gene expression involves RNA not only as the essential substrate, but also in providing components of the apparatus. The rRNA and tRNA components are coded by genes and are generated by the process of transcription (just like mRNA, except that there is no subsequent stage of translation).