KEY TERMS:
- The genetic code is the correspondence between triplets in DNA (or RNA) and amino acids in protein.
- A codon is a triplet of nucleotides that represents an amino acid or a termination signal.
- Frameshift mutations arise by deletions or insertions that are not a multiple of 3 base pairs and change the frame in which triplets are translated into protein. The term is inappropriate outside of coding sequences.
- Acridines are mutagens that act on DNA to cause the insertion or deletion of a single base pair. They were useful in defining the triplet nature of the genetic code.
- A suppressor is a second mutation that compensates for or alters the effects of a primary mutation.
- A frameshift suppressor is an insertion or deletion of a base that restores the original reading frame in a gene that has had a base deletion or insertion.
KEY CONCEPTS:
- The genetic code is read in triplet nucleotides called codons.
- The triplets are nonoverlapping and are read from a fixed starting point.
- Mutations that insert or delete individual bases cause a shift in the triplet sets after the site of mutation.
- Combinations of mutations that together insert or delete 3 bases (or multiples of three) insert or delete amino acids but do not change the reading of the triplets beyond the last site of mutation.
Each gene represents a particular protein chain. The concept
that each protein consists of a particular series of amino acids dates from
Sanger's characterization of insulin in the 1950s. The discovery that a gene
consists of DNA raises the question of how a sequence of nucleotides in DNA
represents a sequence of amino acids in protein.
A crucial feature of the general structure of DNA is that
it is independent of the particular sequence of its component
nucleotides. The sequence of nucleotides in DNA is important not because of
its structure per se, but because it codes for the sequence of
amino acids that constitutes the corresponding polypeptide. The relationship
between a sequence of DNA and the sequence of the corresponding protein is
called the genetic code.
The structure and/or enzymatic activity of each protein
follows from its primary sequence of amino acids. By determining the sequence of
amino acids in each protein, the gene is able to carry all the information
needed to specify an active polypeptide chain. In this way, a single type of
structure—the gene—is able to represent itself in innumerable polypeptide
forms.
Together the various protein products of a cell undertake
the catalytic and structural activities that are responsible for establishing
its phenotype. Of course, in addition to sequences that code for proteins, DNA
also contains certain sequences whose function is to be recognized by regulator
molecules, usually proteins. Here the function of the DNA is determined by its
sequence directly, not via any intermediary code. Both types of region, genes
expressed as proteins and sequences recognized as such, constitute genetic
information.
The genetic code is deciphered by a complex apparatus that
interprets the nucleic acid sequence. This apparatus is essential if the
information carried in DNA is to have meaning. In any given region, only one of
the two strands of DNA codes for protein, so we write the genetic code as a
sequence of bases (rather than base pairs).
The genetic code is read in groups of three nucleotides,
each group representing one amino acid. Each trinucleotide sequence is called a
codon. A gene includes a series of codons that is
read sequentially from a starting point at one end to a termination point at the
other end. Written in the conventional 5′→3′
direction, the nucleotide sequence of the DNA strand that codes for protein
corresponds to the amino acid sequence of the protein written in the direction
from N-terminus to C-terminus.
The genetic code is read in nonoverlapping triplets from
a fixed starting point:
- Nonoverlapping implies that each codon consists of three nucleotides and that successive codons are represented by successive trinucleotides.
- The use of a fixed starting point means that assembly of a protein must start at one end and work to the other, so that different parts of the coding sequence cannot be read independently.
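The two properties above can be illustrated with a minimal Python sketch. The function name `split_codons` and its `start` parameter are hypothetical, chosen only for this illustration; the sequence is the one used in the example that follows.

```python
def split_codons(seq, start=0):
    """Split seq into nonoverlapping triplets, reading from a fixed start index.

    Bases left over at the end that do not fill a complete triplet are dropped.
    """
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


# Reading from the first base gives one set of codons.
print(split_codons("UUUAAAGGGCCC"))           # ['UUU', 'AAA', 'GGG', 'CCC']

# Starting one base later gives a completely different set of triplets,
# which is why the starting point must be fixed.
print(split_codons("UUUAAAGGGCCC", start=1))  # ['UUA', 'AAG', 'GGC']
```

Each base belongs to exactly one triplet, and the choice of starting point alone determines how the rest of the sequence is partitioned.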
The nature of the code predicts that two types of mutations
will have different effects. If a particular sequence is read sequentially, such
as:
UUU AAA GGG CCC (codons)
aa1 aa2 aa3 aa4 (amino acids)
then a point mutation will affect only one amino acid. For
example, the substitution of an A by some other base (X) causes aa2 to be
replaced by aa5:
UUU AAX GGG CCC
aa1 aa5 aa3 aa4
because only the second codon has been changed.
But a mutation that inserts or deletes a single base
will change the triplet sets for the entire subsequent sequence. A change
of this sort is called a frameshift. An insertion
might take the form:
UUU AAX AGG GCC C
aa1 aa5 aa6 aa7
Because the new sequence of triplets is completely different
from the old one, the entire amino acid sequence of the protein is altered
beyond the site of mutation. So the function of the protein is likely to be lost
completely.
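The contrast between the two types of mutation can be checked with a short Python sketch. The helper names `codons` and `changed` are hypothetical, and the base G is an arbitrary stand-in for the unspecified base X in the text.

```python
def codons(seq):
    """Nonoverlapping triplets from index 0; incomplete trailing bases are dropped."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def changed(wild, mutant):
    """Indices of the codons that differ between the wild-type and mutant readings."""
    pairs = zip(codons(wild), codons(mutant))
    return [i for i, (w, m) in enumerate(pairs) if w != m]


wild = "UUUAAAGGGCCC"
substitution = wild[:5] + "G" + wild[6:]  # point mutation: AAA -> AAG
insertion = wild[:5] + "G" + wild[5:]     # one extra base inserted in codon 2

print(codons(substitution), changed(wild, substitution))
print(codons(insertion), changed(wild, insertion))
```

With these inputs the substitution changes only the codon at index 1, whereas the insertion changes every codon from index 1 onward (UUU AAG AGG GCC, with one base left over).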
Frameshift mutations are induced by the acridines, compounds that bind to DNA and distort the
structure of the double helix, causing additional bases to be incorporated or
omitted during replication. Each mutagenic event caused by an acridine
results in the addition or removal of a single base pair (for review see Roth, 1974).
If an acridine mutant is produced by, say, addition of a
nucleotide, it should revert to wild type by deletion of the nucleotide. But
reversion can also be caused by deletion of a different base, at a site close to
the first. Combinations of such mutations provided revealing evidence about the
nature of the genetic code.
Figure 1.33 illustrates the properties
of frameshift mutations. An insertion or a deletion changes the entire protein
sequence following the site of mutation. But the combination of an insertion and
a deletion causes the code to be read incorrectly only between the two sites of
mutation; correct reading resumes after the second site.
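The compensating effect of an insertion followed by a nearby deletion can be sketched in Python as follows. The 18-base sequence, the mutation positions, and the helper names are assumptions made for illustration; they are not taken from the figure.

```python
def codons(seq):
    """Nonoverlapping triplets from index 0; incomplete trailing bases are dropped."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def mismatches(wild, mutant):
    """Indices of codons that are read differently in the mutant."""
    pairs = zip(codons(wild), codons(mutant))
    return [i for i, (w, m) in enumerate(pairs) if w != m]


def insert_base(seq, pos, base):
    return seq[:pos] + base + seq[pos:]


def delete_base(seq, pos):
    return seq[:pos] + seq[pos + 1:]


wild = "UUUAAAGGGCCCUUUAAA"          # hypothetical six-codon sequence
plus = insert_base(wild, 4, "G")     # (+): a single base inserted in codon 2
plus_minus = delete_base(plus, 10)   # (+ -): one base then removed further along

print(codons(plus), mismatches(wild, plus))
print(codons(plus_minus), mismatches(wild, plus_minus))
```

In this sketch the insertion alone alters codons 1 through 5 (counting from 0), while the double mutant is misread only at codons 1 through 3; the last two codons return to UUU AAA.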
Genetic analysis of acridine mutations in the rII
region of the phage T4 in 1961 showed that all the mutations could be classified
into one of two sets, described as (+) and (–).
Either type of mutation by itself causes a frameshift, the (+) type by virtue of
a base addition, the (–) type by virtue of a base
deletion. Double mutant combinations of the types (+ +) and (––) continue to show
mutant behavior. But combinations of the types (+ –) or (– +) suppress one
another, giving rise to a description in which one mutation is described as a
suppressor of the other. (In the context of this
work, "suppressor" is used in an unusual sense, because the second mutation is
in the same gene as the first.)
These results show that the genetic code must be read as a
sequence that is fixed by the starting point, so an addition and a deletion
compensate for each other, whereas double additions or double deletions remain
mutant. But this does not reveal how many nucleotides make up each
codon.
When triple mutants are constructed, only (+ + +) and (–––) combinations show the wild-type phenotype, while other
combinations remain mutant. If we take three additions or three deletions to
correspond respectively to the addition or omission overall of a single amino
acid, this implies that the code is read in triplets. An incorrect amino acid
sequence is found between the two outside sites of mutation, and the sequence on
either side remains wild type, as indicated in Figure 1.33
(Benzer and Champe, 1961; Crick et al., 1961).
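The bookkeeping behind these combinations can be summarized in a small Python sketch. The function `restores_frame` is a hypothetical helper; it checks only whether the net number of inserted or deleted bases is a multiple of three. It assumes, as the experiments did, that the misread stretch between the outer sites is short enough for the protein to tolerate.

```python
def restores_frame(mutations):
    """Return True if the combined (+) and (-) mutations leave the frame unchanged.

    mutations is a string of '+' (single-base addition) and '-' (single-base
    deletion) characters. A triplet code is assumed, so the frame is restored
    when the net change in length is a multiple of 3.
    """
    net = sum(1 if m == "+" else -1 for m in mutations)
    return net % 3 == 0


for combo in ["+", "-", "++", "--", "+-", "-+", "+++", "---", "++-", "+--"]:
    label = "wild type" if restores_frame(combo) else "mutant"
    print(combo, label)
```

Under this rule the (+ –), (– +), (+ + +), and (–––) combinations come out as wild type and all the others listed remain mutant, which matches the pattern described above. A code read in groups of some other size would predict a different pattern.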