KEY CONCEPTS:
- Mouse satellite DNA has evolved by duplication and mutation of a short repeating unit to give a basic repeating unit of 234 bp in which the original half, quarter, and eighth repeats can be recognized.
In the mammals, as typified by various rodents, the
sequences comprising each satellite show appreciable divergence between tandem
repeats. Common short sequences can be recognized by their preponderance among
the oligonucleotide fragments released by chemical or enzymatic treatment.
However, the predominant short sequence usually accounts for only a small
minority of the copies. The other short sequences are related to the predominant
sequence by a variety of substitutions, deletions, and insertions.
But a series of these variants of the short unit can
constitute a longer repeating unit that is itself repeated in tandem with some
variation. So mammalian satellite DNAs are constructed from a hierarchy of
repeating units. These longer repeating units constitute the sequences that
renature in reassociation analysis. They can also be recognized by digestion
with restriction enzymes.
When any satellite DNA is digested with an enzyme that has a
recognition site in its repeating unit, one fragment will be obtained for every
repeating unit in which the site occurs. In fact, when the DNA of a eukaryotic
genome is digested with a restriction enzyme, most of it gives a general smear,
due to the random distribution of cleavage sites. But satellite DNA generates
sharp bands, because a large number of fragments of identical or almost
identical size are created by cleavage at restriction sites that lie a regular
distance apart.
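The contrast between a smear and a sharp band can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 20 bp unit, the GAATTC site, and the random "bulk" sequence are hypothetical stand-ins, not the mouse satellite or the enzyme discussed in the text.

```python
import random
from collections import Counter

def fragment_lengths(seq, site):
    """Lengths of the pieces left after cutting seq at the start of every copy of site."""
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop the empty piece that appears when a site sits at position 0.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

SITE = "GAATTC"          # illustrative recognition site (assumption)
UNIT = SITE + "A" * 14   # hypothetical 20 bp repeating unit carrying one site
satellite = UNIT * 50    # 50 perfect tandem copies

# Hypothetical non-repetitive DNA: sites fall wherever chance puts them.
rng = random.Random(0)
bulk = "".join(rng.choice("ACGT") for _ in range(len(satellite) * 20))

satellite_sizes = Counter(fragment_lengths(satellite, SITE))
bulk_sizes = Counter(fragment_lengths(bulk, SITE))

print(satellite_sizes)   # a single size class: every fragment is one unit long
print(len(bulk_sizes), "distinct fragment sizes in the random sequence")
```

Because the toy repeats are identical, every satellite fragment has the same length. In a real satellite the copies diverge, so the band is sharp but not perfectly homogeneous.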
Determining the sequence of satellite DNA can be difficult.
Using the discrete bands generated by restriction cleavage, we can attempt to
obtain a sequence directly. However, if there is appreciable divergence between
individual repeating units, different nucleotides will be present at the same
position in different repeats, so the bands on the sequencing gels will be ambiguous. If the
divergence is not too great (say, within ~2%), it may be possible to determine an average repeating
sequence.
Individual segments of the satellite can be inserted into
plasmids for cloning. A difficulty is that the satellite sequences tend to be
excised from the chimeric plasmid by recombination in the bacterial host.
However, when the cloning succeeds, it is possible to determine the sequence of
the cloned segment unambiguously. While this gives the actual sequence of a
repeating unit or units, we would need many such individual sequences
to reconstruct the type of divergence typical of the satellite as a
whole.
By either sequencing approach, the information we can gain
is limited to the distance that can be analyzed on one set of sequence gels. The
repetition of divergent tandem copies makes it impossible to reconstruct longer
sequences by obtaining overlaps between individual restriction fragments.
The satellite DNA of the mouse M. musculus is
cleaved by the enzyme EcoRII into a series of bands, including a predominant
monomeric fragment of 234 bp. This sequence must be repeated with few variations
throughout the 60-70% of the satellite that is cleaved into the monomeric band.
We may analyze this sequence in terms of its successively smaller constituent
repeating units.
Figure 4.22 depicts the sequence in
terms of two half-repeats. By writing the 234 bp sequence so that the first 117
bp are aligned with the second 117 bp, we see that the two halves are quite well
related. They differ at 22 positions, corresponding to 19% divergence. This
means that the current 234 bp repeating unit must have been generated at some
time in the past by duplicating a 117 bp repeating unit, after which differences
accumulated between the duplicates.
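The half-repeat comparison amounts to writing the unit as two rows and counting the positions that differ. The sketch below assumes a simple ungapped, position-by-position alignment and uses a hypothetical 20 bp toy unit, since the actual 234 bp sequence is given only in the figure. The last line checks the arithmetic quoted in the text.

```python
def half_repeat_divergence(unit):
    """Split an even-length unit into halves, align them position by position,
    and return (mismatches, half_length, percent_divergence)."""
    if len(unit) % 2:
        raise ValueError("unit length must be even")
    half = len(unit) // 2
    first, second = unit[:half], unit[half:]
    mismatches = sum(1 for x, y in zip(first, second) if x != y)
    return mismatches, half, 100 * mismatches / half

# Hypothetical toy unit: two 10 bp halves that differ at two positions.
toy_unit = "GACCTGGAAT" + "GACGTGGAAA"
print(half_repeat_divergence(toy_unit))   # (2, 10, 20.0)

# Figures quoted in the text: 22 differences over 117 aligned positions.
print(round(100 * 22 / 117))              # 19
```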
Within the 117 bp unit, we can recognize two further
subunits. Each of these is a quarter-repeat relative to the whole satellite. The
four quarter-repeats are aligned in Figure 4.23. The
upper two lines represent the first half-repeat of Figure
4.22; the lower two lines represent the second half-repeat. We see that the
divergence between the four quarter-repeats has increased to 23 out of 58
positions, or 40%. The first three quarter-repeats are somewhat better related,
and a large proportion of the divergence is due to changes in the fourth
quarter-repeat.
Looking within the quarter-repeats, we find that each
consists of two related subunits (one-eighth-repeats), shown as the α and β sequences in Figure 4.24. The α sequences
all have an insertion of a C, and the β sequences
all have an insertion of a trinucleotide, relative to a common consensus
sequence. This suggests that the quarter-repeat originated by the duplication of
a sequence like the consensus sequence, after which changes occurred to generate
the components we now see as α and β. Further changes then took place between
tandemly repeated αβ sequences to generate the individual quarter- and
half-repeats that exist today.
Among the one-eighth-repeats, the present divergence is 19/31 = 61%.
The consensus sequence is analyzed directly in Figure 4.25, which demonstrates that the current satellite
sequence can be treated as a set of derivatives of a 9 bp sequence. We can recognize
three variants of this sequence in the satellite, as indicated at the bottom of
Figure 4.25. If in one of the repeats we take the next
most frequent base at two positions instead of the most frequent, we obtain
three well-related 9 bp sequences.
G A A A A A C G T
G A A A A A T G A
G A A A A A A C T
The origin of the satellite could well lie in an
amplification of one of these three nonamers. The overall consensus sequence of
the present satellite is GAAAAAAGTCT, which is effectively
an amalgam of the three 9 bp repeats.
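How closely the three nonamers are related can be checked directly from the sequences listed above. The sketch below is only an illustration: it counts, column by column, which bases occur, and tallies the pairwise differences. The helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# The three 9 bp variants listed in the text, written without spaces.
NONAMERS = ["GAAAAACGT", "GAAAAATGA", "GAAAAAACT"]

def column_counts(seqs):
    """Base counts at each aligned position."""
    return [Counter(col) for col in zip(*seqs)]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

# Positions (1-based) at which all three nonamers carry the same base.
invariant = [i + 1 for i, c in enumerate(column_counts(NONAMERS)) if len(c) == 1]
pairwise = {(a, b): hamming(a, b) for a, b in combinations(NONAMERS, 2)}

print(invariant)                  # [1, 2, 3, 4, 5, 6]
for pair, diff in pairwise.items():
    print(pair, diff)             # 2, 2 and 3 differences respectively
```

The first six positions (GAAAAA) are shared by all three variants, and all of the variation is confined to the last three positions.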
The average sequence of the monomeric fragment of the mouse
satellite DNA explains its properties. The longest repeating unit of 234 bp is
identified by the restriction cleavage. The unit of reassociation between single
strands of denatured satellite DNA is probably the 117 bp half-repeat, because
the 234 bp fragments can anneal both in register and in half-register (in the
latter case, the first half-repeat of one strand renatures with the second
half-repeat of the other).
So far, we have treated the present satellite as though it
consisted of identical copies of the 234 bp repeating unit. Although this unit
accounts for the majority of the satellite, variants of it are also present.
Some of them are scattered at random throughout the satellite; others are
clustered.
The existence of variants is implied by our description of
the starting material for the sequence analysis as the "monomeric" fragment.
When the satellite is digested by an enzyme that has one cleavage site in the
234 bp sequence, it also generates dimers, trimers, and tetramers relative to
the 234 bp length. They arise when a repeating unit has lost the enzyme cleavage
site as the result of mutation.
The monomeric 234 bp unit is generated when two adjacent
repeats each have the recognition site. A dimer occurs when one unit has lost
the site, a trimer is generated when two adjacent units have lost the site, and
so on. With some restriction enzymes, most of the satellite is cleaved into a
member of this repeating series, as shown in the example of Figure 4.26. The declining number of dimers, trimers, etc.
shows that there is a random distribution of the repeats in which the enzyme's
recognition site has been eliminated by mutation.
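The origin of the ladder can be sketched as follows. The array of site-present flags is a hypothetical example. The geometric formula is not taken from the text: it follows only under the assumption that each repeat loses its site independently with the same probability.

```python
from collections import Counter

def ladder(site_present):
    """Fragment sizes, in repeat units, for a tandem array.

    site_present[i] is True if repeat i retains the cleavage site.
    Only fragments bounded by a site at both ends are counted.
    """
    cut_positions = [i for i, ok in enumerate(site_present) if ok]
    return Counter(b - a for a, b in zip(cut_positions, cut_positions[1:]))

def expected_fraction(n, p_lost):
    """Fraction of fragments that are n-mers if each site is lost
    independently with probability p_lost (an assumption of this sketch)."""
    return (1 - p_lost) * p_lost ** (n - 1)

# Hypothetical array: one isolated loss gives a dimer, two adjacent losses a trimer.
array = [True, True, False, True, True, True, False, False, True, True]
print(ladder(array))   # four monomers, one dimer, one trimer

for n in (1, 2, 3, 4):
    print(n, round(expected_fraction(n, 0.2), 4))
```

Under the independence assumption, each step up the ladder is less abundant than the one below it by a constant factor, which is the declining pattern described above.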
Other restriction enzymes show a different type of behavior
with the satellite DNA. They continue to generate the same series of bands. But
they cleave only a small proportion of the DNA, say 5-10%. This implies that a
certain region of the satellite contains a concentration of the repeating units
with this particular restriction site. Presumably the repeats in this
domain are all derived from an ancestral variant that possessed this recognition
site (although, in the usual way, some members have since lost it by
mutation).
A satellite DNA suffers unequal recombination. This has
additional consequences when there is internal repetition in the repeating unit.
Let us return to our cluster consisting of "ab" repeats. Suppose that the "a"
and "b" components of the repeating unit are themselves sufficiently well
related to pair. Then the two clusters can align in half-register, with
the "a" sequence of one aligned with the "b" sequence of the other. How
frequently this occurs will depend on the closeness of the relationship between
the two halves of the repeating unit. In mouse satellite DNA, reassociation
between the denatured satellite DNA strands in vitro commonly occurs in
the half-register.
When a recombination event occurs out of register, it
changes the length of the repeating units that are involved in the
reaction.
In the upper recombinant cluster, an "ab" unit has been
replaced by an "aab" unit. In the lower cluster, the "ab" unit has been replaced
by a "b" unit.
This type of event explains a feature of the restriction
digest of mouse satellite DNA. Figure 4.26 shows a
fainter series of bands at lengths of ½, 1½, 2½, and 3½ repeating units, in addition to the stronger integral
length repeats. Suppose that in the preceding example, "ab" represents the 234
bp repeat of mouse satellite DNA, generated by cleavage at a site in the "b"
segment. The "a" and "b" segments correspond to the 117 bp half-repeats.
Then in the upper recombinant cluster, the "aab" unit
generates a fragment of 1½ times the usual
repeating length. And in the lower recombinant cluster, the "b" unit generates a
fragment of half of the usual length. (The multiple fragments in the half-repeat
series are generated in the same way as longer fragments in the integral series,
when some repeating units have lost the restriction site by mutation.)
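The bookkeeping behind the half-integral bands can be followed in a toy model. This sketch rests on several simplifying assumptions: each cluster is a string of "a" and "b" half-repeats of 117 bp, the enzyme is taken to cut at the end of every "b" segment, the ends of the cluster are treated as fragment ends, and the break is placed at a segment boundary. The function names are hypothetical.

```python
HALF = 117  # bp per half-repeat, as given in the text

def unequal_crossover(cluster1, cluster2, i):
    """Exchange between two clusters paired one half-repeat out of register.

    Segment i of cluster1 is paired with segment i - 1 of cluster2,
    and the break falls just before that pair.
    """
    rec1 = cluster1[:i] + cluster2[i - 1:]
    rec2 = cluster2[:i - 1] + cluster1[i:]
    return rec1, rec2

def fragments_bp(cluster):
    """Cut after every 'b' segment and return the fragment lengths in bp."""
    sizes, run = [], 0
    for seg in cluster:
        run += 1
        if seg == "b":
            sizes.append(run * HALF)
            run = 0
    if run:  # trailing segments with no site after them
        sizes.append(run * HALF)
    return sizes

parent = "ab" * 4
rec1, rec2 = unequal_crossover(parent, parent, 4)

print(rec1, fragments_bp(rec1))   # ababbabab [234, 234, 117, 234, 234]
print(rec2, fragments_bp(rec2))   # abaabab [234, 351, 234]
```

One product carries a lone "b" unit, which gives a 117 bp fragment (half the usual length). The other carries an "aab" unit, which gives a 351 bp fragment (1½ times the usual length). Which product receives which unit in this toy depends on where the break is placed.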
Turning the argument the other way around, the
identification of the half-repeat series on the gel shows that the 234 bp
repeating unit consists of two half-repeats well enough related to pair
sometimes for recombination. Also visible in Figure 4.26
are some rather faint bands corresponding to ¼-
and ¾-spacings. These will be generated in the
same way as the ½-spacings, when recombination
occurs between clusters aligned in a quarter-register. The decreased
relationship between quarter-repeats compared with half-repeats explains the
reduction in frequency of the ¼- and ¾-bands compared with the ½-bands.