One of the most important factors in
successful automated DNA
sequencing is proper
primer design. This document describes the steps involved in this
process and the
major pitfalls to avoid.
**** Use a Computer to Design Primers ****
We highly
recommend that a computer be used during primer design in order to
check for certain fatal design flaws. Numerous programs are capable of
performing this analysis. We generally use 'Oligo' (National
Biosciences, Inc, Plymouth MN), a program for the Macintosh that has
produced excellent results in our hands. Two other programs you might
consider are MacVector (Kodak/IBI) and the GCG suite of sequence
analysis programs, but many others are available as well.
Some Basic
Concepts: If you are confused by the strands and primer orientation,
read this.
Sequencing primers must be able to anneal to the target DNA in
a
predictable location
and on a predictable strand. They furthermore must be capable of
extension by Taq
DNA Polymerase.
Some people are confused about how to examine a DNA sequence
to choose
an appropriate
primer sequence. Here are a few things for novices to remember:
- Sequences are always written from 5'
to 3'. This includes the sequence of your
template DNA (if known), the sequence of the vector DNA into which it
is inserted,
and the sequence of proposed primers. Don't ever write a primer
sequence
reversed or you will only confuse yourself and others.
- Polymerase always extends the 3' end
of the primer, and the sequence you will
read will be the same strand (sense or anti-sense) as the primer
itself.
- Thus, if you choose a primer sequence
that you can read in your source sequence (for
example, in the vector), the sequence you will obtain will extend
from the
primer's right (3') end.
- Conversely, if you choose a primer from the strand opposite
to what your 'source'
sequence reads, the resulting sequence will read towards the left.
Here are a couple of
examples:
Suppose you have a vector with the following sequence around the
Multiple Cloning Site
(the 'MCS'):
TTAGCTACTGCTTGATGCTAGTACTACATCTAGTGCTAGATGGATCCGAATTCGCTGATGCTCATATGTTAATAAAGAC
                                         ^     ^
                                         |     |
                                       BamHI EcoRI
If you cloned your DNA of interest between the BamHI and
EcoRI sites,
you could sequence using
the primer 'CTTGATGCTAGTACTACATC' (remember - that's written 5' to 3')
and you'll obtain the following sequence from the Core:
TAGTGCTAGATG[your-insert-'top'-strand-Bam-to-Eco]AATTCGCTGATGC...(etc.)
What if you wanted sequence from the other strand - Eco
to Bam -
instead? In that case,
you need to select some sequence on the right and then reverse-complement
it before requesting the oligo. Picking out some sequence from the
figure above:
CTGATGCTCATATGTTAATA
This is NOT the primer sequence - it is copied verbatim from the above
sequence. In fact,
if you used this sequence for a primer, sequencing would proceed towards
the right,
away from your insert. Instead, reverse-complement that sequence:
TATTAACATATGAGCATCAG
NOW this should produce sequence of the opposite strand:
CGAATT[your-insert-'bottom'-strand-Eco-to-Bam]CATCTAGCACTA...(etc.)
Some fine print: Only rarely does sequencing actually
show the
nucleotides immediately
downstream from the primer. I've taken some didactic license in the
examples above.
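The two examples above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `reverse_complement` and the variable names are illustrative, and the code assumes sequences made up of the four unambiguous bases A, C, G and T.

```python
# Complement table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

# Vector sequence around the MCS, copied from the example above.
vector = ("TTAGCTACTGCTTGATGCTAGTACTACATCTAGTGCTAGATGGATCCGAATTC"
          "GCTGATGCTCATATGTTAATAAAGAC")

# Forward primer: copied directly from the vector, so sequence reads to the right.
forward_primer = "CTTGATGCTAGTACTACATC"
assert forward_primer in vector

# Reverse primer: pick sequence to the right of the insert, then
# reverse-complement it so that extension proceeds to the left.
picked = "CTGATGCTCATATGTTAATA"
assert picked in vector
reverse_primer = reverse_complement(picked)
print(reverse_primer)  # TATTAACATATGAGCATCAG
```

A quick sanity check on any reverse primer is that its reverse complement, not the primer itself, can be found in the source sequence as written.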
More Advanced Concepts: How to
Design a Primer that Works.
Generally you are starting with some small amount of known
sequence
that you wish
to extend. Here's how to proceed:
- I. Design primers only from accurate
sequence data.
- Automated sequencing (and in fact any sequencing) has a
finite probability of producing errors. Sequence obtained too far away
from the primer must be considered questionable. To determine what is
'too far', we strongly suggest that our clients read the memo 'Interpretation
of Sequencing Chromatograms', which describes how to assess the
validity of data obtained from the ABI sequencers. Select a region for
primer placement where the possibility of sequence error is low.
- II. Restrict your search to regions
that best reflect your goals.
- You may be interested in maximizing the sequence data
obtained, or you may only need to examine the sequence at a very
specific location in the template. Such needs dictate very different
primer placements.
- Maximize sequence obtained while minimizing the
potential for errors:
Generally, you should design the primer as far toward the
3' end of the known sequence as you can manage, so long as you have
confidence in the accuracy of
the sequence from which the primer is drawn. Primers on opposite
strands should be placed in staggered fashion as much as possible.
- Targeted sequencing of a specific region:
Position the primer so the desired sequence falls in
the most accurate region of the chromatogram. Sequence data is often
most accurate about 80-150 nucleotides away from the primer. Do not
count on seeing good sequence less than 50 nucleotides away from the
primer or more than 300 nt away (although we often get sequence
starting immediately after the primer, and we often return 700 nt of
accurate sequence).
- III. Locate candidate primers:
- Identify potential sequencing primers that produce stable
base pairing with the template DNA under conditions appropriate for
cycle sequencing. It is strongly suggested that you use a
computer at this step.
Suggested primer characteristics:
- Length should be between 18 and 30 nt, with optimal
being 20-25 nt (although we have had some successes with primers
longer than 30 and shorter than 18).
- G-C content of 40-60% is desirable.
- The Tm should be between 55 °C and 75 °C. Warning: the
old "4 degrees for each G-C, 2 degrees for each A-T" rule works poorly,
especially for oligos shorter than 20 or longer than 25 nt. Instead,
try:
Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 675/length - 0.65*(%formamide) - (%mismatch)
where [Na+] is the molar sodium ion concentration and length is the primer length in nucleotides.
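The suggested characteristics above can be checked with a short script. The following is a rough sketch rather than a replacement for a primer design program: the function names are illustrative, the logarithm is taken as base 10 with the sodium concentration in mol/L, and the 0.05 M value used in the example is an assumption chosen only for illustration.

```python
import math

def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)

def melting_temp(primer: str, na_molar: float,
                 formamide_pct: float = 0.0, mismatch_pct: float = 0.0) -> float:
    """Tm in degrees C from the formula in the text (log taken as base 10)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(primer)
            - 675.0 / len(primer)
            - 0.65 * formamide_pct
            - mismatch_pct)

def check_primer(primer: str, na_molar: float) -> list:
    """Return a warning for each suggested characteristic the primer misses."""
    warnings = []
    if not 18 <= len(primer) <= 30:
        warnings.append("length outside 18-30 nt")
    if not 40.0 <= gc_percent(primer) <= 60.0:
        warnings.append("G-C content outside 40-60%")
    if not 55.0 <= melting_temp(primer, na_molar) <= 75.0:
        warnings.append("Tm outside 55-75 C")
    return warnings

primer = "CTTGATGCTAGTACTACATC"
NA_MOLAR = 0.05  # assumed salt concentration, for illustration only
print(f"GC = {gc_percent(primer):.1f}%, Tm = {melting_temp(primer, NA_MOLAR):.1f} C")
print(check_primer(primer, NA_MOLAR))
```

The calculated Tm depends strongly on the salt concentration entered, so use the value that matches your own reaction conditions and treat the result as a guide only.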
- IV. Discard candidate primers that
show undesirable self-hybridization.
- Primers that can self-hybridize will be unavailable for
hybridization to the template. Generally avoid primers that can form 4
or more consecutive bonds with themselves, or 8 or more bonds total.
Example of a marginally problematic primer:
       5'-ACGATTCATCGGACAAAGC-3'
           ||||  ||||
3'-CGAAACAGGCTACTTAGCA-5'
This oligo forms a substantially stable dimer with
itself, with four consecutive bonds at two places and a total of eight
inter-strand bonds.
Primers with 3' ends hybridizing even transiently will
become extended due to polymerase action, thus ruining the primer and
generating false bands. Be somewhat more stringent in avoiding 3'
dimers. For example, the following primer self-dimerizes with a perfect
3' hybridization on itself:
5'-CGATAGTGGGATCTAGATCCC-3'
          ||||||||||||||
       3'-CCCTAGATCTAGGGTGATAGC-5'
The above oligo is pretty bad, and almost guaranteed to
cause problems. Note that the polymerase will extend the 3' end during
the sequencing reaction, giving very strong sequence ACTATCG. These
bands will appear at the start of your 'real' data as immense peaks,
occluding the correct sequence. Most primer design programs will
correctly spot such self-dimerizing primers, and will warn you to avoid
them.
Note however that no computer program or rule-of-thumb
assessment can accurately predict either success or failure of a
primer. A primer that seems marginal may perform well, while another
that appears to be flawless may not work at all. Avoid obvious
problems, design the best primers you can, but in a pinch if you have
few options, just try a few candidate primers, regardless of potential
flaws.
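For readers without access to a primer design program, the self-hybridization check described in this step can be roughly approximated in Python. This is a simplified sketch under stated assumptions: it counts only Watson-Crick pairs between two copies of the primer, ignores hairpins and thermodynamics, and uses illustrative function names. The thresholds of 4 consecutive bonds and 8 total bonds are the ones suggested above.

```python
_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

def dimer_at_offset(primer: str, offset: int):
    """Count bonds when primer[i] lies opposite the reversed copy at i + offset.

    Returns (total bonds, longest consecutive run, 3' end base bonded?).
    """
    top = primer.upper()
    bottom = top[::-1]  # second copy of the primer, written 3' to 5'
    bonded = [0 <= i + offset < len(top) and (base, bottom[i + offset]) in _PAIRS
              for i, base in enumerate(top)]
    longest = run = 0
    for is_bonded in bonded:
        run = run + 1 if is_bonded else 0
        longest = max(longest, run)
    return sum(bonded), longest, bonded[-1]

def self_dimer_report(primer: str, min_run: int = 4, min_total: int = 8):
    """List every alignment that reaches either suggested threshold."""
    n = len(primer)
    hits = []
    for offset in range(-(n - 1), n):
        total, longest, three_prime = dimer_at_offset(primer, offset)
        if longest >= min_run or total >= min_total:
            hits.append((offset, total, longest, three_prime))
    return hits

# The marginal primer from the text, in the alignment drawn above.
print(dimer_at_offset("ACGATTCATCGGACAAAGC", 7))      # (8, 4, False)
# The primer with the perfect 3' self-hybridization.
print(dimer_at_offset("CGATAGTGGGATCTAGATCCC", -7))   # (14, 14, True)
```

Alignments in which the last value is `True` involve the 3' end of the primer and deserve the more stringent treatment described above.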
- V. Verify the site-specificity of
the primer.
- Perform a sequence homology search (e.g. dot-plot homology
comparison) through all known template sequence to check for
alternative priming sites. Discard any primers that display
'significant' tendency to bind to such sites. We can provide only rough
guidelines as to what is 'significant'. Avoid primers where alternative
sites are present with (1) more than 90% homology to the primary site
or (2) more than 7 consecutive homologous nucleotides at the 3' end or
(3) abundance greater than 5-fold higher than the intended priming
site.
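Guideline (2) above can be screened for with a simple exact-match search. The sketch below is an assumption-laden simplification, not a homology search: it looks on both strands of the known template for exact copies of the last 8 nt of the primer (that is, more than 7 consecutive nucleotides at the 3' end). It does not address the 90% homology or abundance criteria, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def three_prime_sites(primer: str, template: str, tail_len: int = 8):
    """Find exact matches to the primer's 3' tail on either template strand.

    Returns (strand, 0-based start) pairs; "+" is the template as written,
    "-" is its reverse complement.
    """
    tail = primer.upper()[-tail_len:]
    hits = []
    for strand, seq in (("+", template.upper()),
                        ("-", reverse_complement(template))):
        start = seq.find(tail)
        while start != -1:
            hits.append((strand, start))
            start = seq.find(tail, start + 1)  # also catches overlapping matches
    return hits

# Vector sequence and forward primer from the earlier example.
vector = ("TTAGCTACTGCTTGATGCTAGTACTACATCTAGTGCTAGATGGATCCGAATTC"
          "GCTGATGCTCATATGTTAATAAAGAC")
sites = three_prime_sites("CTTGATGCTAGTACTACATC", vector)
print(sites)  # [('+', 22)]
if len(sites) > 1:
    print("Warning: possible alternative priming sites")
```

A single hit is the intended priming site; any additional hit is a candidate alternative site that merits a closer look.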
- VI. Choosing among candidate
primers.
- If at this point you have several candidate primers, you
might select one or a few that are more A-T rich at the 3' end. These
tend to be slightly more specific in action, according to some
investigators. You may want to use more than one primer, maximizing the
likelihood of success.
If you have no candidates that survived the criteria
above, then you may be forced to relax the stringency of the selection
requirements. Ultimately, the test of a good primer is only in its use,
and cannot be accurately predicted by these simplistic rules-of-thumb.