UNDER CONSTRUCTION

DNA quality control standards ratified for the Bennett Lab in Aug 2020:

  1. All in vitro / PCR-generated DNA aside from vector elements (replication origin, resistance marker) must be sequenced after assembling into a plasmid.
  2. Plasmids need not be sequenced if assembled by a non-polymerase-based assembly method (e.g. Golden Gate) from sequenced DNAs (e.g. "parts"). However, diagnostic restriction digests are required, which secondarily verify overall plasmid preparation by examination on a gel.
  3. All plasmids destined for certain use in a publication must be sequenced over elements critical to the claims of the plasmid-encoded functions. This often, but not always, excludes vector elements.

Written by: Shyam Bhakta

Whole-DNA Sequencing

Sequencing of whole plasmids or linear DNAs, generally <25 kb. Requires no primers. Practically insensitive to extreme GC/AT content, structures, or repeats, except for homopolymeric repeats (stretches of a single base), which may be shortened by one base.

  • Plasmidsaurus: $15/sample, results within 2 workdays.
    • 300 ng in 10 µL. Weekly pickup box or FedEx on your own.
    • $15 for ≤25 kb; $50 for 25–125 kb; $100 for 125–300 kb.
  • Primordium Labs: $19/sample, overnight results.

Dideoxynucleotide Chain Terminator "Sanger" Sequencing

Sequencing of 500–1000 ng of plasmid or linear DNA over the ≈800 bp region downstream of a primer that is either provided or selected from the service's list.
Invented in 1977, Sanger sequencing was still the most common sequencing method as of 2022.

Sanger sequencing reactions are generally reliable from 25 bp after the primer up to 800 bp, though good reactions can produce 1000 bp of good read. The first ≈25 bp downstream of the primer is generally of too poor quality to use, as is the sequence toward the end of the read. To cover a larger sequence, primers for multiple reactions must therefore be designed so that the ends of the sequencing reads overlap (convergent orientation of reads) and/or the reads from downstream primers overlap the reads from upstream primers (tandem orientation of reads). Quality scores are assigned to each nucleotide read of the chromatogram, so programs that read sequencing traces (.ab1 files) automatically grey out the low-quality regions and highlight mismatches and insertions/deletions. The .ab1 files can be downloaded as part of the sequencing results and uploaded to Benchling for alignment to the template.
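The overlap requirement above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, positions are top-strand coordinates of each primer's 3′ end, and every read is assumed to be usable from 25 to 800 nt past that 3′ end, as described above.

```python
def read_interval(primer_3p_end, strand, lead=25, read_len=800):
    """Usable read span (start, end) in top-strand coordinates.

    Assumes the read is usable from `lead` to `read_len` nt past the
    primer's 3' end; both values are adjustable assumptions.
    """
    if strand == "+":  # forward primer reads toward higher coordinates
        return (primer_3p_end + lead, primer_3p_end + read_len)
    # reverse primer reads toward lower coordinates
    return (primer_3p_end - read_len, primer_3p_end - lead)


def uncovered_gaps(target_start, target_end, primers, lead=25, read_len=800):
    """Return the parts of the target that no usable read covers."""
    intervals = sorted(read_interval(p, s, lead, read_len) for p, s in primers)
    gaps = []
    cursor = target_start  # first target position not yet covered
    for start, end in intervals:
        if cursor > target_end:
            break
        if start > cursor:
            gaps.append((cursor, min(start - 1, target_end)))
        cursor = max(cursor, end + 1)
    if cursor <= target_end:
        gaps.append((cursor, target_end))
    return gaps


# Hypothetical example: target at 100..1900, one upstream forward primer,
# one tandem forward primer, and one convergent reverse primer.
primers = [(50, "+"), (800, "+"), (1950, "-")]
print(uncovered_gaps(100, 1900, primers))            # no gaps expected
print(uncovered_gaps(100, 1900, [primers[0], primers[2]]))  # middle is uncovered
```

Dropping the middle primer in the second call leaves the centre of the target without coverage, which is the situation the tandem and convergent read layouts are meant to prevent.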

See Lab Orientation page for instructions on Sanger sequencing.

Mechanism: A provided primer binds the sample DNA and is extended by a DNA polymerase in a reaction that contains a small amount of fluorophore-labeled 2′,3′-dideoxynucleotide triphosphates (ddNTPs) in addition to the normal dNTPs. Incorporation of a ddNTP by the polymerase terminates the chain: without a 3′ hydroxyl on the final ddNTP, the polymerase cannot extend the replicated DNA chain further. Because ddNTP incorporation is random (but still complementary to the template DNA), the extension reaction produces replicated chains of every length, each terminating in a complementary ddNTP. Since each of the four ddNTPs has a unique fluorophore, fluorometric capillary electrophoresis of the reaction mixture produces a sequence of fluorescence (a chromatogram) that corresponds to the sequence of the primer extension product from smallest to largest (closest to the primer to farthest), which is the sequence of the template DNA.
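A toy simulation may make the chain-termination logic concrete. It is a deliberately simplified sketch, not a model of real chemistry: the function name, the 10% per-position termination chance, and the example template are all assumptions made for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_read(template, n_molecules=20000, ddntp_fraction=0.1, seed=0):
    """Simulate chain termination and read the products by size.

    `template` lists the template bases downstream of the primer in the
    order the polymerase copies them. Each extension step has a
    `ddntp_fraction` chance (an assumed value) of incorporating a labeled
    ddNTP, which ends that molecule.
    """
    rng = random.Random(seed)
    label_by_length = {}  # product length -> base of the terminal ddNTP
    for _ in range(n_molecules):
        for length, template_base in enumerate(template, start=1):
            if rng.random() < ddntp_fraction:
                label_by_length[length] = COMPLEMENT[template_base]
                break  # no 3' hydroxyl, so the chain cannot be extended
    # "Electrophoresis": order products from shortest to longest and
    # read the fluorophore on each one.
    return "".join(label_by_length[n] for n in sorted(label_by_length))


print(sanger_read("TACGGATCCA"))
```

With enough molecules, every product length is represented, so the labels read from shortest to longest spell out the extension product, which is complementary to the strand being copied.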

Benchling Primer3Plus Sanger Sequencing Primer Design

Benchling can run Primer3Plus for automated design of primers according to many parameters. Shyam spent a few hours fiddling with the parameters to get it to produce primers like the ones he would design manually to optimally Sanger sequence the middle of long parts. These parameter calculations give sequencing primers with similar spacing and read overlap to meticulously manually designed primers: they split the target sequencing region into ≈equal portions covered by individual sequencing reads that overlap sufficiently not to leave any sequencing gaps. Primer3 has the added benefit of template specificity checking. It will generate pairs of divergent forward and reverse primers, but often you'll need only one of each pair to cover the entire target span with optimally spaced reads. (Shyam Bhakta)

* Below, this 850 nt value represents your idea of a reliable good sequencing read (after the ≈25 nt "junk" lead between the primer 3′ end and the beginning of good sequencing trace). This value can be adjusted to the sequencing read length you're comfortable with, but be sure to adjust it everywhere else it is used here, marked with an *.

  • Select the target sequencing span; right click on the sequence (not the features), and in the menu click Run Primer3
  • Task: Sequencing
  • Tm Parameters:
    • Algorithm:
      • SantaLucia 1999 with default params if the primers are for Sanger sequencing or for a colony PCR with Taq polymerase. Use Q5 params instead if they will be used with Q5.
      • Modified Breslauer 1986 if using Q5 or Phusion for a colony PCR
    • Click Set to Primer3 Defaults: DNA: 50 nM; Na⁺/K⁺: 50 mM; Mg²⁺: 1.5 mM; dNTP: 0.6 mM
  • Region:
    • Target: Start x to End y, spanning the desired sequencing interval.

      These target indices need adjustment if the 5′ and 3′ ends will be within the coverage of existing sequencing primers you have (e.g. AB17/AB18 in the vector/connectors). Instead of simply omitting ~800 bp of the ends, it works better to include them in the equal partitioning of Sanger reads. To do this, select the full target interval (including flanking Golden Gate sites, where applicable), and click Use Selection. Calculate #Results R and Spacing S as below. Add S to the Start index and subtract S from the End index:
      x = left boundary index + S
      y = right boundary index – S
      Before you generate primers, you can reduce #Results by 2, or leave it alone and see an optional primer pair between the last two sequencing spans. 
  • Primer:
    These parameters can likely be adjusted to your liking without issue. 
    • GC%: min 30% – opt 50% – max 70%
    • Tm: min 53° – opt 56° – max 63°
    • Size: min 17 nt – opt 20 nt – max 25 nt
    • 3′ GC Clamp: 1
  • Result Generation:
    • # Results: R = ⌈L ÷ 850*⌉.
      #Results = (target length L) ÷ (850* nt reliable read length), rounded up, never down.
      Target length L = End index – Start index of the target sequence, or just look at the length of the selection.
  • Sequencing:
    • Spacing:  S = L ÷ R
      Spacing S = target seq length ÷ #Results R. Normally between 575–900 nt. This evenly distributes the number of ideal primer sites across the target length.
    • Interval: 40 nt.
      If you need both primers in a primer pair that the results give you and they are too close to use (3′ ends <50 bp apart, reads may not overlap), then after saving other selected primer pairs, rerun Primer3 with Interval set to 50 or 60 nt to spread apart that primer pair.
    • Lead: 0 nt.
    • Accuracy: 20 nt
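The #Results and Spacing arithmetic above can be sketched in Python. The function name and the example indices are hypothetical, the indices are assumed to be 1-based and inclusive, and rounding S to a whole nucleotide is an assumption, since the parameters above do not say how to round it.

```python
import math


def primer3_sequencing_settings(left, right, read_len=850, trim_ends=False):
    """Compute #Results R, Spacing S, and the target Start/End indices.

    left, right: boundary indices of the full target interval
                 (assumed 1-based and inclusive).
    read_len:    the reliable read length marked with * above.
    trim_ends:   True if existing primers already cover both ends, in
                 which case x = left + S and y = right - S.
    """
    length = right - left + 1
    n_results = math.ceil(length / read_len)  # always round up
    spacing = round(length / n_results)       # rounding is an assumption
    if trim_ends:
        start, end = left + spacing, right - spacing
    else:
        start, end = left, right
    return {
        "results": n_results,
        "spacing": spacing,
        "target_start": start,
        "target_end": end,
    }


# Hypothetical 3000 nt target spanning 1001..4000, ends already covered.
settings = primer3_sequencing_settings(1001, 4000, trim_ends=True)
print(settings)
```

For this 3000 nt example, R = ⌈3000 ÷ 850⌉ = 4 and S = 3000 ÷ 4 = 750 nt, so the trimmed target runs from 1751 to 3250. As noted above, #Results can then be reduced by 2 before generating primers.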

Next-Generation Sequencing (NGS)

[Explanation of NGS? Probably never.]
A paper that serves as a primer on outsourced NGS for the beginner: https://pubs.acs.org/doi/full/10.1021/acssynbio.1c00592
The focus on a protein variant library here can be generalized to any library.

Feb 2022 advice from Kshitij Rai (Caleb Bashor Lab).

You seem to have encountered the "intermediate depth" problem that I stumbled upon back at the end of 2020. The issue is that companies purchase HiSeq and NextSeq kits (that give you upwards of 100 million reads) to be able to offer a discounted price on those, since they run them at scale. Genewiz has a really nice setup on the MiniSeq that lets them offer 50,000 reads for $50, with amazingly no price difference for 150 bp – 500 bp amplicons (even though those require the cheapest (75-paired end) and the most expensive (250 paired end) read kits for the MiniSeq). However, no commercial services seemingly offer read depths between 1–20 million, which would need to run on a MiSeq. MiSeqs, however, are cheap enough machines (cost ~$99k) for labs to be able to purchase them for their own needs. That said, I looked around Houston to find out what the turnaround times and read depths offered from all the major suppliers were, and here are your options:
  1. Commercial - As already mentioned, most commercial companies don't have a service at this scale. There are a few options however that I could find from a two-week sprint of calling companies and explaining what I wanted to do and why "just doing 800 million reads" was not something I wanted to commit to. Below are tables with the prices and turnaround times for all the companies around Houston that I could find. Notably, Genewiz offers a 20% discount on all NGS services for first time users, so I'd make use of that and submit from a lab member's account who has never sent any NGS samples to Genewiz before, if you choose to go this route.

  2. In house - Having found no "cheap and fast" options above, I ultimately made the switch and decided to sequence in house. I buy the sequencing kits and flow cells from Illumina directly, and then sequence with a lab that has a MiSeq machine (which most PIs are okay with since it's your own sample and kits, just their machine running it). Lots of things to note if you do go this route, particularly that all the QC will have to be done in your own hands as well, including making sure you have a single band, the band has the complete Illumina adapter on it, the concentrations are spot on perfect, and you have a decent amount of ΦX174 DNA spiked in to create diversity in your run (I'd be happy to talk more about this, and help design the experiment and libraries if you do choose to do this). Gang Bao's lab has a MiSeq machine, David Zhang's lab had one (though I don't know where it is now since he left), and a number of labs at the med center also have them. Again, most of them would hopefully be happy to let you hop on to their machine when it's free. The options at this scale depend on the kit you buy, but here are some prices for a 300 cycle (150 paired end reads) kit:
    • Nano kit v2 - 300 cycles, 2 million reads, for $390
    • Micro kit v2 - 300 cycles, 5-6 million reads, $540
    • Regular kit v2 - 300 cycles, 15 million reads, $1100
    • Regular kit v3 - 500 cycles, 25 million reads, $1200

  3. Shared runs - If you are apprehensive about diving in and buying a full kit and flow cell in house, you could jump in on someone else's run if they have space. I usually sequence on the MiSeq every 2-3 weeks, and usually have space leftover on my flow cell (cases where I only REALLY need like 7 million reads, but am running a 15 million read kit v2 and taking all the reads because why not). I do have a kit v2 in lab right now, and may be able to incorporate your samples in the next run, so let me know if you are interested in doing this and we can talk logistics.

  4. Hack - My personal favorite hack is to leverage high scale read setups that commercial companies and cores have for different purposes (usually genome sequencing, or single cell RNA seq/bulk RNA seq). They don't understand targeted sequencing of a defined region with variability in the middle, as is the case with most syn bio projects. However, since companies run these services at scale, they give you really good prices (Genewiz charges ~$800 for 700 million reads). All you need to do is to get on the phone with these companies and tell them that you will be doing the library prep and that they just need to run it on their sequencer and give you the FastQ file outputs for your "genome" sample, and then prep and give them your sample. These services are very, very cheap, and open up a lot more options. Baylor's sequencing core is one I would really recommend for this purpose. I believe they offer 200 million reads for $350 or something ridiculous like that, and will give you results in ~1 week.
     

    Low - Medium Depth (2-10 million Reads)
    Sr. No. | Provider/Company            | Read Depth | Cost  | Comments
    1       | Genewiz                     | 15 million | $1344 | Quote requested
    2       | Wyzer Biosciences           | 4 million  | $1120 | Single end reads
    3       | Beijing Genomics            | NA         | NA    | Do not offer a service at this depth
    4       | MD Anderson Sequencing Core | 15 million | $1708 |
    5       | DNA Link Sequencing Lab     | 50 million | $249  | Paired end reads, 100 bp max
    6       | LGC Biosciences             | 50 million | $1826 |
    7       | ABM                         | NA         | NA    | Did not respond to calls or emails

     

    High Depth (200-400 million Reads)
    Sr. No. | Provider/Company            | Read Depth    | Cost  | Comments
    1       | Genewiz                     | 350 million   | $1440 | Additional 30% off on first NGS Run
    2       | Beijing Genomics            | 300 million   | $700  | Best price
    3       | LGC Biosciences             | 200 million   | $5843 | Quote requested
    4       | MD Anderson Sequencing Core | > 100 million | $3238 | Costs only; $1632 for MD Anderson faculty
    5       | DNA Link                    |               |       | Quote requested
    6       | ABM                         | NA            | NA    | Did not respond to calls or emails
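
To compare the options above on equal footing, the quoted prices can be converted to cost per million reads. This is a rough sketch that uses only a few of the figures quoted above; the function name is hypothetical, and the lower bound of 5 million reads is assumed for the Micro kit, which is quoted at 5-6 million.

```python
def cost_per_million(price_usd, million_reads):
    """Price in USD divided by read depth in millions of reads."""
    return price_usd / million_reads


# (price in USD, read depth in millions), copied from the figures above.
options = {
    "MiSeq Nano kit v2": (390, 2),
    "MiSeq Micro kit v2": (540, 5),  # quoted as 5-6 million; lower bound assumed
    "MiSeq Regular kit v2": (1100, 15),
    "MiSeq Regular kit v3": (1200, 25),
    "Genewiz, 15 million": (1344, 15),
    "Beijing Genomics, 300 million": (700, 300),
}

# Rank from cheapest to most expensive per million reads.
ranked = sorted(options, key=lambda name: cost_per_million(*options[name]))
for name in ranked:
    price, reads = options[name]
    print(f"{name}: ${cost_per_million(price, reads):.2f} per million reads")
```

Price per read is only part of the decision: the high-depth services are far cheaper per read but commit you to much more depth than an intermediate-depth project needs, which is the trade-off described in the advice above.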