The goal of this laboratory exercise is to work through the steps used to clone a gene of interest into an expression vector. Cloning refers to the process by which a gene of interest is inserted into a vector. The result is a plasmid that enables transcription of the gene of interest. The ability to control expression of a gene is very useful in molecular biology research. For example, in this exercise you will clone dCas9 into an expression vector. dCas9 is a key component of a CRISPR-based system used to inhibit gene expression (1). In other experiments, cloning is used to purify a protein of interest or to introduce a gene into an organism. Here we will focus on the steps used in cloning rather than the applications of cloned constructs.
Enzyme-based reactions drive the cellular processes that are critical for life (2). These enzymes are useful tools in research applications such as cloning (3). In more classical approaches to cloning, DNA polymerase is used to amplify a specific sequence from a template. DNA polymerase catalyzes the synthesis of DNA by incorporating nucleoside triphosphates such that the new DNA strand is complementary to the template strand (4,5). Each incoming nucleotide is added to the 3' end of the growing DNA strand. Through this process, DNA polymerase generates an amplification product, or copy, of the template. The amplification product is the insert that will be cloned into an expression vector. For this, the insert and the vector are prepared using restriction enzymes.
Restriction enzymes cut DNA by breaking the phosphodiester bonds that form the sugar-phosphate backbone of a strand of DNA (6,7). The restriction enzymes typically used in cloning function as homodimers composed of two identical subunits. Each subunit of the homodimer has an active site that is capable of cleaving DNA. This enables restriction enzymes to cleave both strands of DNA in a double-stranded recognition sequence. By digesting the insert and the vector, compatible ends are generated that promote association between the two DNA molecules.
Lastly, ligase joins the compatible ends by forming a covalent phosphodiester bond between the 3' hydroxyl end of the 'acceptor' DNA strand and the 5' phosphate end of the 'donor' DNA strand (8). To do this, adenosine monophosphate (AMP) is added to a lysine residue within the active site of DNA ligase, which releases a pyrophosphate. The AMP is then transferred to the 5' phosphate of the donor nucleotide, resulting in the formation of a pyrophosphate bond. Finally, a phosphodiester bond is formed between the 5' phosphate of the donor nucleotide and the 3' hydroxyl of the acceptor nucleotide, and the AMP is released.
Schematic showing the steps used to clone an insert into an expression vector.
First, the insert is amplified to generate multiple copies that contain restriction enzyme sites compatible with the expression vector. Next, the insert and the vector are digested with restriction enzymes to create compatible ends. Last, the compatible ends of the digested insert and digested vector are ligated together to construct a circular plasmid.
Because cloning at-the-bench can take days, if not weeks, you will practice the steps in silico today using Benchling, an online laboratory notebook program with DNA manipulation tools. To download Benchling use this link!
Step 1: Amplification
The goal of DNA amplification is to generate numerous copies of a specific DNA sequence called the template. To accomplish this, the polymerase chain reaction (PCR) is used. PCR is a three-step process: denaturing, annealing, and extending (9). To amplify DNA, the original DNA segment, or template DNA, is denatured using heat. This separates the DNA strands and allows the primers to anneal to the template. DNA polymerases require short initiating pieces of DNA called primers to build, or copy, DNA. In PCR amplification, forward and reverse primers bind either side of the template on the non-coding and coding strands of DNA, respectively. The polymerase then extends from each primer to copy the template DNA. After 30 cycles of PCR, there can be as many as a billion copies of the template. In addition to the template itself, PCR requires only three components: primers that bind the DNA on either side of the template, dNTPs that are used to make the copies of the template, and a heat-stable polymerase that builds the copies from the template.
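To see where the "billion copies" figure comes from, the short sketch below assumes an idealized PCR in which every molecule is copied in every cycle, so the copy number doubles each time. The function name and the `efficiency` parameter are illustrative assumptions and are not part of the exercise; real reactions fall short of perfect doubling.

```python
def pcr_copies(starting_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Theoretical copy number after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling in every cycle (an idealization);
    lower values model a reaction in which only a fraction of molecules is copied.
    """
    return starting_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (1, 10, 20, 30):
        print(f"{n:>2} cycles: {pcr_copies(1, n):,.0f} copies")
```

Starting from a single template molecule, 30 ideal cycles give 2^30 copies, which is roughly 1.07 billion.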
Primer design is an important part of PCR amplification. Primers that are too short may lack specificity for the template sequence and amplify the wrong sequence. Longer primers bind more stably due to increased hydrogen bonding between the primer and the template. However, longer primers are more likely to form secondary structures such as hairpins, which prevent the primer from binding to the template. Other important features include G/C content and placement of bases within the primer sequence. Having a G or C base at the 3' end of a primer increases priming efficiency, due to the stronger binding of a G/C pair compared to an A/T pair. For G/C content, the ideal is 50 +/- 10%, because long stretches of G/C or A/T bases are both difficult to copy. The G/C content also affects the melting temperature (Tm) of the primer. The Tm guides the choice of temperature for the annealing step of PCR, which is typically set a few degrees below the Tm.
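The primer features described above can be checked with a few lines of code. The sketch below is only a rough illustration: the example primer is a made-up sequence, the function names are hypothetical, and the Tm uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is an approximation for short oligos. Benchling's own Tm calculation will generally report different values.

```python
def gc_content(primer: str) -> float:
    """Percentage of bases in the primer that are G or C."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in degrees C using the Wallace rule (an approximation)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def has_gc_clamp(primer: str) -> bool:
    """True if the 3' end (last base) of the primer is a G or a C."""
    return primer.upper()[-1] in "GC"


if __name__ == "__main__":
    example = "ATGCGTACCTGAGTCAAGGC"  # hypothetical 20-base primer, not from dCas9
    print("length:", len(example))
    print("GC content (%):", gc_content(example))
    print("GC within 50 +/- 10%:", 40.0 <= gc_content(example) <= 60.0)
    print("Wallace Tm (C):", wallace_tm(example))
    print("G/C at 3' end:", has_gc_clamp(example))
```

A check like this does not detect hairpins or primer dimers, so a secondary-structure check is still needed.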
To amplify a gene of interest from a template, you first need to design primers -- one primer that anneals at the start of the template and a second primer that anneals at the end of the template. Today you will design a 'forward' primer that anneals to the non-coding DNA strand and reads into the front of the gene and a 'reverse' primer that anneals to the coding DNA strand at the end of the gene and reads into the back of the gene. Each primer will consist of two parts: the 'landing sequence' will anneal to the template and the 'flap sequence' will be used to add a restriction enzyme recognition sequence to your insert.
- Find the insert sequence here.
- Open Benchling. On the left panel, click the '+' sign and choose DNA Sequence > Input Raw Sequence.
- Type "insert" for the name, "linear" for the topology, choose a folder in which to save the sequence file, and click "Create."
- Copy and paste the sequence from the .docx file linked above.
- Record the size of the dCas9 template in your notebook.
- Because we want to amplify the entire gene, the landing sequence of the forward primer will begin with the first base of the sequence.
- Record the first 20 bases of the dCas9 gene sequence in your notebook.
- We will use Benchling to assess the characteristics of your primer:
- Highlight the primer sequence in Benchling, right click, and select "Forward primer".
- Leave the default settings for now, and click on "Check secondary structure".
- Use the following guidelines to evaluate your primer:
- length: 17-28 bases
- GC Content: 40-60%
- Tm: 60-65 °C
- Check for hairpins, complementation between primers, and repetitive sequences (you can click on "All structures" to look at possible homodimers and hairpins).
- If your primer does not fit the guidelines provided above, try altering the length. Remember that the 5’ end of the landing sequence must not change or you will delete basepairs from your gene.
- When you are satisfied with the landing sequence make an annotation labeled "landing sequence" according to the following instructions.
- Highlight the landing sequence you decided on
- Click Create → Annotation then complete the requested information in the Annotations window.
- Click Save Annotation.
- Now that you have your landing sequence you will add a flap sequence that introduces a restriction enzyme recognition sequence.
- As shown in the schematic of the cloning strategy, you need to add a BglII recognition sequence to your forward primer. Search the NEB list to find the BglII recognition sequence. Record the recognition sequence and the cleavage sites within the sequence.
- Add the recognition sequence for the BglII restriction enzyme to the landing sequence. Consider the direction in which PCR amplification occurs to determine which end of your primer should carry the flap sequence.
- In addition to the recognition sequence, it is important to include a 6 base 'tail' or 'junk' sequence to ensure the restriction enzyme is able to bind and cleave the DNA. Learn more about why this is necessary from NEB. Add any sequence of 6 bases to your primer flap sequence. Carefully consider where this sequence should appear in your primer!
- Record the sequence (5' → 3') of your forward primer in your notebook.
- Use steps 2-5 to design your reverse primer. Please keep the following notes in mind:
- Because you want to amplify the entire gene you should start with the last base of the sequence.
- You will add an XhoI restriction recognition site to your reverse primer.
- Remember that the reverse primer anneals to the coding DNA strand at the end of the gene and reads back into it. Keep this in mind when you add the flap sequence and when you record the sequence (5' → 3') of your primer in your notebook.
- Highlight the entire sequence (of dCas9) and click Copy.
- In the window that opens, click on the Sequence box.
- Paste the sequence into a new Sequence Map window ('+' Create → DNA Sequence → Input Raw Sequence) that depicts the amplification product you would expect if you used your primers in PCR.
- Save this sequence as 'dCas9 PCR product'.
- Be sure to include the restriction enzyme recognition sequences and junk sequences that are added by your primers.
- What is the size of your PCR product? How does this compare to the size of the template you recorded in Step #1?
- Now that you have your amplified dCas9 PCR product, you need to digest with BglII and XhoI to generate 'sticky ends' that will enable you to ligate your insert into the vector.
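The primer design steps above can be sketched in code as a way to check your work. This is a minimal sketch under several assumptions: the template is a short made-up sequence rather than dCas9, the 6-base tail `GCGCGC` is an arbitrary choice, the landing length is fixed at 20 bases, and the function names are hypothetical. The recognition sequences used are AGATCT for BglII and CTCGAG for XhoI.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

BGLII_SITE = "AGATCT"
XHOI_SITE = "CTCGAG"
TAIL = "GCGCGC"  # arbitrary 6-base 'junk' sequence (assumption)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template: str, landing_len: int = 20, tail: str = TAIL):
    """Build forward and reverse primers, each written 5' to 3'.

    Each primer is tail + recognition site + landing sequence, so the flap
    sits at the 5' end and the landing sequence keeps its 3' end free for
    extension by the polymerase.
    """
    forward = tail + BGLII_SITE + template[:landing_len]
    reverse = tail + XHOI_SITE + reverse_complement(template[-landing_len:])
    return forward, reverse


def pcr_product(template: str, forward: str, reverse: str, landing_len: int = 20) -> str:
    """Top strand of the expected PCR product, including both flaps."""
    forward_flap = forward[:-landing_len]
    reverse_flap = reverse[:-landing_len]
    return forward_flap + template + reverse_complement(reverse_flap)


if __name__ == "__main__":
    template = "ATG" + "GCA" * 20 + "TAA"  # hypothetical 66-base 'gene'
    forward, reverse = design_primers(template)
    product = pcr_product(template, forward, reverse)
    print("forward primer:", forward)
    print("reverse primer:", reverse)
    print("template size:", len(template), "product size:", len(product))
```

With a 6-base tail and a 6-base recognition sequence on each primer, the product in this sketch is 24 bases longer than the template.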
Step 2: Digestion
Restriction enzymes cut DNA at specific sequences of bases referred to as recognition sequences or restriction sites. These sequences are usually four or six base pairs long and palindromic, that is, they read the same 5’ to 3’ on the top and bottom strand of DNA. The DNA ends that result from restriction enzyme digests are either 'sticky', meaning that the DNA ends are not the same length and a single-stranded DNA overhang is generated after digestion, or 'blunt', meaning that the DNA ends are the same length after digestion. In cloning, sticky ends are typically preferred because they improve the efficiency of the ligation reaction.
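A recognition sequence is palindromic in this sense when it equals its own reverse complement. The brief sketch below tests that property for the three enzymes used in this exercise; the helper names are hypothetical, and the last sequence is a made-up non-palindromic example included for contrast.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    sites = {
        "BglII": "AGATCT",
        "BamHI": "GGATCC",
        "XhoI": "CTCGAG",
        "made-up example": "GAATTA",  # not a real enzyme site (assumption)
    }
    for name, site in sites.items():
        print(name, site, "palindromic:", is_palindromic(site))
```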
To prepare for the ligation step, you will generate compatible sticky ends on the insert and vector. The compatibility of the ends is determined by which enzymes are selected. Specifically, the bases that make the single-stranded DNA overhangs on the insert must be complementary, or compatible, with the single-stranded DNA overhangs on the vector. This is why specific enzyme sites were added to the insert during amplification.
Above, you amplified your insert. Here you will digest the insert and the vector to create compatible sticky ends that can be ligated together.
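To make the idea of compatible ends concrete, the sketch below derives the single-stranded overhang left by each enzyme and compares them. It assumes the commonly listed cut positions (A^GATCT for BglII, G^GATCC for BamHI, C^TCGAG for XhoI); it is worth confirming these against the NEB list. The dictionary and function names are hypothetical, and the overhang logic only applies to palindromic sites that are cut before their midpoint.

```python
# name -> (recognition sequence, number of bases before the top-strand cut)
ENZYMES = {
    "BglII": ("AGATCT", 1),
    "BamHI": ("GGATCC", 1),
    "XhoI": ("CTCGAG", 1),
}


def overhang(enzyme: str) -> str:
    """5' overhang left by a palindromic site cut symmetrically on both strands."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Ends are treated as compatible when both enzymes leave the same overhang.

    This works here because each overhang is itself palindromic, so identical
    overhangs are also complementary to one another.
    """
    return overhang(enzyme_a) == overhang(enzyme_b)


if __name__ == "__main__":
    for name in ENZYMES:
        print(name, "overhang:", overhang(name))
    print("BglII + BamHI compatible:", compatible("BglII", "BamHI"))
    print("BglII + XhoI compatible:", compatible("BglII", "XhoI"))
```

This is why an insert cut with BglII can be paired with a vector cut with BamHI, while the XhoI end on each molecule pairs only with the other XhoI end and so sets the orientation of the insert.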
- To digest the insert, click on the scissor icon at the far right of your screen.
- A New Digest window will open.
- Enter BglII into the Find Enzyme box.
- Click on BglII and enter XhoI into the Find Enzyme box. Both BglII and XhoI (and no other enzyme) should be listed under the Selected header.
- Click Run Digest.
- What is the size of your digest product? How does this compare to the size of your PCR product?
- Save your insert digest!
- Find the vector sequence here.
- Copy and paste the vector sequence into a new Sequence Map window and save this sequence as 'expression vector'.
- Be sure to select Circular from the Topology dropdown.
- Cloning vectors are engineered to contain a Multiple Cloning Site (MCS). The MCS is a short segment of DNA that encodes several restriction enzyme recognition sites. These sites are provided so that researchers can clone their genes of interest into a specific location of the vector.
- Using the Annotation function (ribbon symbol on right side), label basepairs 734 to 783 as the MCS.
- Label basepairs 722-733 as ribosomal binding site (RBS).
- Label basepairs 1077 to 1622 as p15A ori. This is the origin of replication.
- Label basepairs 1851 to 2510 as chloramphenicol acetyltransferase (camR). This gene provides chloramphenicol resistance.
- At the top panel, click Plasmid to see a visual representation of your vector map.
- To 'digest' your vector for cloning, click on the scissor icon at the far right of your screen.
- A New Digest window will open.
- Enter BamHI into the Find Enzyme box.
- Click on BamHI and enter XhoI into the Find Enzyme box. Both BamHI and XhoI (and no other enzyme) should be listed under the Selected header.
- Click Run Digest.
- Of the two fragments generated in your digest, which is the vector backbone that you will use for cloning? Which is the removed MCS?
- Save your vector digest!
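When a circular plasmid is cut at two sites, it yields two linear fragments whose sizes add up to the size of the plasmid. The sketch below shows that arithmetic; the plasmid length and cut positions are hypothetical placeholders, not the values for the expression vector used here, and the function name is an assumption.

```python
def circular_fragment_sizes(plasmid_length: int, cut_positions: list[int]) -> list[int]:
    """Fragment sizes (largest first) after cutting a circular plasmid.

    cut_positions are positions along the plasmid where a cut occurs.
    The final fragment wraps around the origin of the sequence numbering.
    """
    cuts = sorted(cut_positions)
    sizes = [later - earlier for earlier, later in zip(cuts, cuts[1:])]
    sizes.append(plasmid_length - cuts[-1] + cuts[0])
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    # Hypothetical 3,000 bp plasmid cut at positions 740 and 780
    sizes = circular_fragment_sizes(3000, [740, 780])
    print("fragment sizes (bp):", sizes)
    print("sum equals plasmid length:", sum(sizes) == 3000)
```

In a double digest within the MCS, the large fragment is the backbone and the small fragment is the piece of the MCS that lies between the two cut sites.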
Step 3: Ligation
The efficiency of the ligation reaction is related to the type of DNA ends: compatible sticky ends will ligate more efficiently than blunt ends, and non-compatible sticky ends (sticky ends that do not share complementary DNA bases) will not be ligated due to the lack of hydrogen bonding between the bases. To initiate the ligation reaction, hydrogen bonds are formed between the compatible overhangs of the DNA ends. The ligase enzyme then forms a covalent phosphodiester bond that links the insert and vector.
When you complete a ligation at-the-bench, it is important to calculate the amounts of insert and vector you will use in the reaction. This improves the chances that the correct cloning product is generated in the reaction. Ideally, you should use a 3:1 molar ratio of insert to vector. Before you complete the in-silico ligation, calculate the amount of insert and vector that should be used in this reaction.
Use the following information to calculate the volume of insert and vector needed to prepare a ligation with a 3:1 molar ratio (insert:vector).
- Concentration of insert = 25 ng/uL
- Concentration of expression vector solution = 50 ng/uL
- Molecular weight of a basepair = 660 g/mol
- Sizes, in basepairs, of the insert and vector sequences (this was determined in the exercises above!)
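One possible way to set up this calculation is sketched below. The concentrations, the 660 g/mol per basepair figure, and the 3:1 ratio come from the list above; the vector size (3,000 bp), insert size (1,500 bp), and vector mass (50 ng) are hypothetical placeholders that you should replace with your own values. The function names are illustrative assumptions.

```python
BP_MOLECULAR_WEIGHT = 660.0  # g/mol per basepair of double-stranded DNA


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * BP_MOLECULAR_WEIGHT)
    return moles * 1e12


def pmol_to_ng(amount_pmol: float, length_bp: int) -> float:
    """Convert picomoles of double-stranded DNA to a mass in ng."""
    grams = amount_pmol * 1e-12 * length_bp * BP_MOLECULAR_WEIGHT
    return grams * 1e9


def ligation_volumes(vector_ng: float, vector_bp: int, insert_bp: int,
                     vector_conc: float = 50.0, insert_conc: float = 25.0,
                     molar_ratio: float = 3.0) -> dict:
    """Volumes (uL) of vector and insert for a given insert:vector molar ratio.

    Concentrations are in ng/uL.
    """
    vector_pmol = ng_to_pmol(vector_ng, vector_bp)
    insert_pmol = molar_ratio * vector_pmol
    insert_ng = pmol_to_ng(insert_pmol, insert_bp)
    return {
        "vector_pmol": vector_pmol,
        "insert_pmol": insert_pmol,
        "insert_ng": insert_ng,
        "vector_uL": vector_ng / vector_conc,
        "insert_uL": insert_ng / insert_conc,
    }


if __name__ == "__main__":
    # Placeholder sizes and mass: substitute the values from your own notebook
    result = ligation_volumes(vector_ng=50.0, vector_bp=3000, insert_bp=1500)
    for key, value in result.items():
        print(f"{key}: {value:.4f}")
```

With these placeholder numbers, 50 ng of vector corresponds to 1 µL of vector solution, and a 3:1 molar ratio calls for 75 ng of insert, or 3 µL of insert solution.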
Though there are different strategies that can be used to complete the ligation calculations, it may be easier to break the math into the following steps:
- Determine the volume of vector that will be used in the ligation reaction.
- Typically, it is best to use 50-100 ng of vector.
- Calculate the moles of vector.
- Calculate the moles of insert.
- Remember, this number should be 3-fold more than the moles of vector to accomplish a 3:1 molar ratio.
- Calculate the volume of insert that contains the appropriate moles of insert.
- Be sure to record all of your work for the ligation calculations in your notebook.
- Feel free to take a picture of your hand-written work and embed the image in your notebook.
- Next you will complete this ligation in silico to generate a plasmid map of your pdCas9 plasmid.
- To ligate your dCas9 PCR insert into the expression vector, be sure the vector sequence is in the Sequence Map window.
- Click the Clock icon on the far right to open the History window.
- Under the header Clone Version to New DNA enter pdCas9 in the Name box.
- Select Clone Version.
- Open the dCas9 PCR insert in the Sequence Map window.
- Click the Gear icon at the top of the window and be sure that Cut Sites is checked.
- Select the BglII label on the sequence, then hold the Shift key and select the XhoI label. The insert sequence should be highlighted.
- Click Copy and, in the window that opens, click on the Sequence box.
- Go back to the pdCas9 sequence and select the BamHI and XhoI labels as in Step #9.
- Use the keystroke Command + V (for Mac) or Control + V (for PC) to 'ligate' the dCas9 PCR insert into the expression vector, thereby generating pdCas9.
- Annotate the dCas9 PCR insert within the ligation product as above.
- Use the pdCas9 plasmid map to answer the following questions.
- What is the size (in bp) of your ligation product?
- Does your sequence still contain a BamHI recognition sequence? A BglII recognition sequence? Explain.
- Does your sequence still contain an XhoI recognition sequence? Explain.
Using CRISPRi to increase ethanol yield in E. coli MG1655 (coming soon!)
- Students design sgRNA sequences that are specific to gene targets involved in E. coli anaerobic fermentative metabolism to increase ethanol production
- Teaches concepts related to gene expression, cell culturing, and enzymatic assays
- Provides a scaffold in which students can design and test a self-driven research question
- Larson et al. "CRISPR interference (CRISPRi) for sequence-specific control of gene expression." Nature Protocols. (2013) 8:2180-2196. PMID: 24136345.
- Cooper and Sutherland. "The cell: a molecular approach. 2nd edition." (2000) Oxford University Press. PMID: NBK9921.
- Old and Primrose. "Principles of gene manipulation: an introduction to genetic engineering. 5th edition." (2000) Oxford: Blackwell Scientific. ISBN: 0-632-03712-1.
- Arber and Linn. "DNA modification and restriction." Annual Review of Biochemistry. (1969) 38:467–500. PMID: 4897066.
- Boyer. "DNA restriction and modification mechanisms in bacteria." Annual Review of Microbiology. (1971) 25:153-176. PMID: 4949033.
- Lehman et al. "Enzymatic synthesis of deoxyribonucleic acid. I. Preparation of substrates and partial purification of an enzyme from Escherichia coli." Journal of Biological Chemistry. (1958) 233:163-170. PMID: 13563462.
- Schachman et al. "Enzymatic synthesis of deoxyribonucleic acid. VII. Synthesis of a polymer of deoxyadenylate and deoxythymidylate." Journal of Biological Chemistry. (1960) 235:3242-3249. PMID: 13747134.
- Lehman. "DNA ligase: structure, mechanism, and function." Science. (1974) 186:790-797. PMID: 4377758.
- Saiki et al. "Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase." Science. (1988) 239:487-491. PMID: 2448875.