Difference between revisions of "20.109(S21):M3D1"

Latest revision as of 17:57, 27 April 2021

20.109(S21): Laboratory Fundamentals of Biological Engineering

Introduction

Today you will familiarize yourself with the recombinant protein inverse pericam (IPC) and its constituent parts. The fluorescent component of IPC is an enhanced yellow fluorescent protein (abbreviated EYFP), one of the many derivatives of green fluorescent protein (GFP). GFP is naturally produced by jellyfish and was cloned into other organisms in the early 1990s. It has since been exploited as a genetically encodable reporter and mutagenized to vary its excitation and emission spectra. The other key component of inverse pericam is the protein calmodulin (CaM), a natural calcium sensor that is present in all eukaryotes. Calmodulin has many ligands that it binds only in the presence of calcium ion, including the peptide fragment M13. This conditional specificity for M13 binding is enabled by the change in conformation of CaM when bound to calcium.

Schematic of IPC structure and activity. (A) The EYFP gene within IPC is mutated such that the C and N termini are re-organized then flanked by M13 and CaM. (B) In the absence of Ca²⁺, IPC fluoresces yellow and in the presence of Ca²⁺ fluorescence is quenched.

Within inverse pericam, M13 and CaM are located at opposite ends, surrounding a permuted (i.e., rearranged) version of EYFP. In the absence of calcium, this EYFP exhibits strong fluorescence. However, when enough calcium is added to a solution of inverse pericam, CaM and M13 interact, disrupting the conformation and, as a result, the fluorescence of EYFP. The transition from bright to dim fluorescence occurs over a particular concentration range of calcium. The calcium concentration at which half of the sensor is bound to calcium (in practice, the midpoint of the decrease in fluorescence) is referred to as the K_d and is determined by the affinity of CaM for calcium. In addition, the interaction between CaM and calcium is impacted by cooperativity. CaM has four calcium binding sites. In cooperativity, the affinity of CaM for calcium is altered by how many calcium ions are already bound to the protein. The mutations you will examine today were designed in an effort to modify the calcium sensor portion of IPC in a manner that is likely to change the affinity and / or cooperativity for calcium ions.
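To make the roles of affinity and cooperativity concrete, the sketch below models the fluorescence of a calcium-quenched sensor with the Hill equation, a common simplified description of cooperative binding. It is an illustration only and not the analysis used in this module: the function names, the default fluorescence limits, and the example values of K_d and the Hill coefficient n are all assumptions chosen for demonstration.

```python
def fraction_bound(ca, kd, n=1.0):
    """Hill equation: fraction of sensor in the calcium-bound state.

    ca and kd must be in the same concentration units. n is the Hill
    coefficient (n > 1 indicates positive cooperativity).
    """
    return ca**n / (kd**n + ca**n)


def relative_fluorescence(ca, kd, n=1.0, f_min=0.0, f_max=1.0):
    """Fluorescence of an 'inverse' sensor that dims as calcium binds.

    f_max is the calcium-free signal and f_min the fully bound signal
    (both hypothetical, normalized values).
    """
    return f_max - (f_max - f_min) * fraction_bound(ca, kd, n)


if __name__ == "__main__":
    kd = 1.0  # hypothetical K_d, arbitrary concentration units
    for n in (1.0, 2.0, 4.0):  # hypothetical Hill coefficients
        row = [round(relative_fluorescence(ca, kd, n), 3)
               for ca in (0.1, 0.5, 1.0, 2.0, 10.0)]
        print(f"n = {n}: {row}")
```

In this model, changing K_d shifts the calcium concentration at which the signal is half-quenched, whereas changing n alters how steep the bright-to-dim transition is around that midpoint.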

To examine the modifications that were made to IPC, we will use several protein analysis tools. Proteins are modular materials that may be described and examined at multiple levels of a structural hierarchy (from primary to quaternary in the classical paradigm). Primary structure refers to a protein’s amino acid sequence, which might reveal a cluster of charged residues or a pattern of alternating polar and nonpolar residues. Because of the rotational flexibility of bonds and the non-covalent interactions between non-adjacent amino acids (as well as covalent disulfide bonds), one cannot predict the conformation of a protein offhand from its linear sequence alone; however, some structural characteristics can be inferred. Because many proteins have structural motifs in common (e.g., alpha helices and beta sheets at the secondary level, or leucine-rich repeats at the tertiary level), which ultimately arise from the amino acid sequences, databases can be useful for making predictions about proteins with known amino acid sequences but unknown structures.

Protocols

Part 1: Review IPC reference article

Schematics of pericam variants. Representations of Ca²⁺-sensitive reporter constructs. In this module, your research will explore the properties of the inverse pericam construct (boxed in red). Image modified from Figure 1 of Nagai et al. (2001) Proc. Natl. Acad. Sci. 98:3197.

Previous 109ers generated the data you will analyze in this module by changing specific amino acids in the IPC protein sequence. The goal of this directed mutagenesis approach was to alter the interaction between IPC and calcium such that the affinity / cooperativity was improved. Before we examine the effect of these mutations on the activity of IPC, we will first review how the original IPC sensor was constructed.

With your partner, review the information regarding the development of calcium sensor variants in the paper by Nagai et al. (attached here). Specifically, you will focus on the development and activity of flash pericam, ratiometric pericam, and inverse pericam. The domain structures for these variants are shown in the schematic to the right.

In your laboratory notebook, complete the following:

What is cpGFP? How was this molecule constructed? Why might this molecule be a more useful tool than GFP?
What is cpYFP? How does it differ from cpGFP?
What is calmodulin? What is M13?
- Review the references cited by the authors or use other resources to provide a brief description of each.
What are the critical mutations that were identified in flash pericam, ratiometric pericam, and inverse pericam? How do the authors propose that these critical mutations are involved in the activity of the calcium sensor variants?
Why is it important / useful to construct a calcium sensor? Why is it useful / important to construct multiple calcium sensors with differing functionality / activity?

Part 2: Identify IPC sequence features

Open the Word document with the IPC sequence (linked here).
- Open SnapGene. From the options, select 'New DNA File...'.
- Copy and paste the sequence from the .docx file above.
- Enter "IPC" for the File Name (in the lower, right corner), select 'linear' for the topology (in the lower, left corner), then click 'OK'.
Label the features listed below.
- M13 peptide: 1-78 bp
- EYFP (C-terminus portion): 91-372 bp
- EYFP (N-terminus portion): 400-831 bp
- CaM: 838-1281 bp
- Linker sequences: 82-90, 373-399, and 832-837 bp
- Refer to the figure shown above, which depicts the IPC construct in schematic form, to assist in your understanding of how the different components of IPC are connected.
To understand how the mutations in the IPC variants might affect calcium binding, it is helpful to identify features within the IPC sequence that are important to the functionality of the protein. For this, it is best to examine the translated protein sequence rather than the gene sequence.
- The amino acids are shown in the SnapGene sequence window below the DNA bases for the coding regions.
To assist in the identification of key features in the IPC sequence, review the information provided in the paper by Zhang et al. (linked here). In particular, carefully read the following: Abstract, Introduction, and the "Linker and loop flexibility" section in the Results.
In your IPC SnapGene file, label the amino acid residues that comprise the calcium-binding loops in the CaM region of IPC.
- If you get stuck, use the fact that the CaM within inverse pericam is an E103Q mutant, that is, the 103rd residue of calmodulin is Q, to keep yourself oriented.
Consider other regions of CaM that might be important for calcium binding and label in your IPC SnapGene file.
- Perhaps the "Loss of hydrophobic cavities" section in the Results will provide interesting potential targets.
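When moving between the gene sequence and the translated protein sequence, it can help to convert the base-pair coordinates of each feature into residue numbers. The sketch below does this for the feature coordinates listed above. It rests on two assumptions that you should verify in your SnapGene file: that the reading frame starts at bp 1, and that residue 1 of CaM corresponds to the first codon of the CaM feature. The linker names and function names are hypothetical labels added for illustration.

```python
# Feature coordinates (1-based, inclusive bp) as listed in the protocol.
# "Linker 1/2/3" are illustrative names for the three linker ranges.
FEATURES = [
    ("M13 peptide", 1, 78),
    ("Linker 1", 82, 90),
    ("EYFP (C-terminus portion)", 91, 372),
    ("Linker 2", 373, 399),
    ("EYFP (N-terminus portion)", 400, 831),
    ("Linker 3", 832, 837),
    ("CaM", 838, 1281),
]


def bp_to_residues(start_bp, end_bp):
    """Convert a bp range to a residue range, assuming the frame starts at bp 1."""
    if (start_bp - 1) % 3 != 0 or end_bp % 3 != 0:
        raise ValueError("range does not start and end on codon boundaries")
    return (start_bp - 1) // 3 + 1, end_bp // 3


def unannotated_gaps(features):
    """Return bp ranges that fall between features and carry no annotation."""
    gaps = []
    prev_end = 0
    for _name, start, end in sorted(features, key=lambda f: f[1]):
        if start > prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        prev_end = max(prev_end, end)
    return gaps


def cam_residue_to_ipc(cam_residue, cam_start_bp=838):
    """Map a CaM residue number to its residue number within full-length IPC.

    Assumes CaM residue 1 is encoded by the first codon of the CaM feature.
    """
    first_residue, _ = bp_to_residues(cam_start_bp, cam_start_bp + 2)
    return first_residue + cam_residue - 1


if __name__ == "__main__":
    for name, start, end in FEATURES:
        first, last = bp_to_residues(start, end)
        print(f"{name}: bp {start}-{end} -> residues {first}-{last}")
    print("Unannotated bp ranges:", unannotated_gaps(FEATURES))
    print("CaM residue 103 -> IPC residue", cam_residue_to_ipc(103))
```

If the numbering assumption holds, the residue at the position reported for CaM residue 103 should read Q in the SnapGene translation; if it does not, shift the offset until it does before labeling the binding loops.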

In your laboratory notebook, complete the following:

Upload the labeled IPC gene sequence and the labeled IPC protein sequence.
Examine the four calcium binding loops. Do these loops share any common features? Do any of these loops contain unique features?
What additional regions did you mark as interesting? Why?
Suggest a mutation that you think might impact the activity of IPC (be specific, what amino acid will replace what amino acid?). Do you hypothesize that this mutation will increase or decrease the affinity of calcium binding? Why? Do you hypothesize that this mutation will increase or decrease the cooperativity of calcium binding? Why?

Part 3: Examine IPC structural elements

In the previous section you reviewed primary scientific literature to locate important features in the IPC sequence. Now you will examine 3D representations of CaM to visualize those features more closely.

You will examine the structure of CaM using the Protein Data Bank (PDB) (linked here). In this online database, the structures are organized according to PDB identification codes.
For this exercise, you will look at the calcium-bound form of CaM.
- Enter "1CLL" into the search box at the top right corner of the PDB homepage.
The landing page for the CaM structure includes background information on the source and reference for this protein structure.
In your laboratory notebook, complete the following:
- What method was used to solve this protein structure? Perform a quick search to learn more about this method and provide a brief description.
- At what resolution was the structure solved? Perform a quick search to learn more about this concept and provide a brief description.
- What is the total weight of the structure?
- How many chains are included in this structure?
- Read the abstract for the reference article wherein this structure was first published. What are the features of the calcium-binding domains (lobes) as described by the authors?
Under the structure shown on the left side of the window, click the 'Structure' link. A page showing the 'cartoon' structure of CaM will load. Using the tools to the right of this page you will be able to more closely examine the structure.
First, let's orient ourselves on how to move / manipulate the protein structure.
- Place your cursor over the structure and while pressing down on your mouse / track pad, move the image to view the protein structure from different angles.
- To zoom-in on an area of the protein structure, place your cursor on the area of interest and double-click. When zoomed in, single-click on a residue to get a more detailed view of the amino acids that are present in that area. The dotted lines represent non-covalent interactions, such as hydrogen bonds or salt bridges, between atoms of the amino acids.
- To zoom-out, single-click on the white space in the viewer window.
- To zoom-in or -out more gradually, use two fingers and drag in the up or down direction.
- To identify which amino acid residues are present in each position of the protein, hover your cursor over the protein. A box will appear in the lower right of the viewer window (see example to the right). Though most of the details here can be ignored, the information provided tells you that the highlighted residue is a valine (Val) at position 35 in the amino acid sequence.
In your laboratory notebook, complete the following:
- What secondary structures are present in CaM?
- Compare the description of the features within the lobes provided by the authors to the protein structure. Screen capture a zoomed in view of one of the lobes and label the features.
Next, let's consider the tools provided in the panel on the right of the page.
The contents of the 'Components' tab are listed: Polymer, Ligand, Water, and Ion.
- Polymer refers to the larger structures present, such as protein chains, DNA, or RNA.
- Ligand refers to any non-polymer structure, such as ligand binders, ATP, or co-factors that are not single atoms.
- Water refers to water.
- Ion refers to any lone elements that are associated with the structure.
- Use the 'eyeball' icon to the right of the component labels to remove / add the components to the image.
In your laboratory notebook, complete the following:
- Does the CaM structure contain the Components listed? Answer yes or no for each Component type.
- What type of Component is calcium? Include screen shots of a binding loop with and without calcium present.
Click on the 'Density' tab. Though we will not focus much on the details here, the electron density map is the actual data from the X-ray crystallography experiment used to solve the structure.
- Select '2Fo-Fc σ' from the options.
- Click the box to the right of 'Wireframe' such that this feature is activated (toggle to '✓ On').
- Click on a residue within the protein structure. This will zoom-in on that area and also layer a grid, or cage, over the area. The cage represents the electron density data that were captured via X-ray crystallography. The structural features and atoms within the CaM protein were modeled to match the density map, thus providing a best estimate of the protein structure. The resolution is related to how tightly this cage fits the solved structure. Though a gross oversimplification, the relationship can be described as follows: the smaller the angstrom value achieved via crystallography, the better the resolution and thus the tighter the fit of the cage to the solved structure.
Lastly, let's look at how calcium interacts with CaM!
Move the protein structure such that you are able to achieve a clear view of the calcium ion in the first binding loop and double-click on one of the residues in the loop to zoom-in.
- Hint: hover over the amino acid residues to identify the N-terminus based on the residue numbers provided in the box at the lower right of the viewer window.
Single-click on the calcium ion to visualize how it associates with the residues in the loop.
- It may be easier to view the bonds by removing the density information from the structure. To do this, click the 'eyeball' icon to the right of each of the options listed in the 'Density' tab. Alternatively, you can exit the viewer window and re-enter to return the default setting to '☓ Off'.
In your laboratory notebook, complete the following:
- List the amino acids (from N- to C-terminus) that are in the binding loop.
- List the amino acids (from N- to C-terminus) that are shown to interact with calcium in the binding loop. How many bonds are formed with each of the listed amino acids?
- Provide the above information for each of the binding loops present in the CaM structure.
It may also be interesting to consider how the amino acids that are not directly bound to calcium interact as this is important to maintaining the structural integrity of the binding loop.
Identify the isoleucine (Ile) residue at position 27, then single-click to show the relevant binding information.
- To identify which amino acid residues are bound to Ile 27, hover your cursor over the dotted lines. A box will appear in the lower right of the viewer window (see example to the right). As before, most of the details here can be ignored; the information provided tells you that the highlighted bond is a hydrogen bond between the oxygen (O) of Ile 63 and the nitrogen (N) of Ile 27.
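The contacts displayed in the viewer can also be approximated directly from atomic coordinates by listing the atoms that lie within a distance cutoff of the ion. The sketch below illustrates that idea only; it is not how the PDB viewer decides which interactions to draw. The 3.0 Å cutoff is an assumed value, and the atom labels and coordinates in the example are made up for demonstration rather than taken from 1CLL.

```python
import math


def contacts_within(ion_xyz, atoms, cutoff=3.0):
    """List atoms within `cutoff` angstroms of an ion, nearest first.

    atoms is an iterable of (label, (x, y, z)) pairs. The default cutoff
    is an assumption for illustration, not a value from the viewer.
    """
    hits = []
    for label, xyz in atoms:
        distance = math.dist(ion_xyz, xyz)  # Euclidean distance (Python 3.8+)
        if distance <= cutoff:
            hits.append((label, round(distance, 2)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Made-up coordinates (in angstroms) for a toy binding site.
    ion = (0.0, 0.0, 0.0)
    toy_atoms = [
        ("residue A, side-chain O", (2.4, 0.0, 0.0)),
        ("residue B, backbone O", (0.0, 2.3, 0.0)),
        ("residue C, side-chain C", (4.0, 3.0, 0.0)),
    ]
    for label, distance in contacts_within(ion, toy_atoms):
        print(f"{label}: {distance} A")
```

Counting the hits per residue gives a tally comparable to the number of bonds you record for each amino acid, although the totals may differ from the viewer if it applies different criteria.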

In your laboratory notebook, complete the following:

Based on what you learned from the protein structure, revisit the questions answered when examining the sequence for CaM:
- What additional regions might be interesting targets for mutagenesis? Why?
- What additional mutations do you think might impact the activity of IPC (be specific, what amino acid will replace what amino acid?). Do you hypothesize that this mutation will increase or decrease the affinity of calcium binding? Why? Do you hypothesize that this mutation will increase or decrease the cooperativity of calcium binding? Why?

Navigation links

Next day: Identify IPC mutations

Previous day: Complete CETSA experiment and analyze data

@@ Line 5: / Line 5: @@
 ==Introduction==
-Though the theme of Module 3 is protein engineering, today will focus on a few key techniques used in DNA engineering.  Because the sequence of proteins is determined by the sequence of the genes that encode them, learning how to manipulate DNA is an important first step.  Today you will review the cloning steps used to generate a protein expression vector that contains the gene that encodes inverse pericam (IPC), a calcium sensing protein. To generate pRSET-IPC three common DNA engineering techniques were used: PCR amplification, restriction enzyme digestion, and ligation.
+Today you will familiarize yourself with the recombinant protein inverse pericam (IPC) and its constituent parts. The fluorescent component of IPC is an enhanced yellow fluorescent protein (abbreviated EYFP), one of the many derivatives of green fluorescent protein (GFP). GFP is naturally produced by jellyfish and was cloned into other organisms in the early 1990s. It has since been exploited as a genetically encodable reporter and mutagenized to vary its excitation and emission spectra. The other key component of inverse pericam is the protein calmodulin (CaM), a natural calcium sensor that is present in all eukaryotes.  Calmodulin has many ligands that it binds only in the presence of calcium ion, including the peptide fragment M13. This conditional specificity for M13 binding is enabled by the change in conformation of CaM when bound to calcium.
-[[Image:Sp16 M1D1 cloning schematic.png|thumb|center|450px|'''Schematic of pRSET_IPC cloning strategy.'''  First, the IPC insert was PCR amplified to generate multiple copies of the insert that are flanked by restriction enzymes sites.  Next, the amplified insert and the pRSET expression vector were restriction enzyme digested to create compatible ends.  Last, the compatible ends of the digested insert and vector were ligated together to generate pRSET_IPC.]]
+[[Image:Sp16 M1D2 inverse pericam diagram.png|thumb|right|450 px|'''Schematic of IPC structure and activity.''' (A) The EYFP gene within IPC is mutated such that the C and N termini are re-organized then flanked by M13 and CaM. (B) In the absence of Ca<sup>2+</sup>, IPC fluoresces yellow and in the presence of Ca<sup>2+</sup> fluorescence is quenched.]]
-'''Polymerase chain reaction (PCR)'''
+Within inverse pericam, M13 and CaM are located at opposite ends, surrounding a permuted (i.e., rearranged) version  of EYFP. In the absence of calcium, this EYFP exhibits strong fluorescence. However, when enough calcium is added to a solution of inverse pericam, CaM and M13 interact, disrupting the conformation and, as a result, the fluorescence of EYFP. The transition from bright to dim fluorescence occurs over a particular concentration range of calcium. The calcium concentration at which half of the sensor is bound to calcium (in practice, the midpoint of the decrease in fluorescence) is referred to as the ''K<sub>d</sub>'' and is determined by the affinity of CaM for calcium.  In addition, the interaction between CaM and calcium is impacted by cooperativity.  CaM has four calcium binding sites.  In cooperativity, the affinity of CaM for calcium is altered by how many calcium ions are already bound to the protein.  The mutations you will examine today were designed in an effort to modify the calcium sensor portion of IPC in a manner that is likely to change the affinity and / or cooperativity for calcium ions.
-The applications of PCR are widespread, from forensics to molecular biology to evolution, but the goal of any PCR is the same: to generate many copies of DNA from a single or a few specific sequence(s) (called the “template”). In addition to the template, PCR requires only three components: primers to bind sequence flanking the target, dNTPs to polymerize, and a heat-stable polymerase to catalyze the synthesis reaction over and over and over.  DNA polymerases require short initiating pieces of DNA called primers to copy DNA. In PCR amplification, forward and reverse primers that target the non-coding and coding strands of DNA, respectively, are separated by a distance equal to the length of the DNA to be copied. To amplify DNA, the original DNA segment, or template DNA, is denatured using heat.  This separates the strands and allows the primers to anneal to the template.  Then polymerase extends from the primer to copy the template DNA.  How many cycles of PCR are required to achieve the desired double-stranded amplification product?
+To examine the modifications that were made to IPC, we will use several protein analysis tools. Proteins are modular materials that may be described and examined at multiple levels of a structural hierarchy (from primary to quaternary in the classical paradigm). Primary structure refers to a protein’s amino acid sequence, which might reveal a cluster of charged residues or a pattern of alternating polar and nonpolar residues. Because of the rotational flexibility of bonds and the non-covalent interactions between non-adjacent amino acids (as well as covalent disulfide bonds), one cannot predict the conformation of a protein offhand from its linear sequence alone; however, some structural characteristics can be inferred.  Because many proteins have structural motifs in common (e.g., alpha helices and beta sheets at the secondary level, or leucine-rich repeats at the tertiary level), which ultimately arise from the amino acid sequences, databases can be useful for making predictions about proteins with known amino acid sequences but unknown structures.
-[[Image:Fa20 M3D2 PCR schematic.png|center|650px|thumb|'''Schematic of PCR amplification.''' PCR amplification results from multiple (typically ~30) cycles of three steps: denaturation, annealing, and extension.]]
-To amplify a specific sequence of DNA, you first need to design primers -- one primer that anneals at the start of the sequence of interest (the 5' end) and a second primer that anneals at the end of the sequence of interest (the 3' end). The primer that anneals at the start of the sequence is referred to as the 'forward' primer.  The forward primer anneals to the non-coding DNA strand and reads toward, or into, the gene of interest.  The 'reverse' primer anneals to the coding DNA strand at the end of the sequence and reads back into the sequence. Primers can also be useful in adding sequence to sequences upon amplification via the polymerase chain reaction. Several features are important to consider when designing primers for PCR. Primers that are too short may lack requisite specificity for the desired sequence, and thus amplify an unrelated sequence. The longer a primer is, the more favorable are its energetics for annealing to the template DNA, due to increased hydrogen bonding. On the other hand, longer primers are more likely to form secondary structures such as hairpins, leading to inefficient template priming. Two other important features are G/C content and placement. Having a G or C base at the end of each primer increases priming efficiency, due to the greater energy of a GC pair compared to an AT pair. The latter decrease the stability of the primer-template complex. Overall G/C content should ideally be 50 +/- 10%, because long stretches of G/C or A/T bases are both difficult to copy. The G/C content also affects the melting temperature. PCR is a three-step process (denature, anneal, extend) and these steps are repeated 20 or more times. After 30 cycles of PCR, there could be as many as a billion copies of the original template sequence.
-'''Restriction enzyme digest'''
-EDIT BASED ON M1D1
-[[Image:Mod1 1 eco ri.jpg|thumb|right|300px|'''Schematic of DNA digestion.''']]
-Restriction endonucleases, also called restriction enzymes, 'cut' or 'digest' DNA at specific sequences of bases. The restriction enzymes are named according to the prokaryotic organism from which they were isolated. For example, the restriction endonuclease ''EcoRI'' (pronounced “echo-are-one”) was originally isolated from ''E. coli'' giving it the “Eco” part of the name. “RI” indicates the particular version on the ''E. coli strain'' (RY13) and the fact that it was the first restriction enzyme isolated from this strain.
-The sequence of DNA that is bound and cleaved by an endonuclease is called the recognition sequence or restriction site. These sequences are usually four or six base pairs long and palindromic, that is, they read the same 5’ to 3’ on the top and bottom strand of DNA. For example, the recognition sequence for ''EcoRI'' is <font face="courier">5’ GAATTC 3’</font> (see figure at right).  ''EcoRI'' cleaves the phosphate backbone of DNA between the G and A of the recognition sequence, which generates overhangs or 'sticky ends' of double-stranded DNA.
-Unlike ''EcoRI'', some other restriction enzymes cut precisely in the middle of the palindromic DNA sequence, thus leaving no overhangs after digestion. The single-stranded overhangs resulting from DNA digestion by enzymes such as ''EcoRI'' are called sticky ends, while double-stranded ends resulting from digestion by enzymes such as ''HaeIII'' are called blunt ends. ''HaeIII'' recognizes <font face="courier">5’ GGCC 3’</font> and upon recognition cuts in the center of the sequence.
-'''Ligation'''
-[[Image:Mod1 3 dnaligatn.jpg|thumb|right|400px|'''Schematic of DNA ligation.''']]
-In a ligation reaction, DNA ends are covalently attached to one another via the ligase enzyme.  The efficiency of the reaction is related to type of DNA ends: compatible sticky ends will ligate more efficiently than blunt ends, and non-compatible sticky ends will not be ligated due to the lack of hydrogen bonding between the basepairs.  To initiate the ligation reaction, hydrogen bonds are formed between the compatible overhangs of DNA fragments.  The ligase enzyme then forms a covalent phosphodiester bond between the 3' hydroxyl end of the 'acceptor' nucleotide and the 5' phosphodiester end of the 'donor' nucleotide.
-The first step in this process is the addition of AMP (adenylation) to a lysine residue within the active site of DNA ligase, which releases a pyrophosphate.  Next, the AMP is transferred to the 5' phosphate of the donor nucleotide resulting in the formation of a pyrophosphate bond. Lastly, a phosphodiester bond is formed between the 5' phosphate of the donor nucleotide and the 3' hydroxyl of the 3' acceptor nucleotide.
 ==Protocols==
-Because DNA engineering at the benchtop can take days, if not weeks, you will clone the expression plasmid in silico today. You can use any DNA manipulation software you choose to complete the protocols, but the instructions provided are for SnapGene. Please note that if you use a different program the Instructors may not be able to assist you.
+===Part 1: Review IPC reference article===
-To use SnapGene software off campus you must log into a VPN connection prior to opening the SnapGene. Here is the link to the [https://ist.mit.edu/cisco-anyconnect VPN download] and [http://kb.mit.edu/confluence/x/6QPn installation instructions]. Also you will need to update the SnapGene license number if you have not opened the application since March. The new license information can be found [http://downloads.mit.edu/released/snapgene/group-name_registration-code.txt here].
+[[Image:Sp16 M1D2 pericam variants.png|thumb|right|500 px|'''Schematics of pericam variants.''' Representations of Ca<sup>2+</sup>-sensitive reporter constructs.  In this module, your research will explore the properties of the inverse pericam construct (boxed in red). Image modified from Figure 1 of Nagai ''et al.'' (2001) ''Proc. Natl. Acad. Sci.'' 98:3197.]]
+Previous 109ers generated the data you will analyze in this module by changing specific amino acids in the IPC protein sequence. The goal of this directed mutagenesis approach was to alter the interaction between IPC and calcium such that the affinity / cooperativity was improved.  Before we examine the effect of these mutations on the activity of IPC, we will first review how the original IPC sensor was constructed.
-===Part 1: PCR amplification and restriction enzyme digestion of IPC insert===
+With your partner, review the information regarding the development of calcium sensor variants in the paper by Nagai ''et al.'' (attached [[Media:IPC reference.pdf|here]]).  Specifically, you will focus on the development and activity of flash pericam, ratiometric pericam, and inverse pericam.  The domain structures for these variants are shown in the schematic to the right.
-To amplify a specific sequence of DNA, you first need to design primers -- one primer that anneals at the start of the sequence of interest and a second primer that anneals at the end of the sequence of interest.  Today you will design a 'forward' primer that anneals to the non-coding DNA strand and reads toward the IPC gene and a 'reverse' primer that anneals to the coding DNA strand at the end of the IPC gene and reads back into it.  Each primer will consist of two parts:  the 'landing sequence' will anneal to the sequence of interest and the 'flap sequence' will be used to add a restriction enzyme recognition sequence to your IPC insert.
+<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
+*What is cpGFP?  How was this molecule constructed?  Why might this molecule be a more useful tool than GFP?
+*What is cpYFP?  How does it differ from cpGFP?
+*What is calmodulin?  What is M13?
+**Review the references cited by the authors or use other resources to provide a brief description of each.
+*What are the critical mutations that were identified in flash pericam, ratiometric pericam, and inverse pericam?  How do the authors propose that these critical mutations are involved in the activity of the calcium sensor variants?
+*Why is it important / useful to construct a calcium sensor?  Why is it useful / important to construct multiple calcium sensors with differing functionality / activity?
-[[Image:Sp16 M1D1 Part1 IPC insert.png|thumb|right|300px]]
+===Part 2: Identify IPC sequence features===
 #Open the Word document with the IPC sequence (linked [[Media:Sp16 M1D1 IPC sequence.docx| here]]).
@@ Line 51: / Line 36: @@
 #*Copy and paste the sequence from the .docx file above.
 #*Enter "IPC" for the File Name (in the lower, right corner), select 'linear' for the topology (in the lower, left corner), then click 'OK'.
-#A new window will open with a map of PF3D7_1351100 showing the unique restriction enzyme sites within the sequence.
+#Label the features listed below.
-#In later steps you will generate a map of the IPC insert cloned into the pRSET expression vector.  To make the map more visually useful, create a feature that defines the IPC insert.
+#*M13 peptide:  1-78 bp
-#*Click 'Sequence' from the options at the bottom of the window.
+#*EYFP (C-terminus portion):  91-372 bp
-#*Highlight the entire sequence in the window.
+#*EYFP (N-terminus portion):  400-831 bp
-#*From the toolbar, select 'Features' &rarr; 'Add Feature...'
+#*CaM:  838-1281 bp
-#*In the new window name, type "IPC" into the 'Feature:' box.
+#*Linker sequences:  82-90, 373-399, and 832-837 bp
-#*Select gene from the dropdown in the 'Type:' box and select the right facing arrowhead (this denotes the directionality of the insert).
+#*Refer to the figure shown above, which depicts the IPC construct in schematic form, to assist in your understanding of how the different components of IPC are connected.
-#*Then click 'OK'.
+#To understand how the mutations in the IPC variants might affect calcium binding, it is helpful to identify features within the IPC sequence that are important to the functionality of the protein.  For this, it is best to examine the translated protein sequence rather than the gene sequence.
-#Next you will use the sequence information to design primers that will amplify the IPC insert.
+#*The amino acids are shown in the SnapGene sequence window below the DNA bases for the coding regions.
-#*Because we want to amplify the entire sequence, the landing sequence of the forward primer will begin with the first basepair of the sequence.
+#To assist in the identification of key features in the IPC sequence, review the information provided in the paper by Zhang ''et al.'' (linked [http://www.nature.com/nsmb/journal/v2/n9/abs/nsb0995-758.html here]).  In particular, carefully read the following: Abstract, Introduction, and the "Linker and loop flexibility" section in the Results.
-#*Record the first 20 basepairs of the IPC gene sequence in your notebook.
+#In your IPC SnapGene file, label the amino acid residues that comprise the calcium-binding loops in the CaM region of IPC.
-#To label the primer sequence, highlight the first 20 basepairs in the IPC insert sequence, then select 'Primers' --> 'Add Primer...' from the toolbar.
+#*If you get stuck, use the fact that the CaM within inverse pericam is an E103Q mutant, that is, the 103rd residue of calmodulin is Q, to keep yourself oriented.
-#*A new window will open asking which strand should be used to make the primer.  Before making your selection consider the direction in which DNA is synthesized and to which strand your primer should anneal such that the IPC insert is amplified during PCR.
+#Consider other regions of CaM that might be important for calcium binding and label them in your IPC SnapGene file.
-#*In the 'Primer:' text box, enter a specific name for your forward primer, then select 'Add Primer to Template'.
+#*Perhaps the "Loss of hydrophobic cavities" section in the Results will provide interesting potential targets.
-#The primer should be indicated on the sequence of the IPC insert by an arrow ''facing into the sequence''.
+<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#Click 'Primers' from the options at the bottom of the window.
+*Upload the labeled IPC gene sequence and the labeled IPC protein sequence.
-#Use the following guidelines to evaluate your primer:
+*Examine the four calcium binding loops.  Do these loops share any common features?  Do any of these loops contain unique features?
-#**length:  17-28 basepairs
+*What additional regions did you mark as interesting?  Why?
-#**GC Content:  40-60%
+*Suggest a mutation that you think might impact the activity of IPC (be specific, what amino acid will replace what amino acid?).  Do you hypothesize that this mutation will increase or decrease the affinity of calcium binding?  Why?  Do you hypothesize that this mutation will increase or decrease the cooperativity of calcium binding?  Why?
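The feature coordinates listed for the IPC sequence can be sanity-checked with a short script before you label them in SnapGene. The sketch below is only an illustration: the names "Linker 1" to "Linker 3" are labels invented here, the reading frame is assumed to start at bp 1, and the mapping from CaM residue numbers to positions in the IPC translation assumes that CaM residue 1 is encoded by the codon starting at bp 838. If the numbering used in the literature differs (for example, by an initiator methionine), the mapped position shifts accordingly.

```python
# Feature coordinates (1-based, inclusive) copied from the feature list.
# The "Linker 1-3" names are labels made up for this sketch.
FEATURES = [
    ("M13 peptide", 1, 78),
    ("Linker 1", 82, 90),
    ("EYFP (C-terminus portion)", 91, 372),
    ("Linker 2", 373, 399),
    ("EYFP (N-terminus portion)", 400, 831),
    ("Linker 3", 832, 837),
    ("CaM", 838, 1281),
]


def feature_summary(features):
    """Return (name, length in bp, number of codons, first codon number) per feature."""
    rows = []
    for name, start, end in features:
        length_bp = end - start + 1
        codons, remainder = divmod(length_bp, 3)
        if remainder:
            raise ValueError(f"{name} is not a whole number of codons")
        # Assumption: the reading frame starts at bp 1.
        first_codon = (start - 1) // 3 + 1
        rows.append((name, length_bp, codons, first_codon))
    return rows


def unlabeled_gaps(features, total_length):
    """Return (start, end) ranges of bases that no feature covers."""
    gaps = []
    expected = 1
    for _, start, end in sorted(features, key=lambda f: f[1]):
        if start > expected:
            gaps.append((expected, start - 1))
        expected = max(expected, end + 1)
    if expected <= total_length:
        gaps.append((expected, total_length))
    return gaps


def cam_to_ipc_residue(cam_residue, cam_start_bp=838):
    """Map a CaM residue number onto the IPC translation.

    Assumption: CaM residue 1 is encoded by the codon starting at cam_start_bp.
    """
    first_ipc_codon = (cam_start_bp - 1) // 3 + 1
    return first_ipc_codon + cam_residue - 1


if __name__ == "__main__":
    for name, length_bp, codons, first_codon in feature_summary(FEATURES):
        print(f"{name}: {length_bp} bp, {codons} codons, starts at codon {first_codon}")
    # 1281 is the last labeled base, used here as the sequence length.
    print("Unlabeled bases:", unlabeled_gaps(FEATURES, 1281))
    print("CaM residue 103 maps to IPC residue", cam_to_ipc_residue(103))
```

Running the script also reports any bases that the feature list leaves unlabeled, which is a useful prompt to look at that stretch of the sequence in SnapGene.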
-#**Tm:  60-65 &deg;C
-#**Check for hairpins and complementation between primers by clicking on the name of your primer, then 'Primers' --> 'Analyze Selected Primer...' from the toolbar.  Note: this will automatically open a window to the IDT DNA OligoAnalyzer tool.
-#*If your primer does not fit the guidelines provided above, try altering the length. '''Remember''' that the 5’ end of the landing sequence must not change or you will delete basepairs from your gene.
-#*When you are satisfied with the landing sequence, be sure to update the primer labeled on the IPC sequence.
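The primer guidelines above (length, GC content, and Tm) can also be checked with a few lines of code. This is a minimal sketch, not a replacement for the SnapGene or OligoAnalyzer tools: the function names are invented here, the example sequence is a made-up 20-mer rather than the IPC sequence, and the Tm is not calculated but taken as an input that you read from SnapGene.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def check_landing_sequence(seq, tm_celsius):
    """Compare a landing sequence against the three guidelines.

    tm_celsius is the melting temperature reported by SnapGene;
    it is not computed here.
    """
    return {
        "length (17-28 bp)": 17 <= len(seq) <= 28,
        "GC content (40-60%)": 40.0 <= gc_content(seq) <= 60.0,
        "Tm (60-65 C)": 60.0 <= tm_celsius <= 65.0,
    }


if __name__ == "__main__":
    # Hypothetical landing sequence and Tm, for illustration only.
    example = "ATGACCGGTTCAAGCTGGAC"
    for guideline, passed in check_landing_sequence(example, 62.0).items():
        print(guideline, "OK" if passed else "outside guideline")
```

Only the landing sequence should be passed to the check, because that is the portion of the primer the guidelines are applied to.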
-#Now that the landing sequence is defined, you will add a flap sequence that introduces a restriction enzyme recognition sequence.
-#*As shown in the schematic of our cloning strategy, we need to add a BamHI recognition sequence to our forward primer.  Search the [https://www.neb.com/tools-and-resources/selection-charts/alphabetized-list-of-recognition-specificities NEB list] to find the BamHI recognition sequence.  Record the recognition sequence and the cleavage location within the sequence.
-#Add the recognition sequence for the BamHI restriction enzyme to the landing sequence.  Consider the direction in which PCR amplification occurs to determine which end of your primer should contain the BamHI restriction enzyme recognition site.
-#*In the 'Primers' window, click on the name of your primer.  Then select 'Primers' --> 'Edit Primer...' from the toolbar.
-#*Add the recognition sequence by typing into the text box at the top of the window that contains the primer sequence.
-#*For reasons that you will consider later, you must include an extra basepair between the BamHI recognition site and the landing sequence.  Add a "T" at this location in your primer.
-#Lastly, in addition to the recognition sequence, it is important to include a 6 basepair 'tail' or 'junk' sequence to ensure the restriction enzyme is able to bind and cleave the DNA. Learn more about why this is necessary from scientists at [https://www.neb.com/tools-and-resources/usage-guidelines/cleavage-close-to-the-end-of-dna-fragments NEB]. Add any sequence of 6 basepairs to your primer flap sequence.  Carefully consider where this sequence should appear in your primer!
-#Use the above process to design your reverse primer.  Please keep the following notes in mind:
-#*Because you want to amplify the entire gene you should start with the last basepair of the sequence.
-#*Do NOT include a "T" between the enzyme recognition site and the landing sequence for the reverse primer.
-#*You will add an EcoRI restriction recognition site to your reverse primer.
-#*Remember that the reverse primer anneals to the coding DNA strand at the end of the IPC insert and reads back into it.
-#<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*Record the full sequences for your forward and reverse primers.  Indicate which parts of each primer are the landing sequence, flap sequence, and junk sequence.
-#*Record the length, GC content, and Tm for the landing sequences of your forward and reverse primers.  Why is only the landing sequence considered when applying the primer design guidelines?
-#*Are your primers capable of forming hairpins or primer dimers?
-#To generate the PCR amplicon of the IPC insert sequence that would result from amplification using your primers, select 'Actions' --> 'PCR' from the toolbar.
-#*A new window will open, in the text boxes at the bottom select your forward primer (Primer 1) and reverse primer (Primer 2).  Then click 'PCR' and save the amplicon file with a specific name.
-#<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*Record the length of the amplicon.
-#*Is the amplicon double-stranded or single-stranded?  Is it a blunt end product or sticky end product?
-#Now that you have your IPC PCR amplicon, you need to digest with BamHI and EcoRI to generate 'sticky ends' that will enable you to ligate the IPC insert into the pRSET expression vector.
-#*On the map of the IPC PCR amplicon, select the BamHI recognition site by clicking on the enzyme name.  Then hold the shift key and select the EcoRI recognition site.
-#*This should highlight the area between the enzyme recognition sites.
-#Click the drop-down arrow next to the 'Copy' icon at the top of the window.
-#*Select 'Copy Restriction Fragment.'
-#Click the drop-down arrow next to the 'New' icon at the top of the window.
-#*Select 'New DNA File...'.
-#*Paste the restriction fragment from the previous step in the text box, then click 'OK'.
-#A new window will open with the digested IPC insert.
-#Save the insert file.
-#<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*Record the length of the digested insert.  How does the length of the insert compare to the length of the PCR amplicon?
-#*Is the digested insert double-stranded or single-stranded?  Is it a blunt end product or sticky end product?
-===Part 2: Restriction enzyme digest of pRSET expression vector===
+===Part 3: Examine IPC structural elements===
-For the ligation step, it is important to generate compatible 'sticky ends' on the insert and vector.  Above, you digested your IPC insert with BamHI and EcoRI in a double-digest to prepare the insert for your cloning.  Here you will digest the pRSET expression vector to create compatible ends that can be ligated.
+In the previous section you reviewed primary scientific literature to locate important features in the IPC sequence. Now you will examine 3D representations of CaM to visualize those features more closely.
-[[Image:Sp16 M1D1 Part2 pRSET vector.png|thumb|right|250px|]]
+#You will examine the structure of CaM using the Protein Data Bank (PDB) (linked [http://www.pdb.org/pdb/home/home.do here]).  In this online database, the structures are organized according to PDB identification codes.
+#For this exercise, you will look at the calcium-bound form of CaM.
-#Open the word document with the pRSET vector sequence (linked[[Media:Sp16 M1D1 PRSET sequence.docx| here]]).
+#*Enter "1CLL" into the search box at the top right corner of the PDB homepage.
-#*Copy and paste the vector sequence into a New DNA File window and save this sequence.
+#The landing page for the CaM structure includes background information on the source and reference for this protein structure.
-#*Be sure to select circular from the topology options.
-#One very useful aspect of SnapGene is that the software is able to recognize features, or sequences that match known genes and binding sites, in DNA sequences.  A window titled "Detect Common Features" should appear.
 #<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*Include a summary of the details provided about features in the pRSET vector.
+#*What method was used to solve this protein structure?  Perform a quick search to learn more about this method and provide a brief description.
-#Select 'Add Features'.
+#*At what resolution was the structure solved? Perform a quick search to learn more about this concept and provide a brief description.
-#A new window will open with a map of the vector showing the unique restriction enzyme sites and annotated features within the sequence.
+#*What is the total weight of the structure?
-#To generate the sticky ends that will enable you to ligate the IPC insert into the vector, view the map of your vector sequence.
+#*How many chains are included in this structure?
-#*Select the BamHI recognition site by clicking on the enzyme name, then hold the shift key and select the EcoRI recognition site.
+#*Read the abstract for the reference article wherein this structure was first published.  What are the features of the calcium-binding domains (lobes) as described by the authors?
-#*Select 'Actions' --> 'Restriction and Insertion Cloning' --> 'Delete Restriction Fragment...' from the toolbar.
+#Under the structure shown on the left side of the window, click the 'Structure' link.  A page showing the 'cartoon' structure of CaM will load.  Using the tools to the right of this page you will be able to more closely examine the structure.
+#First, let's orient ourselves on how to move / manipulate the protein structure.
+#*Place your cursor over the structure and while pressing down on your mouse / track pad, move the image to view the protein structure from different angles.
+#*To zoom-in on an area of the protein structure, place your cursor on the area of interest and double-click.  When zoomed in, single-click on a residue to get a more detailed view of the amino acids that are present in that area.  The dotted lines represent bonds or salt bridges that exist between the atoms in the amino acids.
+#*To zoom-out, single-click on the white space in the viewer window.
+#*To zoom-in or -out more gradually, use two fingers and drag in the up or down direction.
+#*[[Image:Sp21 M3D1 residue information.png|thumb|400px|right|]]To identify which amino acid residues are present in each position of the protein, hover your cursor over the protein.  A box will appear in the lower right of the viewer window (see example to the right).  Though most of the details here can be ignored, the information provided tells you that the highlighted residue is a valine (Val) at position 35 in the amino acid sequence.
 #<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*What is the length of the digested vector product?
+#*What secondary structures are present in CaM?
-#*How many basepairs were removed (compared to the intact cloning vector)?
+#*Compare the description of the features within the lobes provided by the authors to the protein structure.  Screen capture a zoomed in view of one of the lobes and label the features.
+#Next, let's consider the tools provided in the panel on the right of the page.
-===Part 3: Ligation of IPC insert and pRSET expression vector===
+#The contents of the 'Components' tab are listed: Polymer, Ligand, Water, and Ion.
+#*Polymer refers to the larger structures present, such as protein chains, DNA, or RNA.
-Before you prepare a ligation, one very important step is to calculate the amounts of DNA that will be used in the reaction.  Ideally, you should use a 3:1 molar ratio of insert to vector (note: it is a molar ratio, not a volumetric ratio!). You will use the steps below to calculate the volume amount (based on the molar ratio!) of the IPC insert and pRSET expression vector you would use to complete this ligation in the laboratory.
+#*Ligand refers to any non-polymer structure, such as ligand binders, ATP, or co-factors that are not single atoms.
+#*Water refers to water.
-[[Image:Sp16 M1D1 recovery gel.png|thumb|right|300px|'''Recovery gel for ligation calculations.''' Lane 1 = pRSET vector, Lane 2 = molecular weight ladder, and Lane 3 = IPC insert.]]
+#*Ion refers to any lone elements that are associated with the structure.
+#*Use the 'eyeball' icon to the right of the component labels to remove / add the components to the image.
-Use the following information to calculate the volume of insert and vector needed to prepare a ligation with a 3:1 molar ratio (insert:vector).
-*Concentration of IPC insert solution = 25 ng/uL
-*Concentration of pRSET expression vector solution = 50 ng/uL
-*Molecular weight of a basepair = 660 g/mol
-*Sizes, in basepairs, of the insert and vector sequences (this was determined in the exercises above!)
-Though there are different strategies that can be used to complete the ligation calculations, it may be easier to break the math into the following steps:
-#Determine the volume of vector that will be used in the ligation reaction.
-#*Typically, it is best to use 50 - 100 ng of vector.
-#Calculate the moles of vector.
-#Calculate the moles of insert.
-#*Remember, this number should be 3-fold more than the moles of vector to accomplish a 3:1 molar ratio.
-#Calculate the volume of insert that contains the appropriate moles of insert.
-#One additional consideration is the volume of the reaction.  The total volume of the ligation reaction should not be greater than 15 &mu;L.  In this, the total volume of the insert and vector should not be greater than 13.5 &mu;L as additional reagents are required in the reaction.
-#*If the insert and vector volume total greater than 13.5 &mu;L, you should (1) scale down both DNA amounts, using less than 50 ng backbone and/or (2) stray from the ideal 3:1 molar ratio.
-#*You may ask the teaching faculty for advice during class if you are unsure what choice is best.
-#<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> calculate the volume of insert and volume of vector that should be used for a ligation reaction that contains a 3:1 molar ratio of insert:vector.  Show all math!
-#*Feel free to take a picture of your hand-written work and embed the image in your notebook.
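The calculation steps listed above can be sketched in code as a way to check hand-written math. This is an illustration under stated assumptions: the function name is invented here, and the vector and insert sizes in the example (2900 bp and 1300 bp) are placeholders, not the sizes you determined in the earlier exercises. The stock concentrations, the 660 g/mol per basepair value, the 3:1 ratio, and the 13.5 &mu;L limit come from the text.

```python
BP_MOLAR_MASS = 660.0  # g/mol per basepair, as given in the text


def ligation_volumes(vector_ng, vector_bp, insert_bp,
                     vector_ng_per_ul=50.0, insert_ng_per_ul=25.0,
                     molar_ratio=3.0, max_dna_ul=13.5):
    """Return the volumes of vector and insert for a given insert:vector molar ratio."""
    # ng -> pmol: (ng * 1e-9 g) / (bp * 660 g/mol) * 1e12 pmol/mol
    vector_pmol = vector_ng * 1000.0 / (vector_bp * BP_MOLAR_MASS)
    insert_pmol = molar_ratio * vector_pmol
    # pmol -> ng for the insert
    insert_ng = insert_pmol * insert_bp * BP_MOLAR_MASS / 1000.0
    vector_ul = vector_ng / vector_ng_per_ul
    insert_ul = insert_ng / insert_ng_per_ul
    return {
        "vector_ul": vector_ul,
        "insert_ng": insert_ng,
        "insert_ul": insert_ul,
        "fits_in_reaction": vector_ul + insert_ul <= max_dna_ul,
    }


if __name__ == "__main__":
    # Placeholder sizes for illustration only; substitute your own values.
    result = ligation_volumes(vector_ng=50.0, vector_bp=2900, insert_bp=1300)
    for key, value in result.items():
        print(key, value)
```

If `fits_in_reaction` is false, the text's guidance applies: scale down both DNA amounts or move away from the ideal 3:1 ratio.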
-#Next you will complete this ligation ''in silico'' to generate a map, or visual representation, of the pRSET_IPC cloning product.[[Image:Sp16 M1D1 Part3 ligation.png|thumb|right|400px|]]
-#To ligate the IPC insert into the pRSET expression vector, select 'Actions' --> 'Restriction and Insertion Cloning' --> 'Insert Fragment...'.
-#*A new window will open.  In the bottom workspace of the window, a cloning schematic will appear showing a vector and insert icon.
-#*Click on the 'Vector' label.  Then in the workspace at the right of the window, select the vector file from the 'Vector:' drop-down.
-#*Select the restriction enzymes used to digest the expression vector from the drop-down boxes next to the text boxes that contain 'cut'.
-#Next, click on the 'Insert' label at the bottom of the window and complete the steps as done for the expression vector.
-#*For the insert, use the IPC ''undigested'' file.
-#Click 'Clone'.
-#A new window will open with the cloned pRSET_IPC product!
 #<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*What is the size of the plasmid?  Does this make sense given the lengths of the insert and vector?
+#*Does the CaM structure contain the Components listed?  Answer yes or no for each Component type.
-#*Does your sequence still contain a BamHI recognition sequence?  An EcoRI recognition sequence?
+#*What type of Component is calcium?  Include screen shots of a binding loop with and without calcium present.
-#*Why were two different restriction enzymes used in the cloning strategy for pRSET-IPC?
+#Click on the 'Density' tab.  Though we will not focus much on the details here, the electron density map is the actual data from the x-ray crystallography experiment used to solve the structure.
-#*Recall the "T" you added between the landing sequence and BamHI recognition sequence in your forward primer.  What was the purpose of this extra base?  Is it important that a "T" was added?  Could another base be added instead?
+#*Select '2Fo-Fc &sigma;' from the options.
-#**Hint: When the IPC insert is cloned into the pRSET vector, you want to ensure the His tag sequence is attached to the IPC sequence when IPC is transcribed from pRSET-IPC.  This will result in the translated IPC protein containing a His tag which will be used for protein purification.  Think about the spacing between the His tag (CATCATCATCATCATCAT) and the first codon of the IPC gene in your plasmid map.
+#*Click the box to the right of 'Wireframe' such that this feature is activated (toggle to '&#10003; On').
-#*Why was an extra base not added to the reverse primer?
+#*Click on a residue within the protein structure.  This will zoom-in on that area and also layer a grid, or cage, over the area.  The cage represents the electron density data that were captured via x-ray crystallography.  The structural features and atoms within the CaM protein were modeled to match the density map, thus providing a best estimate of the protein structure. The resolution is related to how tightly this cage fits the solved structure.  Though a gross oversimplification, the relationship can be described as follows: the smaller the angstrom value achieved via crystallography, the better the resolution and thus the tighter the fit of the cage to the solved structure.
+#Lastly, let's look at how calcium interacts with CaM!
-===Part 4: Confirmation digest of pRSET_IPC===
+#Move the protein structure such that you are able to achieve a clear view of the calcium ion in the first binding loop and double-click on one of the residues in the loop to zoom-in.
+#*Hint: hover over the amino acid residues to identify the N-terminus based on the residue numbers provided in the box at the lower right of the viewer window.
-To confirm the pRSET_IPC construct that we will use for this module, you will perform a 'diagnostic' or 'confirmation' digest. As discussed in prelab, this step is an important control -- you want to be sure that the products you use in your research are correct! This  step is used to check products you clone yourself and, perhaps more importantly, those that you may receive from another researcher.
+#Single-click on the calcium ion to visualize how it associates with the residues in the loop.
+#*It may be easier to view the bonds by removing the density information from the structure.  To do this, click the 'eyeball' icon to the right of each of the options listed in the 'Density' tab.  Alternatively, you can exit the viewer window and re-enter to return the default setting to '&#9747; Off'.
-Ideally you will use a single enzyme that cuts once within the vector and once within your insert.  Unfortunately, this is rarely an option and you instead need to select an enzyme that cuts once within the vector and a second, compatible enzyme that cuts once within the insert.  Enzyme compatibility is determined by the buffer.  If two enzymes are active, or able to cleave DNA, in the same buffer, they are compatible.  The [http://nebcloner.neb.com/#!/redigest NEB double digest online tool] will prove very helpful in identifying compatible enzyme combinations!
-Use the information from prelab, the 20.109 list of enzymes (linked [[media:20109Enzymes.docx |here]]), and the plasmid map you generated above to choose the enzymes you will use.
-#To choose restriction enzymes for your confirmation digest, look at the plasmid map for your pRSET_IPC construct.
-#*Identify possible sites that will enable you to confirm the pRSET_IPC sequence.
-#*Remember the guidelines discussed in prelab!
-#After you identify the enzymes that you will use for the confirmation digest, complete a virtual digest using the pRSET_IPC map you generated above.
-#*On the map of pRSET_IPC, select the first recognition site by clicking on the enzyme name.  Then hold the shift key and select the second recognition site.
-#*Select 'Tools' --> 'Simulate Agarose Gel' from the toolbar.
 #<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
-#*Record the expected fragment sizes from the confirmation digest.
+#*List the amino acids (from N- to C-terminus) that are in the binding loop.
-#*Are the fragments distinct or ambiguously close together?
+#*List the amino acids (from N- to C-terminus) that are shown to interact with calcium in the binding loop.  How many bonds are formed with each of the listed amino acids?
-#Now that you identified which enzyme(s) to use in your confirmation digest, consider which controls should be included to ensure the results are interpretable.
+#*Provide the above information for each of the binding loops present in the CaM structure.
-#<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> explain why the following reactions are included as controls for the confirmation digest experiment:
+#It may also be interesting to consider how the amino acids that are not directly bound to calcium interact as this is important to maintaining the structural integrity of the binding loop.
-#*Undigested pRSET_IPC.
+#Identify the isoleucine (Ile) residue at position 27, then single-click to show the relevant binding information.
-#*Single digests of pRSET_IPC (each enzyme used alone in a digest with pRSET_IPC).
+#*[[Image:Sp21 M3D1 binding information.png|thumb|400px|right|]]To identify which amino acid residues are bound to Ile 27, hover your cursor over the dotted lines.  A box will appear in the lower right of the viewer window (see example to the right).  As before, most of the details here can be ignored; the information provided tells you that the highlighted bond is a hydrogen bond between the oxygen (O) of Ile 63 and the nitrogen (N) of Ile 27.
-#Use the table below to calculate the volumes of each reagent that should be included in the confirmation digest reactions.
-#*The 20.109 enzyme stocks are always the "S" size and concentration when you search for them on the NEB website.
-#*To find the concentration of the enzyme(s) you choose, search the [http://www.neb.com/products/restriction-endonucleases/restriction-endonucleases NEB site].
-<center>
-{| border="1"
-|
-! Diagnostic digest <br>(enzyme #1 AND enzyme #2)
-! Enzyme #1 ONLY
-! Enzyme #2 ONLY
-! Uncut <br>(NO enzyme)
-|-
-| pRSET_IPC
-| 5 &mu;L
-| 5 &mu;L
-| 5 &mu;L
-| 5 &mu;L
-|-
-| 10X NEB buffer <br>
-(buffer name:____________)
-| 2.5 &mu;L
-| 2.5 &mu;L
-| 2.5 &mu;L
-| 2.5 &mu;L
-|-
-| Enzyme #1 <br>
-(enzyme name:____________)
-| ____ &mu;L
-| ____ &mu;L
-|
-|
-|-
-| Enzyme #2 <br>
-(enzyme name:____________)
-| ____ &mu;L
-|
-| ____ &mu;L
-|
-|-
-| H<sub>2</sub>O
-! colspan="4"| to a final volume of 25 &mu;L
-|}
-</center>
-<font color = #0d368e>'''To ensure the steps required for preparing a digest are clear, the Instructor will provide a live demonstration of this process.  You should provide a written description of the procedure in your laboratory notebook!'''</font color>
 <font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
+*Based on what you learned from the protein structure, revisit the questions answered when examining the sequence for CaM:
-*Provide a written overview / description of the procedure used to prepare a restriction enzyme digest (from the live demonstration).
+**What additional regions might be interesting targets for mutagenesis?  Why?
-*For how long will the digests incubate and at what temperature?
+**What additional mutations do you think might impact the activity of IPC (be specific, what amino acid will replace what amino acid?).  Do you hypothesize that this mutation will increase or decrease the affinity of calcium binding?  Why?  Do you hypothesize that this mutation will increase or decrease the cooperativity of calcium binding?  Why?
-Following a restriction enzyme digestion reaction, the DNA fragments are separated using gel electrophoresis.  To review this method, look back at the information provided on [[20.109(S21):M1D1#Part_3:_Gel_purify_PCR_products| M1D1]]!
-==Reagents list==
-*pRSET_IPC (concentration = 25 ng/&mu;L) (a gift from the Jasanoff Laboratory)
-*10X buffer; the buffer will depend on the enzymes you use for your confirmation digest (from NEB)
-*restriction enzyme(s); the concentration of each enzyme is listed on the product information page (from NEB)
-*1% agarose in 1X TAE (agarose from VWR)
-**with 10% (v/v) &mu;L SYBR Safe DNA stain (from Invitrogen)
-*1X TAE gel electrophoresis buffer: 40 mM Tris, 20 mM acetic acid, 1 mM EDTA (from BioRad)
-*6X gel loading dye, blue (from NEB)
-*1 kb DNA ladder (from NEB)
 ==Navigation links==
-Next day: [[20.109(S21):M3D2 |Examine IPC mutations ]] <br>
+Next day: [[20.109(S21):M3D2 |Identify IPC mutations]] <br>
 Previous day: [[20.109(S21):M2D7 |Complete CETSA experiment and analyze data]] <br>