Difference between revisions of "Assignment 9 Overview"

From Course Wiki
Jump to: navigation, search
(Assignment details)
(Assignment details)
 
(12 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
[[Category:Optical Microscopy Lab]]
 
[[Category:Optical Microscopy Lab]]
 
{{Template:20.309}}
 
{{Template:20.309}}
 
 
__NOTOC__
 
__NOTOC__
 
 
==Introduction==
 
==Introduction==
  
Line 21: Line 19:
 
* binding kinetics of the dye
 
* binding kinetics of the dye
  
The goal for Assignment 9 is to write a model for ''V<sub>f,measured</sub>'' that takes these effects into account and use nonlinear regression to estimate the parameters of this function. The model proposed here adheres to Dr. [http://en.wikiquote.org/wiki/George_E._P._Box George E. P. Box's excellent advice] on modeling, in that it is both wrong and useful. Some of the assumptions are more dubious than others. The following reading will guide you through the concepts behind the model which will be useful when you sit down to write your code.
+
The goal for Assignment 9 is to write a model for ''V<sub>f,measured</sub>'' that takes these effects into account and use nonlinear regression to estimate the parameters of this function. The model proposed here adheres to Dr. [http://en.wikiquote.org/wiki/George_E._P._Box George E. P. Box's excellent advice] on modeling, in that it is both wrong and useful. Some of the assumptions are more dubious than others. You might ask: "why don't we just fit the DNA melting curve to a higher order polynomial?" - great question. We are developing a ''mechanistic model'', which means that we hope the fit parameters will give us some insight into the physical processes behind the DNA melting system. Fitting an arbitrary function may be useful to interpolate the data, but provides no physical insights.
  
==Sample temperature==
+
Onward!
  
The DNA melter measures the temperature of the metal heating block, not of the sample. Because the glass vial has high thermal resistance, the sample temperature significantly lags the heating block temperature. A simple, lumped-element circuit model can predict the sample temperature to within about 0.5&deg;C based on the block temperature history.
+
==Assignment details==
  
The state variables in a thermal circuit model are temperature ''T'' and heat flow ''q''. A fundamental assumption of lumped element models is that each element has a single value of both state variables. Because of this assumption, lumped element models only work well if the bodies you are modeling have a nearly uniform temperature. Components that are small in size, have high thermal conductivity, and have low heat transfer coefficients are most suitable for lumped element modeling. These attributes are summarized by the Biot number, <i>Bi</i>=<sup>''hl''</sup>/<sub>''k''</sub>, where ''h'' is the heat transfer coefficient, ''l'' is the characteristic length of the body (volume divided by surface area), and ''k'' is the thermal conductivity of the material. Bodies with Biot numbers less than 1 are good candidates for lumped element models. The Biot number for the heating block is on the order of 6x10<sup>-3</sup>, assuming ''h''=50 <sup>W</sup>/<sub>m</sub>2<sub> K</sub>, ''l''=3 cm, and ''k''=235 <sup>W</sup>/<sub>m K</sub>. This is much less than one, so the heating block is a good candidate for a lumped element model. The Biot number for the glass vial is about 0.03 (''h''=10.5 <sup>W</sup>/<sub>m</sub>2<sub> K</sub>, ''l''=3 mm, and ''k''=1.05 <sup>W</sup>/<sub>m K</sub>).
+
In this assignment you will write the code to analyze your DNA melting data in three parts:
 +
# In Part 1, [[Assignment 9, Part 1: model function|you will define the functions for each phenomenon and combine them into a single fit function]];
 +
# In Part 2, [[Assignment 9, Part 2: Simulating DNA melting data and testing the model function|you will create some simulated data to verify your code and test your model]].
 +
# In Part 3, [[Assignment 9, Part 3: Fitting your data| you will use the code on your real data and think about the statistical model you'll use to identify your unknown sample]].
  
The DNA melter measures the heating block's temperature directly. The heating block has a much higher thermal capacity than the sample, so a temperature source is a good model for the block.
 
  
The model neglects the heat capacity of the vial.
+
{| cellpadding="5" style="background:#FFCCCC; border:2px dashed #ff0000"
 
+
[[Image:Circuit model of heating block and sample.png|thumb|right|300px|Lumped element model of thermal cycler.]]
+
The thermal circuit model for the heating system is shown in the figure on the right. The circuit is a first-order, low-pass filter. In this circuit,  a step increase in block temperature causes the sample temperature to rise exponentially to a peak value of ''K<sub>thermal</sub>'' with time constant ''&tau;<sub>thermal</sub>''.
+
 
+
The transfer function of the low-pass filter has two parameters:
+
 
+
::<math>\frac{\hat{T}_{sample}(\omega)} {\hat{T}_{block}(\omega)}=\frac{K_{thermal}}{j \omega \tau_{thermal} + 1}</math>.
+
 
+
* ''K<sub>thermal</sub>'' is the low frequency gain of the system.
+
* ''&tau;<sub>thermal</sub>'' is the time constant.
+
 
+
{|
+
 
|-
 
|-
|[[Image:Corrected and measured temperature.png|thumb|300px|right|Block temperature, measured sample temperature, and model sample temperature. Sample temperature was measured by immersing an RTD in the DNA solution.]]
+
|Warning: you will need the code written in this assignment to analyze your data in Assignment 10. Drop at your own risk!!
|[[Image:DNA melting data corrected for temperature lag.png|thumb|300px|right|<math>V_{f,measured}</math> versus model sample temperature, <math>\theta_{sample}</math>.]]
+
 
|}
 
|}
  
===Implementation===
+
{{Template:Assignment Turn In|message=Turn in all of your work (comprehensive list below) on Stellar in a single PDF file named <lastname><firstname>Assignment9.pdf.
MATLAB includes several functions for simulating continuous-time, linear, shift-invariant (CTLSI) systems. You can write the transfer function of a CTLSI system as the ratio of two polynomials in <math>s</math>, where <math>s=j\omega</math>. For example, the transfer function of a low-pass filter can be expressed in the form <math>H(\omega)=\frac{1}{\tau s+1}=\frac{1}{\tau j \omega+1}</math>. The MATLAB function [http://www.mathworks.com/help/control/ref/tf.html tf] constructs a software object that represents a CTLSI transfer function. <tt>tf</tt> takes two arguments that specify the coefficients of the numerator and denominator polynomials. The arguments must be formatted as vectors containing the coefficients of <math>s</math> in order of decreasing power. For example, the transfer function <math>\frac{1}{60 s + 1}</math> can be created by:
+
 
+
<pre>
+
>> exampleTransferFunction = tf( 1, [ 60 1 ] )
+
exampleTransferFunction =
+
+
    1
+
  --------
+
  60 s + 1
+
+
Continuous-time transfer function.
+
</pre>
+
 
+
Several functions operate on continuous-time transfer function objects. For example <tt>bode</tt> and <tt>impulse</tt> generate bode and impulse response plots as expected. (Perhaps the Vice President of Inconsistent Naming at MathWorks was out sick the day those functions were named.) There is also a CTLSI simulator called [http://www.mathworks.com/help/control/ref/lsim.html lsim] that will be useful for deriving the sample temperature from the block temperature using a lumped-element model of the thermal low-pass system realized by the heating block, glass vial, and sample. (Seems like the Director of Inconsistencies was back in the office when <tt>lsim</tt> got named.)
+
 
+
The SimulateLowPass MATLAB function below uses <tt>lsim</tt> to produce the simulated output of a low-pass filter given its time constant, a vector of input values, and an optional vector of time values. If supplied, the time vector must be the same length as the input vector (''i.e.'' each time value corresponds to one input value). If no time vector is passed, the function generates a time vector assuming that the input values are sampled uniformly at ten times per second, starting at <math>t=0</math>. The DNA melter software produces ten samples per second.
+
 
+
<pre>
+
function SimulatedOutput = SimulateLowPass( TimeConstant, InputData, TimeVector )
+
    if( nargin < 3 )
+
        TimeVector = (0:(length(InputData)-1))/10;
+
    end
+
 
+
    transferFunction = tf( 1, [TimeConstant, 1] );
+
 
+
    initalTemperature = InputData(1);
+
    InputData = InputData - initalTemperature;
+
 
+
    SimulatedOutput = lsim( transferFunction, InputData, TimeVector' ) + initalTemperature;
+
end
+
</pre>
+
 
+
The function subtracts off the initial value before running the simulation and than adds the initial value back in after the simulation. This is a cheesy way to handle initial conditions &mdash; setting the temperature zero point to be equal to the initial temperature. This is equivalent to assuming that the block has been at its initial temperature during the time interval <math>t = [ -\infty, 0 )</math>.
+
 
+
==Photobleaching==
+
Excited dye molecules can react chemically with compounds in their environment and permanently lose their ability to fluoresce &mdash; a phenomenon called photobleaching. LED and ambient light illuminate the LC Green dye for 15 or more minutes over the course of a single melting and annealing cycle, which results in measurable photobleaching. The effect of photobleaching can be approximated by a mathematical model.
+
 
+
===Assumptions===
+
* The dye is divided into two populations, bleached and unbleached, with concentrations <span style="text-decoration: overline">S</span> and S
+
* The initial dye concentration is S + <span style="text-decoration: overline">S</span>.
+
* Only dye molecules bound to dsDNA may be excited.
+
* Only excited fluorophore molecules bleach.
+
* An excited fluorophore will either bleach with probability ''p'' or return to the ground state with probability 1-''p''.
+
** The constant 1-''p'' encompasses all of the mechanisms by which the fluorophore returns to the ground state unbleached, including fluorescence, phosphorescence, and non-radiative relaxation.
+
* Bleaching is irreversible.
+
* The number of molecules in the excited state is proportional to fluorescence.
+
* Bleached and unbleached molecules bind to dsDNA with the same affinity.
+
 
+
===Model===
+
Under these assumptions, the bleaching rate is proportional to fluorescence.
+
 
+
::<math>\frac{\partial \bar{S}}{\partial t} = K_{bleaching} f(t) </math>
+
 
+
''K<sub>bleaching</sub>'' encompasses several constants, including <math>p</math>, illumination intensity, optical gain of the instrument, and lock-in amplifier gain.
+
 
+
Setting the initial dye concentration to 1, the fraction of unbleached LC Green can be calculated by integrating:
+
 
+
::<math>S(t)=1-\bar{S}(t)=1-K_{bleaching}\int_0^t f(t) \mathrm{d}t</math>
+
 
+
===Implementation===
+
A simple way to approximate an integral in discrete time is a cumulative sum: <math>\int_0^t f(t) \approx \sum_{n=1}^{f F_s}f(n T_s)</math>, where <math>F_s</math> is the sample rate. This works particularly well when the signal changes slowly relative to the sample period. It's a good bet that the bleaching time constant is much longer than the 0.1 s sample period. MATLAB has a built-in function for computing cumulative sums called <tt>cumsum</tt>. Here is an example:
+
 
+
<pre>
+
>> cumsum( 1:10 )
+
ans =
+
    1    3    6    10    15    21    28    36    45    55
+
</pre>
+
 
+
<tt>cumsum</tt> will generate a very good numerical approximation to <math>\int_0^t f(t) \mathrm{d}t</math>.
+
 
+
But what is <math>f(t)</math> in this case? If you think about it, our model is becoming a bit circular. We are trying to ''find'' an expression for the fluorescence as it depends on time and temperature, but we need to ''already know'' the fluorescence in order to effectively model photobleaching? Kind of weird... It turns out this stuff happens all the time and we can solve the problem iteratively. That is, we can start with a somewhat-decent guess for <math>f(t)</math>, and use it to calculate a better guess for <math>f(t)</math>. To first-order, we already have a decent guess of the functional form of <math>f(t)</math> in our expression for <math>C_{ds}</math>. Using the <tt>DnaFraction</tt> function as a proxy for <math>f(t)</math> here works relatively well.
+
 
+
==Thermal quenching==
+
LC Green I fluoresces less efficiently at higher temperatures. The decrease in fluorescence is approximately linear. Defining the dye efficiency at the initial temperature to be 1, quenching can be modeled by:
+
 
+
::<math>\left . Q(t)=1-K_{quench}[T_{sample}(t)-T_{sample}(0)]\right .</math>
+
 
+
==Instrument gain and offset==
+
There are several optical and electronic gain factors built into the DNA melter that determine the instrument responsively, &part;''V<sub>f,measured</sub>'' / &part; ''C<sub>ds</sub>''. In addition, ''V<sub>f,measured</sub>'' may be offset from zero when ''C<sub>ds</sub>'' concentration is zero.
+
 
+
* ''K<sub>gain</sub>'' is equal to the change in ''V<sub>f,measured</sub>'' divided by the change in fraction of dsDNA. This is assumed to be constant.
+
* ''K<sub>offset</sub>'' is equal to the value of ''V<sub>f,measured</sub>'' when the dsDNA concentration is zero.
+
 
+
==Expression for model ''V<sub>f,model</sub>==
+
 
+
::<math>\left . V_{f,model}(t) = K_{gain} S(t) Q(t) C_{ds}(T_{sample}(t), \Delta H^\circ, \Delta S^\circ) + K_{offset}\right .</math>
+
 
+
''C<sub>ds</sub>'' is the model melting curve produced by the <code>DnaFraction</code> function from part 1 of the lab. Differences between the model and the observations are precisely visualized on a residual plot, described below.
+
 
+
==Finding double stranded DNA fraction from raw data==
+
[[Image:Corrected DNA data.png|thumb|right]]
+
 
+
The inverse function of the melting model with respect to ''V<sub>f,measured</sub>''(''t'') is helpful to visualize discrepancies between the model and experimental data caused by random noise in ''V<sub>f,measured</sub>'' and systematic error in the model ''V<sub>f,model''.  The function,
+
 
+
::<math>C_{ds,inverse-model}(V_{f,measured}(t)) = \frac{V_{f,measured}(t) - K_{offset}} {K_{gain} S(t) Q(t)}</math>,
+
 
+
is itself a model. This model estimates the concentration of double stranded DNA based on the observations <math>V_{f,measured}(t)</math> and the models for bleaching and quenching.
+
 
+
The estimated melting curve may be directly compared with simulations, measurements or other predictions of the true melting curve. The plot at right shows an example of ''C<sub>ds,inverse-model</sub>''(''t'') versus ''T<sub>sample</sub>''(''t''). The estimated melting curve is shifted to the right compared to the simulated melting curve, possibly due to systematic error in the sample temperature model. The estimated melting curve also serves as a comparison to the thermodynamic model developed in [[DNA Melting Thermodynamics]], or to any other independent measurement or model of the melting curve, i.e., the concentration of dsDNA vs sample temperature.
+
 
+
==Assignment details==
+
 
+
In this assignment you will write the code to analyze your DNA melting data in two parts:
+
# In Part 1, [[Assignment 9, Part 1: model function|you will define the functions for each phenomenon and combine them into a single fit function]];
+
# In Part 2, [[Assignment 9, Part 2: Simulating DNA melting data and testing the model function|you will create some simulated data to verify your code and test your model]].
+
  
{{Template:Assignment Turn In|message=Turn in all of your work (comprehensive list below) on Stellar in a single PDF file named <lastname><firstname>Assignment9.pdf.}}
+
Part 1: ('''individually''')
Turn in:
+
* Turn in your code for SimulateLowPass, SimulatePhotobleaching, SimulateThermalQuenching and Vmodel (once it has been tested!)
 +
Part 2: ('''individually''')
 +
* Plot the following items on the same set of axes. Don't forget to include a legend!
 +
*# your simulated fluorescence data as a function of block temperature
 +
*# your model function with initial best-guess parameters as a function of block temperature
 +
*# your model function with fitted par meters (obtained by nlinfit) as a function of block temperature
 +
* Calculate the uncertainty for each parameter from the fit, and fill in the uncertainty table, making sure to include appropriate units and significant figures.
 +
* Plot the normalized uncertainty for each fit parameter as a function of the noise magnitude. Which parameter appears to be the most sensitive to noise? the least?
 +
* Using the expressions we derived in class for [[DNA Melting Thermodynamics#Simulating DNA Melting|dsDNA concentration at a given temperature]], derive an expression for the melting temperature, <math>T_m</math>in terms of <math>\Delta H </math> and <math>\Delta S</math>. The melting temperature is defined as the temperature for which the fraction of double stranded DNA equals 0.5. Calculate the melting temperature for each noise level from 0.01 to 0.05.
 +
Part 3: ('''individually''')
 +
*# Plot your fluorescence data as a function of block temperature, your model function with initial guesses, and your model function with best fit parameters on the same set of axes.
 +
*# Record your estimates for &Delta;H and &Delta;S. Calculate T<sub>m</sub>.
 +
*# How do these thermodynamic parameters compare to the predicted values you obtained from DINAmelt or OligoCalc?
 +
* For one fit result, plot the residuals vs.
 +
*# time,
 +
*# temperature, and
 +
*# fluorescence.
 +
* Write a function to convert fluorescence into fraction of double stranded DNA. For at least one experimental trial, plot <math>\text{DnaFraction}_{inverse-model}</math> versus the ''sample temperature'' <math>T_{sample}</math> ([http://measurebiology.org/wiki/File:Inverse_cuvrve.png example plot]). On the same set of axes plot DnaFraction versus <math>T_{sample}</math> using the best-fit values of &Delta;H and &Delta;S. Finally, plot simulated dsDNA fraction vs. temperature using data from DINAmelt or another melting curve simulator.
 +
* Explain the statistical method you will use to identify your group's unknown sample in Assignment 10.
 +
*# State the acceptance/rejection criteria for any hypotheses tests you will use.
 +
*# This page may be a helpful reference: [[Identifying the unknown DNA sample]].
 +
Finally,
 +
*Append all of the code (not yet included) that you wrote for Parts 1, 2 and 3 of this assignment.
 +
}}
  
 
{{Template:Assignment 9 navigation}}
 
{{Template:Assignment 9 navigation}}

Latest revision as of 00:20, 25 January 2018

20.309: Biological Instrumentation and Measurement

ImageBar 774.jpg


Introduction

Measured photodiode voltage, Vf,measured plotted versus block temperature, θblock along with model photodiode voltage, Vf,model.

In Assignment 8, you made some improvements to your DNA melting instrument, and (hopefully) collected some spectacular data. This assignment will focus on extracting useful information from the data in order to make some quantitative conclusions.

The fluorescence voltage, Vf,measured(t), that you measured in lab depends not only on the parameters of interest, ΔH°, and ΔS°, but also on:

  • the double stranded DNA concentration Cds(t) (which we know from the outset)
  • the dynamics of the temperature cycling system
  • thermal quenching of the fluorophore
  • photobleaching
  • responsivity and offset of the instrument
  • binding kinetics of the dye

The goal for Assignment 9 is to write a model for Vf,measured that takes these effects into account and use nonlinear regression to estimate the parameters of this function. The model proposed here adheres to Dr. George E. P. Box's excellent advice on modeling, in that it is both wrong and useful. Some of the assumptions are more dubious than others. You might ask: "why don't we just fit the DNA melting curve to a higher order polynomial?" - great question. We are developing a mechanistic model, which means that we hope the fit parameters will give us some insight into the physical processes behind the DNA melting system. Fitting an arbitrary function may be useful to interpolate the data, but provides no physical insights.

Onward!

Assignment details

In this assignment you will write the code to analyze your DNA melting data in three parts:

  1. In Part 1, you will define the functions for each phenomenon and combine them into a single fit function;
  2. In Part 2, you will create some simulated data to verify your code and test your model.
  3. In Part 3, you will use the code on your real data and think about the statistical model you'll use to identify your unknown sample.


Warning: you will need the code written in this assignment to analyze your data in Assignment 10. Drop at your own risk!!


Pencil.png

Turn in all of your work (comprehensive list below) on Stellar in a single PDF file named <lastname><firstname>Assignment9.pdf.

Part 1: (individually)

  • Turn in your code for SimulateLowPass, SimulatePhotobleaching, SimulateThermalQuenching and Vmodel (once it has been tested!)

Part 2: (individually)

  • Plot the following items on the same set of axes. Don't forget to include a legend!
    1. your simulated fluorescence data as a function of block temperature
    2. your model function with initial best-guess parameters as a function of block temperature
    3. your model function with fitted par meters (obtained by nlinfit) as a function of block temperature
  • Calculate the uncertainty for each parameter from the fit, and fill in the uncertainty table, making sure to include appropriate units and significant figures.
  • Plot the normalized uncertainty for each fit parameter as a function of the noise magnitude. Which parameter appears to be the most sensitive to noise? the least?
  • Using the expressions we derived in class for dsDNA concentration at a given temperature, derive an expression for the melting temperature, $ T_m $in terms of $ \Delta H $ and $ \Delta S $. The melting temperature is defined as the temperature for which the fraction of double stranded DNA equals 0.5. Calculate the melting temperature for each noise level from 0.01 to 0.05.

Part 3: (individually)

    1. Plot your fluorescence data as a function of block temperature, your model function with initial guesses, and your model function with best fit parameters on the same set of axes.
    2. Record your estimates for ΔH and ΔS. Calculate Tm.
    3. How do these thermodynamic parameters compare to the predicted values you obtained from DINAmelt or OligoCalc?
  • For one fit result, plot the residuals vs.
    1. time,
    2. temperature, and
    3. fluorescence.
  • Write a function to convert fluorescence into fraction of double stranded DNA. For at least one experimental trial, plot $ \text{DnaFraction}_{inverse-model} $ versus the sample temperature $ T_{sample} $ (example plot). On the same set of axes plot DnaFraction versus $ T_{sample} $ using the best-fit values of ΔH and ΔS. Finally, plot simulated dsDNA fraction vs. temperature using data from DINAmelt or another melting curve simulator.
  • Explain the statistical method you will use to identify your group's unknown sample in Assignment 10.
    1. State the acceptance/rejection criteria for any hypotheses tests you will use.
    2. This page may be a helpful reference: Identifying the unknown DNA sample.

Finally,

  • Append all of the code (not yet included) that you wrote for Parts 1, 2 and 3 of this assignment.


Navigation

Back to 20.309 Main Page