Difference between revisions of "DNA Melting: Simulating DNA Melting - Intermediate Topics"

From Course Wiki
Jump to: navigation, search
(Introduction)
(Lab manual sections)
 
(94 intermediate revisions by 4 users not shown)
Line 1: Line 1:
<div style="padding: 10px; width: 774px; border: 3px solid #000000;">
+
[[Category:20.309]]
==Introduction==
+
[[Category:DNA Melting Lab]]
This tutorial will show you how to simulate the DNA melting experiment with Matlab. The raw data produced by the experimental apparatus consists of periodic measurements of two voltages. At the end of this tutorial, you should be able to produce simulated datasets that are similar to what you will record in the lab. You can use a simulated dataset to help develop and debug your data analysis scripts.
+
{{Template:20.309}}
 +
[[Category:Matlab]]
  
 +
===Add some noise to the RTD signal===
 +
You will certainly notice noise on the signals you capture in the lab. Noise will be an ''extremely'' important (and annoying) factor in this experiment because the melting curves will be differentiated. Differentiation emphasizes noise. In order to ensure that your data analysis script effectively handles the noise you are likely to observe, simulated random and impulse noise will be added to the signals.
  
It will be essential to have working data analysis scripts before you begin running experiments in the lab. When you have completed this tutorial, move on to the [[Matlab:DNA melting data analysis|data analysis tutorial]].
+
To simulate noise, we can use Matlab's <tt>randn</tt> function, which generates normally distributed random numbers with unity standard deviation. The two arguments tell <tt>randn</tt> to create a vector of random samples with size 1xlength(V_RTD).  
 
+
The procedure for creating the simulation is:
+
# Write a function to compute the equilibrium dsDNA contration vs. temperature
+
# Model sample cooling versus time during an experimental run
+
# Compute the RTD voltage from temperature vs. time data
+
# Add noise to the RTD signal
+
# Compute relative fluorescence intensity from temperature vs. time data
+
# Model photobleaching (optional)
+
# Scale the fluorescence intensity signal and add noise
+
 
+
Open a Matlab window and follow along as you read this tutorial. You will not have to write any code for this tutorial. Code from this page can be copied and pasted into the Matlab command window. M-files are available in the course locker.
+
 
+
If you prefer to work in Python, please see [[Python:Simulating DNA Melting|this page]].
+
 
+
===Before you begin===
+
*Read the [[Lab Manuals:DNA Melting|DNA Melting Lab Manual]].
+
*Review the [[DNA Melting Thermodynamics|DNA Melting Thermodynamics Notes]].
+
*Review [[Introduciton to Convolution and Digital Filtering|Introduction to Convolution and Digital Filtering]].
+
*If you need a review of Matlab fundamentals, see [[Matlab:Matlab Fundamentals|Matlab Fundamentals]] tutorial.
+
 
+
===Matlab skills===
+
* Array arithmetic (element-by-element) operators (<tt>+</tt>, <tt>-</tt>, <tt>.*</tt>, <tt>./</tt>)
+
* Functions declared in an m-file
+
* Numerical differentiation
+
 
+
==DNA annealing==
+
===Write a function to compute the fraction of dsDNA at a given temperature===
+
The dsDNA concentration at a given temperature can be computed with the equation that we derived [[DNA Melting Thermodynamics#Simulating DNA Melting|in class]]:
+
:<math>
+
K_{eq} = e^\left [\frac{\Delta S^{\circ}}{R} - \frac{\Delta H^{\circ}}{R T} \right ]
+
</math>
+
:<math>
+
f = \frac{1 + C_T K_{eq} - \sqrt{1 + 2 C_T K_{eq}}}{C_T K_{eq}}
+
</math>
+
 
+
Since this is a very complicated equation and we will use it many times, it makes sense to write a Matlab function. An m-file containing the code shown here is available online [http://web.mit.edu/~20.309/www/Matlab/DnaFraction.m here]. Remember that Matlab functions must be saved in your current directory or another directory in the Matlab path. The name of the file must be the function name appended with ".m".
+
 
+
<include src="http://web.mit.edu/~20.309/www/Matlab/DnaFraction.m" />
+
 
+
The <tt>DnaFraction</tt> function takes an optional fifth argument for the gas constant so that different unit systems can be used. The Matlab function [http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/nargin.html nargin] is used to determine whether a fifth argument has been supplied. If not, R is set to the default value of 1.987.
+
 
+
Using the element-by-element divide operator makes it possible for <tt>DnaFraction</tt> to accept either a single value of temperature or a vector of temperature samples. (As a general rule, use the ./ operator instead of / unless you intend to do a matrix divide.)
+
 
+
===Ideal melting curve===
+
Since '''DnaFraction''' is written to accept a vector temperature argument, generating a melting curve is simple:
+
 
+
<pre>
+
ideal = struct();                          % create an empty structure to hold the results
+
ideal.temperature = 0:99 + 273.15;        % temperature axis in Kelvin: 100 points(0-99C)
+
 
+
% call DnaFraction function
+
% arguments: concentration, temperature vector, delta S, delta H
+
% (optional R argument not supplied)
+
 
+
ideal.dsDnaFraction = DnaFraction(1E-6, ideal.temperature, -184, -71E3);
+
 
+
plot(ideal.temperature, ideal.dsDnaFraction);
+
title('Ideal DNA Melting Curve');
+
xlabel('Temperature (K)');
+
ylabel('dsDNA Fraction');
+
</pre>
+
[[Image:Ideal DNA Melting Curve.jpg|350 px|Ideal DNA melting curve]]
+
 
+
Using a structure to store results (<tt>ideal</tt>) allows data from several simulations to be kept in the workspace without getting them confused. The struct function creates an empty structure. (Since variables are dynamically typed in Matlab, this command is optional; however, explicitly creating the structure can make your code easier to read and better performing.)
+
 
+
Experiment with different values of <math>\Delta S^{\circ}</math> and <math>\Delta H^{\circ}</math> to see how the curve changes. You can use [http://www.basic.northwestern.edu/biotools/oligocalc.html OligoCalc] to estimate values for different sequences.
+
 
+
One method of estimating ''T<sub>m</sub>'' is to find the temperature where the dsDNA fraction equals 1/2. Another method is to differentiate the melting curve. ''T<sub>m</sub>'' occurs at the peak of the derivative &mdash; where the dsDNA fraction is changing the fastest. (Why might you prefer one method to the other when estimating ''T<sub>m</sub>'' from experimental data?)
+
 
+
===Numerical differentiation===
+
The fluorescence function ''f<sub>N</sub>'' is a sampled version of the continuous function ''f(T)'' such that ''f<sub>N</sub> = f(N&Delta;T)'', where ''&Delta;T'' is the sampling interval (1&deg; C, in this instance). One method for computing a numerical approximation to ''df/dT'' is the finite difference:
+
:<math>
+
{f'}_N = \frac{f_{N+1}-F_{N}}{\Delta T}</math>.
+
''dF/dT'' is guaranteed to take on the value given by ''f'<sub>N</sub>'' at some point between ''N&Delta;T'' and ''(N+1)&Delta;T''; however, this happy coincidence can occur at any place in the interval. A simple but effective technique is to guess that equality occurs in the middle of the interval. This guess is close enough in most cases.
+
 
+
The code below computes the finite difference (<tt>ideal.dDsDnaFraction</tt>) and generates an associated temperature vector (<tt>ideal.dTemperature</tt>) with values centered between the original samples. Note that both vectors are one element shorter than the original data.
+
 
+
<pre>
+
ideal.dTemperature = ideal.temperature(1:99) + 0.5; % create new temperature vector
+
ideal.dDsDnaFraction = diff(ideal.dsDnaFraction);  % take the difference
+
 
+
plot(ideal.dTemperature, -1 * ideal.dDsDnaFraction);
+
title('Derivative of Ideal DNA Melting Curve');
+
xlabel('Temperature (K)');
+
ylabel('Delta dsDNA Fraction');
+
</pre>
+
[[Image:Derivative of Ideal DNA Melting Curve.jpg|350 px|Derivative of ideal DNA melting curve]]
+
 
+
In the previous section, the plot shows that the dsDNA fraction equals 0.5 at about 332.5 degrees. The peak of the derivative also occurs at the same temperature. (The curve is poltted inverted.)
+
 
+
==Simulate an experimental run==
+
===Sample cooling===
+
During each experimental run, the range of temperatures will be provided by natural cooling of the sample after it has been heated in a bath. At the beginning of an experimental run, the DNA + fluorescent dye sample is heated in a bath to around 90&deg;C. Over a period of about 15 minutes, the sample cools to room temperature (about 20&deg;C). Cooling can be approximated by an exponential function with a time constant of 3 minutes. The equation for temperature (in &deg;K) versus time is:
+
 
+
:<math>
+
T(t) = 90 e ^ {-\frac{t}{180}} + 293
+
</math>
+
 
+
In Matlab:
+
 
+
<pre>
+
simulationLength = 900;                              % 15 minute simulation
+
sampleRate = 1;                                      % 1 Hz sample rate
+
sim1 = struct();                                      % create empty struct for sim resluts
+
sim1.time = 0:(1./sampleRate):simulationLength;      % create time vector (units: seconds)
+
sim1.temperature = 90 * exp(-sim1.time ./ 180) + 293; % exponential cooling from 383K to 293K
+
 
+
plot(sim1.time, sim1.temperature, 'r');
+
xlabel('Time (sec)');
+
ylabel('Temperature (K)');
+
title('Simulated Temperature vs. Time');
+
</pre>
+
[[Image:DNA Sample Cooling.jpg|350 px|Simulated sample cooling curve]]
+
 
+
If you have any questions about the element-by-element division operator in Matlab (./) see:[[Matlab:Division operators|this page]].
+
 
+
===Model the RTD and voltage divider===
+
The resistance of the RTD is given by the equation ''R<sub>RTD</sub> = 1000 + 3.85(T - 273)''. In the experimental apparatus, the RTD is hooked up as one element of a voltage divider with another resistor, R. The power supply is 15 volts. Thus, the output voltage is: ''V<sub>RTD</sub> = 15 * R<sub>RTD</sub>/(R + R<sub>RTD</sub>)''. The following code converts the temperature vector <tt>sim1.temperature</tt> into the equivalent RTD voltage measurements, assuming R = 20K&Omega;.
+
 
+
<pre>
+
sim1.R_RTD = 1000 + 3.85 * (sim1.temperature - 273);      % Compute RTD resistance in Ohms
+
sim1.V_RTD = 15 * sim1.R_RTD ./ (20E3 + sim1.R_RTD);      % Compute V_RTD in Volts
+
plot(sim1.time, sim1.V_RTD, 'r');
+
xlabel('Time (sec)');
+
ylabel('RTD Voltage (V)');
+
title('Simulated RTD Voltage vs. Time');
+
</pre>
+
 
+
[[Image:RTD Voltage vs Time.jpg|350px|Simulated RTD voltage plot]]
+
 
+
===Add some noise===
+
You will certainly notice some noise on the signals you capture. (See: [[Where does electronic noise come from?]].) Noise will be an ''extremely'' important (and annoying) factor in this experiment because the melting curve will be differentiated. Differentiation emphasizes noise. (Think about the [[Matlab:Numerical differentiation|frequency response of a differentiator]].) In order to ensure that your data analysis script effectively handles the noise you are likely to observe, simulated random and impulse noise will be added to the signals.
+
 
+
To simulate noise, we can use Matlab's [http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/randn.html '''randn'''] function, which generates normally distributed random numbers. The two arguments tell '''randn''' to create a vector of random samples with size 1xlength(V_RTD).  
+
  
 
<pre>
 
<pre>
noise = 0.001 * randn(1,length(sim1.V_RTD));    % Create noise vector, standard deviation=1mV
+
noise = 0.001 * randn(1,length(sim1.V_RTD))';    % Create noise vector, standard deviation=1mV
 
sim1.V_RTD = sim1.V_RTD + noise;                % Add simulated noise to the voltage signal
 
sim1.V_RTD = sim1.V_RTD + noise;                % Add simulated noise to the voltage signal
  
 +
figure(8)
 
plot(sim1.time, sim1.V_RTD, 'r');
 
plot(sim1.time, sim1.V_RTD, 'r');
 
xlabel('Time (sec)');
 
xlabel('Time (sec)');
Line 150: Line 20:
 
</pre>
 
</pre>
  
[[Image:RTD Voltage with Noise vs Time.jpg|350px|Simulated RTD voltage plot]]
+
[[Image:RTDNoisyHeatCoolCurve.png|350px|Simulated RTD voltage plot]]
  
==Model the sample fluorescence signal==
+
Revisit your code from [[DNA Melting: Simulating DNA Melting - Basics]] and using the approach above add noise to your RTD voltage signal, then explore the behavior of your code for different levels of noise. In particular, you can add differentiation of the curve from Figure 6 and discover the issues with attempts to differentiate that signal without additional processing. You will likely find that you first need to either sort or bin your temperature data so that the f vs T function can be differentiated. Then also run this data through your fit function and observe any difference in the fit statistics.
Now that we have a simulated RTD voltage signal, the next step is to simulate the fluorescence signal. Remember that fluorescence intensity is proportional to the amount of dsDNA in the sample.  
+
  
===Create a dsDNA fraction (relative fluorescence) vs. time curve===
+
You are also encouraged to add noise at additional frequencies that you think might be present in the lab, then attempt to filter that data in Matlab to test the effectiveness of the RTD voltage filter design that you built in the lab. To do this, generate the noise (perhaps 60 Hz, pink noise, possibly high frequency noise from switching in the overhead lights, or even noise on the earth ground that could be coming from equipment elsewhere in the building) and add it to the signal from Figure 8. Remember it is a long way to the earth from our lab. Next filter that signal using your transfer function and the matlab command lsim. Then again run this data through your fit function.
Fluorescence intensity is assumed to be proportional to the concentration of dsDNA. The magnitude of fluorescence intensity is the highest when all of the DNA is annealed, which corresponds to a value of 1 on the dsDNA fraction curve.  
+
  
<pre>
+
===Add some noise to the sample fluorescence signal===
% DnaFraction arguments: concentration,temp. vector, delta S, delta H (,optional R)
+
Now that we have added noise to the RTD voltage signal, the next step is to do the same for the fluorescence signal.  
sim1.dsDnaFraction = DnaFraction(1E-6, sim1.temperature, -184, -71E3); 
+
 
+
plot(sim1.time,sim1.dsDnaFraction)
+
title('Fraction dsDNA vs. Time');
+
xlabel('Time (s)');
+
ylabel('dsDNA Fraction');
+
</pre>
+
  
[[Image:Simulated DNA Fraction.jpg|350px|Simulated RTD voltage plot]]
+
The output range of the transimpedance amplifier depends on many factors, including: dsDNA quantity, illumination intensity, fluorophore efficiency, optical system gain, photodiode efficiency, and electronic amplifier gain. All of the factors except dsDNA concentration essentialy remain constant. Thus, the output of the transimpedance amplifier can be modeled by a single overall system gain factor times the dsDNA fraction. In addition, the amplifier output may be offset by a constant DC value, which is simple to model. (If the output of your apparatus drifts significantly over time, you will need to improve your implementation and methods to minimize this.)
  
===Scale the curve and add some noise===
 
 
The output range of the transimpedance amplifier over the course of an experimental run can vary depending on several factors, including the amplifier gain, optical system gain, etc... A typical setup might produce values between, say, 0.5 and 7.2 Volts. The simulation code generates two random numbers to set the range and minimum value of the data. Your data analysis script will have to normalize the amplifier output voltage to a fractional value between 0 and 1. (What assumptions will you make about dsDNA concentration to do the normalization?)
 
The output range of the transimpedance amplifier over the course of an experimental run can vary depending on several factors, including the amplifier gain, optical system gain, etc... A typical setup might produce values between, say, 0.5 and 7.2 Volts. The simulation code generates two random numbers to set the range and minimum value of the data. Your data analysis script will have to normalize the amplifier output voltage to a fractional value between 0 and 1. (What assumptions will you make about dsDNA concentration to do the normalization?)
  
In addition to Gaussian noise, the high gain amplifier can produce impulse noise. This can be caused by small static discharges, for example when a charged person sits down in a chair in the lab. It is a good idea to include some filtering for impulse noise in your data analysis script, in case your apparatus is susceptible to impulse noise.
+
In addition to white Gaussian noise <ref> [http://en.wikipedia.org/wiki/White_noise White noise]</ref> (including shot noise from the photodiode and Johnson noise in your feedback resistors), the high gain amplifier sometimes produces impulse noise. This can be caused by small static discharges, for example when a charged person sits down in a chair in the lab, or by machinery switching on or off somewhere else in the building. Each of these latter noise sources would generally show up on your ground reference, even when connected to the building earth ground, an often imperfect grounding source. Your goal is first to provide as much filtering as possible in your circuitry. However, these noise sources are difficult to completely eliminate, so it is a good idea to include some filtering in your data analysis script.
  
 
<pre>
 
<pre>
Line 179: Line 39:
 
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
 
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
 
sim1.V_f = Range * sim1.dsDnaFraction + Minimum;    % Scale output
 
sim1.V_f = Range * sim1.dsDnaFraction + Minimum;    % Scale output
noise = 0.01 * randn(1,length(sim1.V_f));           % Random noise, standard deviation=10mV
+
noise = 0.01 * randn(1,length(sim1.V_f))';           % Random noise, standard deviation=10mV
 
sim1.V_f = sim1.V_f + noise;                        % Add random noise
 
sim1.V_f = sim1.V_f + noise;                        % Add random noise
noise = rand(1,length(sim1.V_f));                   % Uniformly distributed  random numbers
+
noise = rand(1,length(sim1.V_f))';                   % Uniformly distributed  random numbers
 
ImpulseNoiseProbability = 1 ./ 75;                  % Probability of impulse noise
 
ImpulseNoiseProbability = 1 ./ 75;                  % Probability of impulse noise
 
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
 
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
sim1.V_f = max(sim1.V_f, noise);                    % Add in impulse noise
+
sim1.V_fNoisy = max(sim1.V_f, noise);                    % Add in impulse noise
  
plot(sim1.time, sim1.V_f)
+
figure(9)
 +
plot(sim1.time, sim1.V_fNoisy)
 
title('Fluorescence Voltage vs. Time');
 
title('Fluorescence Voltage vs. Time');
 
xlabel('Time (s)');
 
xlabel('Time (s)');
 
ylabel('Fluorescence Voltage (V)');
 
ylabel('Fluorescence Voltage (V)');
 
</pre>
 
</pre>
[[Image:Simulated Fluorescence Output.jpg|350px|Simulated RTD voltage plot]]
 
  
===Reformat the data===
+
[[Image:SimNoisyDNAFractionVsTime.png|350px|Simulated RTD voltage plot]]
When you import data from the LabView program that records DNA melting data, the format will be an N x 2 matrix with <math>V_T</math> values in column 1 and <math>V_f</math> in column 2. To put the simulated data in the same format, you can use Matlab's transpose operator (.') and the horizontal concatenation function [http://www.mathworks.com/access/helpdesk/help/techdoc/ref/horzcat.html horzcat]:
+
  
<pre>
+
As in the last section, revisit your code from DNA Melting: Simulating DNA Melting - Basics but this time add noise to your fluorescence voltage signal, then explore the behavior of your code for different levels of noise. Then again run this data through your fit function.
simData = horzcat( sim1.V_RTD .', sim1.V_f .');
+
size(simData)
+
  
 +
Here too you are encouraged to add noise at additional frequencies that you think might be present in the lab. What could you do to minimize the effects of these noise sources? Think of solutions both at the hardware level and at the software level and model both of them.
  
ans =
+
===Add photobleaching effects===
 +
Each time a SYBR Green molecule is excited by a photon, there is a small chance that it will be destroyed and permanently lose its ability to fluoresce. The effect of photobleaching on the measured fluorescence intensity can be significant. To demonstrate this, a model for photobleaching may be added to the simulation. In the model, a small, fixed percentage of the fluorophores are mathematically destroyed each time sample. This can be implemented with a simple integration. The derivation of this model can be found [http://dl.dropbox.com/u/12957607/Bleaching%20Correction%20Derivation.pdf here].
  
  901    2
+
The model does not completely describe observations, so in the code below, we have also added an additional time-linear bleaching relationship that has helped to explain the observed behavior. The fraction of fluorophores destroyed per time sample is set by the variable <tt>bleachingCoefficient</tt>, while the coefficient for the linear case is <tt>bleachingConstant</tt>.
</pre>
+
 
+
Please see [[Common Matlab confusions#Transpose operator|this page]] if you are uncertain about the .' operator.
+
 
+
Congratulations. You now have a simulated dataset that should be quite similar to what you measure in the lab.
+
 
+
==Improving the model: photobleaching (optional)==
+
Each time a SYBR Green molecule is excited by a photon, there is a small chance that it will be destroyed and permanently lose its ability to fluoresce. The effect of photobleaching on the measured fluorescence intensity can be significant. To demonstrate this, a model for photobleaching may be added to the simulation. In the model, a small, fixed percentage of the fluorophores are mathematically destroyed each time sample. This simple difference equation can be implemented with a <tt>for</tt>&#0133;<tt>end</tt> loop.
+
 
+
:<math>
+
{\hat{f}}_{N} = B {\hat{f}}_{N-1} + f_{N} - f_{N-1}
+
</math>
+
where,
+
:<math>{\hat{f}}_N</math> is the Nth sample of the bleached relative fluorescence,
+
:<math>\left . B \right .</math> is a bleaching constant equal to the fraction of fluorophore not destroyed per time sample, and
+
:<math>\left . f_N \right .</math> is the Nth sample of the dsDNA fraction (relative fluorescence) curve.
+
 
+
The fraction of fluorophores destroyed per time sample is set by the variable <tt>bleachingConstant</tt>. In the example code, the constant is set so that half of the fluorophores will be destroyed over the course of the simulated experimental run.
+
  
 
<pre>
 
<pre>
sim1.bleachingConstant = (0.5)^(1/900);              % 1/2 of fluorphores fail over expmt.
+
sim1.filtV_f = medfilt1(sim1.V_fNoisy,15); % temporarily remove shot noise
sim1.diffDsDnaFraction = diff(sim1.dsDnaFraction);   % compute difference
+
sim1.fluorescenceNorm = sim1.filtV_f / max(sim1.filtV_f); % normalize photo diode signal to dsDNA fraction
sim1.dsDnaFractionBleached(1) = 0.05;               % set initial condition
+
sim1.bleachingCoefficient = 3e-5;              % 1/s, estimated from previous experiments
 +
sim1.bleachingConstant = 3e-4;              % 1/s, estimated from previous experiments
 +
sim1.bleachingCorrection = (1 - sim1.bleachingCoefficient ...
 +
    .* cumsum(1 .* sim1.fluorescenceNorm)) ...
 +
    .* (1 - sim1.bleachingConstant .* (0:(length(sim1.fluorescenceNorm)...
 +
    - 1))');   % calclation of bleaching correction by two methods, exponetial and linear
 +
sim1.fluorescenceBleached = sim1.fluorescenceNorm ...
 +
    .* sim1.bleachingCorrection;
 +
Range = 5 + 4 * rand();                              % Range between 5 and 9
 +
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
 +
sim1.V_fScaled = Range * sim1.fluorescenceBleached + Minimum;    % Scale output
 +
noise = rand(1,length(sim1.V_fScaled))';                  % Uniformly distributed  random numbers
 +
ImpulseNoiseProbability = 1 ./ 75;                  % Probability of impulse noise
 +
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
 +
sim1.V_fBleached = max(sim1.V_fScaled, noise);                     % Add in impulse noise
  
% for loop to compute difference equation
+
figure(10)
for ii = 2:length(sim1.dsDnaFraction)
+
plot(sim1.time, sim1.V_fBleached)
    sim1.dsDnaFractionBleached(ii) = sim1.dsDnaFractionBleached(ii - 1) * ...
+
          sim1.bleachingConstant + sim1.diffDsDnaFraction(ii - 1);
+
end
+
 
+
plot(sim1.time, sim1.dsDnaFractionBleached)
+
 
title('Relative Fluorescence vs. Time with Bleaching');
 
title('Relative Fluorescence vs. Time with Bleaching');
 
xlabel('Time (s)');
 
xlabel('Time (s)');
Line 240: Line 90:
 
</pre>
 
</pre>
  
[[Image:Relative Fluorescence With Photobleaching.jpg|350px|Fluorescence vs. time with photobleaching]]
+
[[Image:SimNoisyBleachedDNAFractionVsTime.png|350px|Fluorescence vs. time with photobleaching]]
 
+
The value of the first element of the vector <tt>sim1.dsDnaFractionBleached</tt> sets the initial condition for the difference equation &mdash; the relative fluorescence at the beginning of the simulation. Because SYBR green exhibits a small amount of intrinsic fluorescence when it is not bound to dsDNA, the initial value of <tt>sim1.dsDnaFractionBleached</tt> is set to a small number to model this effect.
+
 
+
Experiment with the bleaching constant and intrinsic fluorescence values to see how they distort the melting curve. Increasing the bleaching constant is equivalent to turning down the illumination intensity: fewer fluorphores will be destroyed in a given time. Unfortunately, the overall fluorescence intensity also decreases, necessitating higher amplifier gain and decreased signal to noise ratio. How will you decide on the optimal illumination and amplification levels in the lab? Can you use this model to correct for photobleaching effects in your data? How can you determine the bleaching constant experimentally?
+
 
+
If you would like to model photobleaching in your data analysis, follow the steps above for scaling the relative fluorescence signal, adding noise, and reformatting.
+
 
+
==Data analysis advice==
+
In outline, the steps to produce a melting curve from <math>V_{RTD}</math> and <math>V_f</math> are:
+
# Filter V_RTD and V_f to remove noise.
+
# Transform RTD voltage to temperature.
+
# Transform photodiode current to relative fluorescence.
+
# Ensure that the melting function is uniquely values by combining samples with identical temperature values.
+
# Differentiate the resulting function with respect to ''temperature''.
+
 
+
T<sub>m</sub> can be estimated directly from the melting curve or by finding the peak of the melting curve's derivative with respect to temperature. (Why would you prefer one method over the other?)
+
Your data analysis script will have to convert the RTD and fluorescence intensity voltage signals into estimates of the temperature and dsDNA fraction.
+
 
+
 
+
===Filtering===
+
If you are not familiar with implementing digital filters in Matlab, review the [[Matlab:Convolution and Digital Filtering|Convolution and digital filtering in Matlab]] tutorial.
+
 
+
There are (at least)4 different convolution functions and 3 filtering functions in Matlab. Choosing the correct one is not always straightforward.
+
Initial conditions
+
Group delay
+
 
+
===Uniquification===
+
Sorting
+
Binning
+
  
===Numerical differentiation===
+
==References==
diff function
+
<references />
noise
+
  
</div>
+
{{Template:20.309 bottom}}

Latest revision as of 21:46, 17 August 2017

20.309: Biological Instrumentation and Measurement

ImageBar 774.jpg

Add some noise to the RTD signal

You will certainly notice noise on the signals you capture in the lab. Noise will be an extremely important (and annoying) factor in this experiment because the melting curves will be differentiated. Differentiation emphasizes noise. In order to ensure that your data analysis script effectively handles the noise you are likely to observe, simulated random and impulse noise will be added to the signals.

To simulate noise, we can use Matlab's randn function, which generates normally distributed random numbers with unity standard deviation. The two arguments tell randn to create a vector of random samples with size 1xlength(V_RTD).

noise = 0.001 * randn(1,length(sim1.V_RTD))';    % Create noise vector, standard deviation=1mV
sim1.V_RTD = sim1.V_RTD + noise;                % Add simulated noise to the voltage signal

figure(8)
plot(sim1.time, sim1.V_RTD, 'r');
xlabel('Time (sec)');
ylabel('RTD Voltage (V)');
title('Simulated RTD Voltage with Noise vs. Time');

Simulated RTD voltage plot

Revisit your code from DNA Melting: Simulating DNA Melting - Basics and using the approach above add noise to your RTD voltage signal, then explore the behavior of your code for different levels of noise. In particular, you can add differentiation of the curve from Figure 6 and discover the issues with attempts to differentiate that signal without additional processing. You will likely find that you first need to either sort or bin your temperature data so that the f vs T function can be differentiated. Then also run this data through your fit function and observe any difference in the fit statistics.

You are also encouraged to add noise at additional frequencies that you think might be present in the lab, then attempt to filter that data in Matlab to test the effectiveness of the RTD voltage filter design that you built in the lab. To do this, generate the noise (perhaps 60 Hz, pink noise, possibly high frequency noise from switching in the overhead lights, or even noise on the earth ground that could be coming from equipment elsewhere in the building) and add it to the signal from Figure 8. Remember it is a long way to the earth from our lab. Next filter that signal using your transfer function and the matlab command lsim. Then again run this data through your fit function.

Add some noise to the sample fluorescence signal

Now that we have added noise to the RTD voltage signal, the next step is to do the same for the fluorescence signal.

The output range of the transimpedance amplifier depends on many factors, including: dsDNA quantity, illumination intensity, fluorophore efficiency, optical system gain, photodiode efficiency, and electronic amplifier gain. All of the factors except dsDNA concentration essentialy remain constant. Thus, the output of the transimpedance amplifier can be modeled by a single overall system gain factor times the dsDNA fraction. In addition, the amplifier output may be offset by a constant DC value, which is simple to model. (If the output of your apparatus drifts significantly over time, you will need to improve your implementation and methods to minimize this.)

The output range of the transimpedance amplifier over the course of an experimental run can vary depending on several factors, including the amplifier gain, optical system gain, etc... A typical setup might produce values between, say, 0.5 and 7.2 Volts. The simulation code generates two random numbers to set the range and minimum value of the data. Your data analysis script will have to normalize the amplifier output voltage to a fractional value between 0 and 1. (What assumptions will you make about dsDNA concentration to do the normalization?)

In addition to white Gaussian noise [1] (including shot noise from the photodiode and Johnson noise in your feedback resistors), the high gain amplifier sometimes produces impulse noise. This can be caused by small static discharges, for example when a charged person sits down in a chair in the lab, or by machinery switching on or off somewhere else in the building. Each of these latter noise sources would generally show up on your ground reference, even when connected to the building earth ground, an often imperfect grounding source. Your goal is first to provide as much filtering as possible in your circuitry. However, these noise sources are difficult to completely eliminate, so it is a good idea to include some filtering in your data analysis script.

Range = 5 + 4 * rand();                              % Range between 5 and 9
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
sim1.V_f = Range * sim1.dsDnaFraction + Minimum;     % Scale output
noise = 0.01 * randn(1,length(sim1.V_f))';           % Random noise, standard deviation=10mV
sim1.V_f = sim1.V_f + noise;                         % Add random noise
noise = rand(1,length(sim1.V_f))';                   % Uniformly distributed  random numbers
ImpulseNoiseProbability = 1 ./ 75;                   % Probability of impulse noise
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
sim1.V_fNoisy = max(sim1.V_f, noise);                     % Add in impulse noise

figure(9)
plot(sim1.time, sim1.V_fNoisy)
title('Fluorescence Voltage vs. Time');
xlabel('Time (s)');
ylabel('Fluorescence Voltage (V)');

Simulated RTD voltage plot

As in the last section, revisit your code from DNA Melting: Simulating DNA Melting - Basics but this time add noise to your fluorescence voltage signal, then explore the behavior of your code for different levels of noise. Then again run this data through your fit function.

Here too you are encouraged to add noise at additional frequencies that you think might be present in the lab. What could you do to minimize the effects of these noise sources? Think of solutions both at the hardware level and at the software level and model both of them.

Add photobleaching effects

Each time a SYBR Green molecule is excited by a photon, there is a small chance that it will be destroyed and permanently lose its ability to fluoresce. The effect of photobleaching on the measured fluorescence intensity can be significant. To demonstrate this, a model for photobleaching may be added to the simulation. In the model, a small, fixed percentage of the fluorophores are mathematically destroyed each time sample. This can be implemented with a simple integration. The derivation of this model can be found here.

The model does not completely describe observations, so in the code below, we have also added an additional time-linear bleaching relationship that has helped to explain the observed behavior. The fraction of fluorophores destroyed per time sample is set by the variable bleachingCoefficient, while the coefficient for the linear case is bleachingConstant.

sim1.filtV_f = medfilt1(sim1.V_fNoisy,15); % temporarily remove shot noise
sim1.fluorescenceNorm = sim1.filtV_f / max(sim1.filtV_f); % normalize photo diode signal to dsDNA fraction
sim1.bleachingCoefficient = 3e-5;              % 1/s, estimated from previous experiments
sim1.bleachingConstant = 3e-4;               % 1/s, estimated from previous experiments
sim1.bleachingCorrection = (1 - sim1.bleachingCoefficient ...
    .* cumsum(1 .* sim1.fluorescenceNorm)) ...
    .* (1 - sim1.bleachingConstant .* (0:(length(sim1.fluorescenceNorm)...
    - 1))');    % calclation of bleaching correction by two methods, exponetial and linear
sim1.fluorescenceBleached = sim1.fluorescenceNorm ...
    .* sim1.bleachingCorrection;
Range = 5 + 4 * rand();                              % Range between 5 and 9
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
sim1.V_fScaled = Range * sim1.fluorescenceBleached + Minimum;     % Scale output
noise = rand(1,length(sim1.V_fScaled))';                   % Uniformly distributed  random numbers
ImpulseNoiseProbability = 1 ./ 75;                   % Probability of impulse noise
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
sim1.V_fBleached = max(sim1.V_fScaled, noise);                     % Add in impulse noise

figure(10)
plot(sim1.time, sim1.V_fBleached)
title('Relative Fluorescence vs. Time with Bleaching');
xlabel('Time (s)');
ylabel('Relative Fluorescence');

Fluorescence vs. time with photobleaching

References

  1. White noise