Difference between revisions of "20.109(S17):Examine quantitative PCR results (Day9)"

Latest revision as of 15:44, 12 April 2017

20.109(S17): Laboratory Fundamentals of Biological Engineering


Introduction

Today is the final laboratory session for Module 2! You have completed all of the bench work for your research; however, there is still data analysis to complete for the cell viability assay and the qPCR experiment. In addition to plotting and normalizing the data, you will complete statistical analysis to determine the significance of your results.

Statistics are mathematical tools used to analyze, interpret, and organize data. The specific tools that you will use are confidence intervals (CI) and the Student's t-test. To begin, review the following definitions:

Mean (or average) is defined as:

With infinite data, the mean (χ_ι) approaches the true mean (μ).
Standard deviation measures the variation in the data and is defined as:

With infinite data, the standard deviation (s) approaches the true standard deviation (σ).
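The equation images from the original page are not reproduced in this text. For reference, the standard sample definitions are sketched below, writing x̄ for the calculated mean (χ_ι in the text), x_i for the individual measurements, and n for the number of measurements. The n - 1 denominator is an assumption here, chosen because it is the usual convention for small samples; the course's own equations may be typeset differently.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}{n-1}}
```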

Because a standard deviation is only a reliable description of spread when enough data have been collected to approximate a normal curve, you will use confidence intervals to report how well your results estimate the true mean. A confidence interval is a range, calculated from your data, that is expected to contain the true mean at a specified level of confidence. Put simply, you can use the calculated mean to define a range that likely contains the true mean.

Confidence interval is defined as:

In your data, you should use the CI to generate error bars due to the low n. Be sure to report which confidence level was used to calculate the intervals reported. So, what does this all mean in regard to the data you will report? As an example, if the calculated χ_ι of a data set equals 80 au with a 95% CI of ± 30 au, you can be 95% confident that μ lies between 50 au and 110 au, where au = arbitrary units. And how does this relate to s? For normally distributed data with a known μ, about 68% of the values fall within one σ of μ, so μ ± σ corresponds to a 68% interval.
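As a rough illustration of how such an interval can be computed for a small sample, the sketch below uses the common t-based form, mean ± t·s/√n, with the two-tailed 95% critical value of t for n - 1 degrees of freedom. This form is an assumption, since the original equation image is not available here. The function name `confidence_interval_95`, the lookup table `T_95`, and the replicate values are hypothetical and are not course data.

```python
import math
import statistics

# Two-tailed critical t values at the 95% confidence level, keyed by
# degrees of freedom (n - 1); values copied from a standard t table.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def confidence_interval_95(values):
    """Return (mean, half_width) of a t-based 95% CI for a small sample."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    half_width = T_95[n - 1] * s / math.sqrt(n)
    return mean, half_width


# Hypothetical triplicate measurements in arbitrary units (not course data).
replicates = [70.0, 80.0, 90.0]
mean, half_width = confidence_interval_95(replicates)
print(f"{mean:.1f} +/- {half_width:.1f} au (95% CI)")
```

The half-width returned here is the value you would use for the error bar above and below the mean. Note how wide the interval is with only three replicates, which is why the confidence level should always be reported alongside it.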

Lastly, you will use Student's t test to report if your data are statistically different between treatments.

Student's t test is defined as:

The value you calculate with the Student's t test equation is referred to as t_calculated. This t_calculated value is compared to the t_tabulated value in the t table, according to the appropriate n - 1 using the p-value for the two-tailed distribution (which assumes that you do not know how the data will shift). If the t_calculated value is greater than the t_tabulated, then the data sets are significantly different at the specified p-value. So, what does this all mean in regard to the data you will report? As an example, if the t_calculated for a data set with n - 1 = 10 is 3 (given that the t_tabulated is 2.228), then the data sets are different with a p-value ≤ 0.05. This means that, if the data sets were truly the same, a difference at least this large would be expected less than 5% of the time.
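The comparison of t_calculated with t_tabulated can be sketched as follows. Because the original equation image is not available here, this example assumes an unpaired, equal-variance (pooled) two-sample form with n1 + n2 - 2 degrees of freedom, which may differ from the equation and the n - 1 convention used in the course materials. The names `t_calculated` and `T_TABULATED_95` and the two data sets are hypothetical.

```python
import math
import statistics

# Two-tailed t_tabulated values for p = 0.05, keyed by degrees of freedom;
# values copied from a standard t table.
T_TABULATED_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
                  6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def t_calculated(a, b):
    """Return (t, degrees_of_freedom) for a pooled two-sample t test.

    Assumption: unpaired samples with similar variances.
    """
    n1, n2 = len(a), len(b)
    dof = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * statistics.variance(a)
                       + (n2 - 1) * statistics.variance(b)) / dof
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = abs(statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, dof


# Hypothetical measurements in arbitrary units (not course data).
untreated = [10.0, 12.0, 11.0]
treated = [15.0, 17.0, 16.0]

t_calc, dof = t_calculated(untreated, treated)
significant = t_calc > T_TABULATED_95[dof]
print(f"t_calculated = {t_calc:.3f}, t_tabulated = {T_TABULATED_95[dof]}, "
      f"significant at p <= 0.05: {significant}")
```

The same lookup reproduces the worked example in the text: a t_calculated of 3 exceeds the tabulated 2.228 for 10 degrees of freedom.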

Protocols

Part 1: Practice statistical analysis

Review these data from an experiment where cells were exposed to increasing amounts of radiation. Your goal is to determine if a statistically significant amount of DNA damage was induced. For the purpose of this exercise, the values in the spreadsheet are in arbitrary units of 'DNA damage', where the higher numbers indicate more damage.

When interpreting the statistics, consider how you may use the information to convince someone that the DNA damage was significant. You may find this spreadsheet, originally created by Prof. Bevin Engelward and modified by the 20.109 staff, helpful for this exercise. At a minimum, you should post a bar plot of the data with 95% confidence intervals and indicate if there is a statistically significant difference (i.e. provide a p-value) between conditions in your Benchling notebook.

Part 2: Analyze quantitative PCR data

As discussed in prelab, the qPCR experiment with your samples was not successful. This is likely due to degraded RNA, which is a very common issue given the prevalence, abundance, and stability of RNases. You will therefore use the data collected by the teaching faculty for this exercise and in your M2 Research Article.

Before you can apply the statistical tools from Part 1 to your data, you must first normalize the p21 expression levels. To account for any unintended biases in RNA purification and / or cDNA preparation, it is important to normalize the expression of the transcript of interest to expression of a housekeeping or constitutive gene. Ideally, the gene to which the data of interest are normalized is not responsive to the treatment tested. In our experiment, we used GAPDH because it is not expected to be responsive to etoposide treatment. How might you confirm this assumption?

1. Review the data in this spreadsheet.
   - The DLD-1 and BRCA2- RNA isolated from untreated and etoposide treated cells was probed using both the p21 and GAPDH primers.
   - Each reaction was completed in triplicate. Note: these are technical replicates.
   - The data are represented as the 'threshold cycle' C_T or amplification cycle at which SYBR Green fluorescent signal was detected (review M2D4 Introduction).
2. Normalize p21 expression to GAPDH expression (ΔC_T).
   - Subtract the GAPDH C_T value from the p21 C_T value using the appropriate treatment conditions, according to the screenshot below.
3. Exponentially transform each normalized value to the ΔC_T expression.
   - ΔC_T expression = 2^-ΔC_T.
4. Average the replicates for each treatment, then calculate the 95% CI and t-test p-value.
   - With this information, graph your data with error bars and include information concerning any statistical significance.
5. Are these results consistent with those from the RNA-seq data?
   - Load the data:
     - load("~/Desktop/RNA-seq data analysis/preprocessed_data.RData")
     - library("DESeq2")
   - Plot the reads for p21 (also called CDKN1A):
     - plotCounts(dds,"CDKN1A", intgroup="group")
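The normalization and transformation steps above can be sketched in a few lines of Python as an alternative to doing them in the spreadsheet. The C_T values below are invented for illustration and are not the course data. Pairing each p21 replicate with the GAPDH replicate in the same position is also an assumption, since the screenshot that defines the pairing is not reproduced here. The names `ct` and `relative_expression` are hypothetical.

```python
import statistics

# Hypothetical C_T values (NOT the course data): technical triplicates for
# one cell line, untreated and etoposide treated.
ct = {
    "untreated": {"p21": [26.0, 26.2, 25.8], "GAPDH": [18.0, 18.1, 17.9]},
    "etoposide": {"p21": [24.0, 24.1, 23.9], "GAPDH": [18.0, 18.2, 17.8]},
}


def relative_expression(p21_ct, gapdh_ct):
    """Return 2^-dCT for each replicate, where dCT = C_T(p21) - C_T(GAPDH).

    Assumption: replicates are paired by their position in the lists.
    """
    return [2 ** -(p21 - gapdh) for p21, gapdh in zip(p21_ct, gapdh_ct)]


# Normalize and transform every replicate, then average per treatment.
expression = {
    condition: relative_expression(values["p21"], values["GAPDH"])
    for condition, values in ct.items()
}
means = {condition: statistics.mean(vals) for condition, vals in expression.items()}

for condition, mean_expression in means.items():
    print(f"{condition}: mean 2^-dCT = {mean_expression:.5f}")
```

A lower C_T means the signal was detected at an earlier cycle, so a smaller ΔC_T corresponds to higher p21 expression relative to GAPDH. The per-replicate values in `expression` are what you would carry forward into the 95% CI and t test calculations.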

Part 3: Analyze cell viability assay data

Review your cell viability results from M2D4 (posted to the Discussion tab of the M2 main page). Use the statistical tools you learned in the above exercises to analyze the pooled class data for your M2 Research Article.

Navigation links

Next day: First day of M3

Previous day: Journal club II

@@ Line 6: / Line 6: @@
 Statistics are mathematical tools used to analyze, interpret, and organize data.  The specific tools that you will use are confidence intervals (CI) and the Student's t-test.  To begin, review the following definitions:
-*mean
+*Mean (or average) is defined as:
-*true mean
+[[Image:Sp17 20.109 M2D9 mean equation.png|thumb|center|500px|]]
-*standard deviation
+*With infinite data, the mean (&chi;<sub>&iota;</sub>) approaches the true mean (&mu;).
+*Standard deviation measures the variation in the data and is defined as:
+[[Image:Sp17 20.109 M2D9 stddev equation.png|thumb|center|500px|]]
+*With infinite data, the standard deviation (''s'') approaches the true standard deviation (&sigma;).
-Confidence intervals...error bars
+Because standard deviation is only justified when sufficient data have been collected to generate a normal curve, you will use confidence intervals to report the likelihood that your results predict the true mean. A confidence interval is a defined interval that is calculated to define the true mean to a specified level of confidence.  Simply, it is possible to define a range in your data set that likely contains the true mean based on the calculated mean.
+*Confidence interval is defined as:
+[[Image:Sp17 20.109 M2D9 CI equation.png |thumb|center|600px|]]
-Student's t-test...p-value
+In your data, you should use the CI to generate error bars due to the low ''n''.  Be sure to report which confidence level was used to calculate the intervals reported.  So, what does this all mean in regard to the data you will report?  As an example, if the calculated &chi;<sub>&iota;</sub> of a data set equals 80 au there is a 95% chance the &mu; is between 50 au and 110 au, where au = arbitrary units.  And how does this relate to ''s''?  If you know the &mu;, the &sigma; represents a 68% confidence interval.
+Lastly, you will use Student's t test to report if your data are statistically different between treatments.
+*Student's ''t'' test is defined as:
+[[Image:Sp17 20.109 M2D9 tcalc equation.png|thumb|center|550px|]]
+The value you calculate with the Student's ''t'' test equation is referred to as ''t''<sub>calculated</sub>.  This ''t''<sub>calculated</sub> value is compared to the ''t''<sub>tabulated</sub> value in the ''t'' table, according to the appropriate ''n'' - 1 using the p-value for the two-tailed distribution (which assumes that you do not know how the data will shift).  If the ''t''<sub>calculated</sub> value is greater than the ''t''<sub>tabulated</sub>, then the data sets are significantly different at the specific p-value.  So, what does this all mean in regard to the data you will report?  As an example, if the ''t''<sub>calculated</sub> for a data set with ''n'' - 1 = 10 is 3 (given that the ''t''<sub>tabulated</sub> is 2.228), then the data sets are different with a p-value &le; 0.05.  Which means that there is less than a 5% chance that the data sets are the same.
 ==Protocols==
 ===Part 1: Practice statistical analysis===
-Review these [[Media: CometAssay_M1D6stats_F14.xlsx | data]] from an experiment where cells were exposed to increasing amounts of radiation. Your goal is to determine if a statistically significant amount of DNA damage was induced. For the purpose of this exercise, the values in the spreadsheet are in arbitrary units of 'DNA damage', where the higher numbers indicate more damage.
+Review [[Media: CometAssay_M1D6stats_F14.xlsx | '''these data''']] from an experiment where cells were exposed to increasing amounts of radiation. Your goal is to determine if a statistically significant amount of DNA damage was induced. For the purpose of this exercise, the values in the spreadsheet are in arbitrary units of 'DNA damage', where the higher numbers indicate more damage.
-When interpreting the statistics, consider how you may use the information to convince someone that the DNA damage was significant? You may find this [[Media: S09_20109_M2D5-Stats-4.xls‎ | spreadsheet]], originally created by Prof. Bevin Engelward and modified by the 20.109 staff, helpful for this exercise. At a minimum, you should post a bar plot of the data with 95% confidence intervals and indicate if there is a statistically significant difference (''i.e.'' provide a ''p''-value) between conditions in your Benchling notebook.
+When interpreting the statistics, consider how you may use the information to convince someone that the DNA damage was significant. You may find this [[Media: S09_20109_M2D5-Stats-4.xls‎ | spreadsheet]], originally created by Prof. Bevin Engelward and modified by the 20.109 staff, helpful for this exercise. At a minimum, you should post a bar plot of the data with 95% confidence intervals and indicate if there is a statistically significant difference (''i.e.'' provide a ''p''-value) between conditions in your Benchling notebook.
 ===Part 2:  Analyze quantitative PCR data===
-As discussed in prelab, the qPCR experiment with your samples was not successful.  This is likely due to degraded RNA, which is a very common issue given the prevalence, abundance, and stability of RNase.  You will therefore use the data collected by the teaching faculty for this exercise and in your M2 Research Article.
+As discussed in prelab, the qPCR experiment with your samples was not successful.  This is likely due to degraded RNA, which is a very common issue given the prevalence, abundance, and stability of RNases.  You will therefore use the data collected by the teaching faculty for this exercise and in your M2 Research Article.
-#Review the data in [[Media:20.109 Spring 2017, M2 qPCR results.xlsx|this document]].
+Before you can apply the statistical tools from Part 1 to your data, you must first normalize the p21 expression levels.  To account for any unintended biases in RNA purification and / or cDNA preparation, it is important to normalize the expression of the transcript of interest to expression of a housekeeping or constitutive gene.  Ideally, the gene to which the data of interest are normalized is not responsive to the treatment tested.  In our experiment, we used GAPDH because it is not expected to be responsive to etoposide treatment.  How might you confirm this assumption?
+#Review the data in [[Media:20.109 Spring 2017, M2 qPCR results v2.xlsx|this spreadsheet]].
 #*The DLD-1 and BRCA2- RNA isolated from untreated and etoposide treated cells was probed using both the p21 and GAPDH primers.
 #*Each reaction was completed in triplicate.  '''Note:''' these are technical replicates.
-#
+#*The data are represented as the 'threshold cycle' C<sub>T</sub> or amplification cycle at which SYBR Green fluorescent signal was detected (review [[20.109(S17):Complete cell survival assay and examine transcript levels in response to DNA damage (Day 4)| M2D4 Introduction]]).
+#Normalize p21 expression to GAPDH expression (ΔC<sub>T</sub>).
+#*Subtract the GAPDH C<sub>T</sub> value from the p21 C<sub>T</sub> value using the appropriate treatment conditions, according to the screenshot below. [[Image:Sp17 20.109 M2D9 qPCR normallization.png|thumb|center|400px|]]
+#Exponentially transform each normalized value to the ΔC<sub>T</sub> expression.
+#*ΔC<sub>T</sub> expression = 2<sup>-ΔC<sub>T</sub></sup>.
+#Average the replicates for each treatment, then calculate the 95% CI and t-test p-value.
+#*With this information, graph your data with error bars and include information concerning any statistical significance.
+#Are these results consistent with those from the RNA-seq data?
+#*Load the data:
+#**<code>load("~/Desktop/RNA-seq data analysis/preprocessed_data.RData")
+#**library("DESeq2")</code>
+#*Plot the reads for p21 (also called CDKN1A):
+#**<code>plotCounts(dds,"CDKN1A", intgroup="group")</code>
 ===Part 3:  Analyze cell viability assay data===
+Review your cell viability results from M2D4 (posted to the [[Talk:20.109(S17):Module 2| Discussion tab of the M2 main page]]).  Use the statistical tools you learned in the above exercises to analyze the pooled class data for your M2 Research Article.
 ==Navigation links==
 Next day: [[20.109(S17):Growth of phage materials (Day1) | First day of M3]]
 Previous day: [[20.109(S17):Journal Club II (Day8)| Journal club II]]
