Assignment 2 Part 1: Noise in images

From Course Wiki

Revision as of 23:24, 16 September 2017

20.309: Biological Instrumentation and Measurement



Overview

[Figure: Simple model of digital image acquisition]

As you saw in Assignment 1, recording a digital image is essentially an exercise in measuring the intensity of light at numerous points on a grid. Stated mathematically, the task is to measure the spatially-varying magnitude of light intensity on a grid of points in a particular plane. The imaging process is subject to various noise sources. In this part of the lab, you will develop a software model for imaging noise sources and you will use it to explore how noise affects images. In particular, you will look into what factors determine the Signal to Noise Ratio (SNR) of an image. The amount of information in a signal (such as an image) depends on the SNR.

The figure on the right depicts a (very) simplified model of digital image acquisition. In the diagram, a luminous source stochastically emits $ \bar{N} $ photons per second. A fraction $ F_O $ of the emitted photons lands on a semiconductor detector. Incident photons cause little balls (electrons) to fall out of the detector. The balls fall into a red bucket. At regular intervals, the bucket gets dumped out on to a table where the friendly muppet vampire Count von Count counts them. The process is repeated for each point on a grid. Not shown: the count is multiplied by a gain factor $ G $ before the array of measured values is returned to a recording device (such as a computer).

The following formula represents the imaging process, including gain and noise:

$ P_{x,y} = G \left( I_{x,y} + \epsilon_{x,y} \right) $,

where $ x $ and $ y $ are spatial coordinates, $ P_{x,y} $ is the matrix of pixel values returned by the camera, $ G $ is the gain of the camera, $ I_{x,y} $ is the actual light intensity in the plane of the detector, and $ \epsilon_{x,y} $ is a matrix of error terms that represent noise introduced in the imaging process.
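
As a rough illustration of the formula above, the sketch below applies it to a synthetic image. The intensity pattern, the gain value, the noise level, and the choice of zero-mean normally distributed noise are all assumptions made up for this example; they are not the noise model developed in the assignment.

```matlab
% Minimal sketch of P = G * (I + epsilon) using made-up values.
numRows = 492;
numCols = 656;
gain = 2;          % hypothetical camera gain G
noiseStd = 5;      % hypothetical standard deviation of the error term

% Hypothetical ideal intensity I: a horizontal ramp from 0 to 100
idealIntensity = repmat( linspace( 0, 100, numCols ), numRows, 1 );

% Error term epsilon: zero-mean normal noise (an assumption for this sketch)
noiseTerm = noiseStd * randn( numRows, numCols );

% Pixel values P returned by the modeled camera
pixelValues = gain * ( idealIntensity + noiseTerm );
```

To look at the result, something like imshow( pixelValues, [] ) should work; the empty brackets ask imshow to scale the display range to the data.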

Before we go on to the computer model, it will be helpful to indulge in a short review of (or introduction to) probability concepts.

Probability review

The question of what exactly a random variable is can get a bit philosophical. For present purposes, let's just say that a random variable represents the outcome of some stochastic process, and that you have no way to predict what the value of the variable will be.

Even though the value of a random variable is not predictable, it is usually the case that some outcomes are more likely than others. The specification of the relative likelihood of each outcome is called a Probability Density Function (PDF), usually written as $ f_x(x) $. Two commonly-used PDFs are shown in the plots on the right. Higher values indicate more likely outcomes.

Perhaps counterintuitively, the PDF doesn't actually tell you in an absolute sense how likely a certain outcome is. The probability that a continuous random variable will take on a particular value such as 0.5 is zero. This is because there are an infinite number of possible outcomes. The chance of getting 0.5 when there is an infinite set of possible outcomes is equal to one divided by infinity. In other words, there is no chance at all of getting exactly 0.5. This is kind of baffling to think about and certainly annoying to work with. It's usually cleaner to think about a random variable falling within a certain interval. The chance that a random variable will fall between two numbers $ a $ and $ b $ can be found by integrating the PDF from $ a $ to $ b $:

$ Pr(a \leq x \leq b)=\int_a^b f_x(x) dx $
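
The integral above can be checked numerically. The sketch below is one possible way to do it, assuming the standard normal distribution ($ \mu=0 $, $ \sigma=1 $) and the interval from -1 to 1, both picked only for illustration. It integrates the PDF with the trapezoid rule and compares the result with the fraction of randn samples that land in the interval.

```matlab
% PDF of the standard normal distribution (mu = 0, sigma = 1)
normalPdf = @( x ) exp( -x.^2 / 2 ) / sqrt( 2 * pi );

% Interval endpoints, chosen arbitrarily for this example
a = -1;
b = 1;

% Numerically integrate the PDF from a to b with the trapezoid rule
xValues = linspace( a, b, 10001 );
probabilityFromPdf = trapz( xValues, normalPdf( xValues ) );

% Fraction of a large number of randn samples that fall in [a, b]
samples = randn( 1e6, 1 );
fractionInInterval = mean( samples >= a & samples <= b );

fprintf( 'Integral of PDF:     %.4f\n', probabilityFromPdf );
fprintf( 'Fraction of samples: %.4f\n', fractionInInterval );
```

Both numbers should come out near 0.683, the familiar probability of landing within one standard deviation of the mean. The sample fraction will wobble a little from run to run.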

Since probability calculations so frequently use the integral of the PDF, whoever decides these things defined a function called the Cumulative Distribution Function (CDF) $ F_x(x) $ that is equal to the integral of the PDF from $ -\infty $ to $ x $:

$ F_x(x)=\int_{-\infty}^x f_x(t) dt $

$ F_x(x) $ is the probability that the random variable takes on a value less than or equal to $ x $. The probability that the random variable falls within the interval $ [a,b] $ is $ Pr(a \leq x \leq b)=F_x(b)-F_x(a) $.
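
Here is a small sketch of the CDF-difference calculation, assuming the uniform distribution on (0, 1), whose CDF is simply $ F_x(x)=x $ inside the interval. The endpoints 0.2 and 0.5 and the variable names are arbitrary choices for this example.

```matlab
% CDF of the uniform distribution on (0, 1):
% 0 below the interval, x inside it, and 1 above it
uniformCdf = @( x ) min( max( x, 0 ), 1 );

% Interval endpoints, chosen arbitrarily for this example
a = 0.2;
b = 0.5;

% Probability of landing in [a, b] from the difference of CDF values
probabilityFromCdf = uniformCdf( b ) - uniformCdf( a );

% Fraction of a large number of rand samples that fall in [a, b]
samples = rand( 1e6, 1 );
fractionInInterval = mean( samples >= a & samples <= b );

fprintf( 'F(b) - F(a):         %.4f\n', probabilityFromCdf );
fprintf( 'Fraction of samples: %.4f\n', fractionInInterval );
```

The CDF difference is 0.3, and the sample fraction should land close to that value.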

If you know the PDF, you can calculate the expected (or mean) value $ \mu $ and variance $ \sigma^2 $ of a random variable. These are given by:

$ \mu=E(x)=\int_{-\infty}^{\infty}{x f_x(x) dx} $
$ \sigma^2=\int_{-\infty}^{\infty}{(x-\mu)^2 f_x(x) dx} $

The mean is a measure of the central tendency of a distribution and the standard deviation is a measure of its spread. If you evaluated the random variable a large number of times and averaged all the outcomes, the average would tend to approach $ \mu $. Standard deviation (the square root of variance) gives a sense of how far an outcome is likely to lie from the mean.
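
To make the two integrals concrete, the sketch below evaluates them numerically for the uniform distribution on (0, 1), where the PDF equals 1 inside the interval and 0 elsewhere, and compares the results with the mean and standard deviation of a large batch of rand samples. The choice of distribution, the grid size, and the variable names are assumptions for this example.

```matlab
% PDF of the uniform distribution on (0, 1); only evaluated inside the interval,
% so the integration limits can shrink from (-inf, inf) to (0, 1)
uniformPdf = @( x ) ones( size( x ) );
xValues = linspace( 0, 1, 10001 );

% Mean: integral of x * f(x)
meanFromPdf = trapz( xValues, xValues .* uniformPdf( xValues ) );

% Variance: integral of (x - mu)^2 * f(x)
varianceFromPdf = trapz( xValues, ( xValues - meanFromPdf ).^2 .* uniformPdf( xValues ) );

% Estimates from a large number of samples
samples = rand( 1e6, 1 );
sampleMean = mean( samples );
sampleStd = std( samples );

fprintf( 'Mean:               %.4f (integral)  %.4f (samples)\n', meanFromPdf, sampleMean );
fprintf( 'Standard deviation: %.4f (integral)  %.4f (samples)\n', sqrt( varianceFromPdf ), sampleStd );
```

For this distribution the exact answers are $ \mu=1/2 $ and $ \sigma^2=1/12 $, so the standard deviation is about 0.289. The sample estimates should approach these values as the number of samples grows.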


Modeling photon emission

Plots of PDF $ f_x(x) $ vs. $ x $ and CDF $ F_x(x) $ vs $ x $ for the uniform distribution on the interval (0,1) and the normal distribution with $ \mu=0 $ and $ \sigma=1 $. MATLAB functions rand and randn return pseudorandom values that follow these distributions.

In this part of the assignment, you will make a computer model of the process of measuring light in order to examine how noise sources affect images. Photon emission is a stochastic process, so we will start with a short review of (or introduction to) some probability concepts. It's time to fire up MATLAB. Feel free to use another language, if you like. The commands are very similar in most languages.

We are going to model a stochastic process, so the first thing we need is a source of random numbers. MATLAB includes several functions for generating random numbers. We will make our first model using the function rand(), which returns a random value that follows a uniform distribution in the interval (0, 1). Go ahead and type rand() at the command line. MATLAB will return a number between 0 and 1. Do it a few times; it's kind of fun. Can you guess what number will come next? If you can, please come see me. If you want more than one random number, you can pass that information to rand with arguments. For example, try the code snippet below to generate and display (as an image) a matrix with 492 rows x 656 columns of numbers between 0 and 1:

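% Generate a 492 x 656 matrix of uniformly distributed values and show it as an image
totallyNoisyImage = rand( 492, 656 );
figure
imshow( totallyNoisyImage );
% Plot a histogram of all the pixel values
figure
hist( totallyNoisyImage(:) );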

