Difference between revisions of "Assignment 2 Part 1: Noise in images"

Revision as of 17:25, 17 September 2017

20.309: Biological Instrumentation and Measurement

Overview

The process of recording a digital image is essentially an exercise in measuring the intensity of light at numerous points on a grid. Like most physical measurements, imaging is subject to measurement error. The formula below is a mathematical model of the imaging process, including noise and the camera's gain constant:

$ P_{x,y} =G \left( I_{x,y} + \epsilon_{x,y} \right) $,

where $ x $ and $ y $ are spatial coordinates, $ P_{x,y} $ is the matrix of pixel values returned by the camera , $ G $ is the gain of the camera, $ I_{x,y} $ is the actual light intensity in the plane of the detector, and $ \epsilon_{x,y} $ is a matrix of error terms that represent noise introduced in the imaging process.

Count von Count counting electrons.

The amount of information in any signal, such as an image or a neural recording or a stock price, depends on the Signal to Noise Ratio (SNR). In this part of the assignment, you will develop a software model for imaging noise sources and you will use your model to explore how different noise sources affect image SNR.

A (very) simplified model of digital image acquisition is shown in the image on the right. In the diagram, a luminous source stochastically emits $ \bar{N} $ photons per second. A fraction of the emitted photons lands on a semiconductor detector. Incident photons cause little balls (electrons) to fall out of the detector. The balls fall into a red bucket. At regular intervals, the bucket gets dumped out on to a table where the friendly muppet vampire Count von Count counts them. The process is repeated for each point on a grid. Not shown: the count is multiplied by a gain factor $ G $ before the array of measured values is returned to a recording device (such as a computer).

Before we dive into the computer model, it may be helpful to indulge in with a short review of (or introduction to) probability concepts.

Probability review

Random variable distributions

Plots of PDF $ f_x(x) $ vs. $ x $ and CDF $ F_x(x) $ vs $ x $ for the uniform distribution on the interval (0,1) and the normal distribution with $ \mu=0 $ and $ \sigma=1 $. MATLAB functions rand and randn return pseudorandom values that follow these distributions.

The question of what exactly a random variable is can get a bit philosophical. For present purposes, let's just say that a random variable represents the outcome of some stochastic process, and that you have no way to predict what the value of variable will be.

Even though the value of a random variable is not predictable, it is usually the case that some outcomes are more likely than others. The specification of the relative likelihood of each outcome is called a Probability Density Function (PDF), usually written as $ f_x(x) $. Two commonly-used PDFs are shown in the plots on the right. Larger values of the PDF indicate more likely outcomes.

Perhaps counterintuitively, the PDF doesn't really tell you how likely a certain outcome is. The probability that a continuous random variable will take on a particular value such as 0.5 is zero. This is because there are an infinite number of possible outcomes. The chance of getting 0.5 when there are an infinite set of possible outcomes is equal to one divided by infinity. In other words, there is no chance at all of getting exactly 0.5. This is kind of baffling to think about and certainly annoying to work with. It's usually cleaner to think about a random variable falling within a certain interval. The chance that a random variable will fall between two numbers $ a $ and $ b $ can be found by integrating the PDF from $ a $ to $ b $:

$ Pr(a \leq x \leq b)=\int_a^b f(x) $

Since probability calculations so frequently use the integral of the PDF, whoever decides these things defined a function called the Cumulative Distribution Function $ F_X(x) $ that is equal to the integral of the PDF from $ -\infty $ to $ x $:

$ F_x(x)=\int_{-\infty}^x f_x(\tau) d\tau $

$ F_x(x) $ is the probability that $ x $ takes on a value less than $ x $. The CDF makes it easy to calculate probability that $ x $ falls within the interval $ [a,b] $: $ :Pr(a \leq x \leq b)=F_X(b)-F_x(a) $.

Measures of central tendency and dispersion

If you know the PDF, you can calculate the expected (or mean) value $ \mu $ and variance $ \sigma^2 $ of a random variable. These are given by:

$ \mu=E(X)=\int_{−\infty}^{\infty}{x P(x) dx} $

$ \sigma^2=\int_{−\infty}^{\infty}{(x-\mu)^2 P(x)dx} $

Mean is a way to measure of the central tendency of a distribution and standard deviation is a measure of its dispersion or spread. If you evaluated the random variable a large number of times and averaged all the outcomes, the average would tend to approach $ \mu $. Standard deviation (the square root of variance) gives a sense of how far an outcome is likely to lie from the mean.

Calculations with random variables

If you add two random variables, their means add. Their variances also add. Standard deviations add under a big 'ol square root.

$ \mu_{x+y}=\mu_x+\mu_y $

$ \sigma_{x+y}^2=\sigma_x^2+\sigma_y^2 $

$ \sigma_{x+y}=\sqrt{\sigma_x^2+\sigma_y^2} $

Normal distribution

Plot of a normal distribution. Each band has a width of 1 standard deviation

The normal distribution (also frequently called a Gaussian distribution or a bell curve is specified by two parameters, an average value $ \mu $, and a standard deviation $ \sigma $:

$ f(x \; | \; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2} } \; e^{ -\frac{(x-\mu)^2}{2\sigma^2} } $

In the normal distribution, about 68% of outcomes fall within the interval $ \mu \pm \sigma $, about 95% in the interval $ \mu \pm 2 \sigma $, and about 99.7% in the interval $ \mu \pm 3 \sigma $

Central Limit Theorem

If you add together a bunch of independent random variables, their sum tends to look like a normal distribution

Modeling photon emission

Let's start by modeling a single pixel. The first thing we need is a source of random numbers, so fire up MATLAB or any other computing you like on your computer (and stop whining about it). MATLAB includes several functions for generating random numbers. We will make our first model using the function rand(), which returns a random value that follows a uniform distribution in the interval (0, 1). Go ahead and type rand() at the command line. Matlab will return a number between 0 and 1. Do it a few times — it's kind of fun. You get a different number every single time. Can you guess what number will come next? If you can, please come see me.

If you want a bunch of random numbers, rand can help you out with that that, too. The function can take two arguments that specific the size of a matrix of random numbers it will return. rand( <M>, <N> ) generates an M row by N column result. The code snippet below to generate and display (as an image) a matrix with 492 rows x 656 columns of numbers between 0 and 1:

noiseImage = rand( 492, 656 );
figure
imshow( noiseImage );

That's what pure, uniformly-distributed noise looks like! In the Olden Days, when people used to watch things on analog television sets, it happened pretty frequently that the set would be tuned to a station where there was no signal being broadcast. In a desperate attempt to find a signal, old-school television sets would frantically turn up the gain on their internal amplifiers to the maximum. When this happened, tiny, electronic noise signals in the amplifier were amplified and displayed on the screen. You can simulate this pretty well in MATLAB:

figure
for ii = 1:100
    noiseImage = rand( 492, 656 );
    imshow( noiseImage )
    drawnow
end

This code snippet illustrates a couple of concepts you will find useful : how to code a for loop, and how to force a figure to update. The mysterious drawn command causes MATLAB to drop everything and finish all of its deferred tasks such as drawing on the screen. Tye leaving the drawnow out and see what happens. MATLAB decides that it has too much work to do to update the screen. If you want to make the simulation even better, add the command soundsc( randn( 1, 100000 ) ) to the beginning of your script.

Okay, enough goofing around with the random number generator. Assume a single molecule has a 0.01% chance of emitting a photon per microsecond. To model photon emission, you can use the rand function like this: rand() < 0.0001. You are trying to measure the light coming from the molecule, so you decide to count how many photons are emitted over the course of a second. On average, you would expect to measure about 100 photons per second. Just due to chance, the number of photons emitted may be larger or smaller than 100. Let's run a simulation:

close all
numberOfPhotonsEmitted = zeros( 1, 100 );
for ii = 1:100
    numberOfPhotonsEmitted(ii) = sum( ( rand( 1, 1E6 ) < 0.0001 ) );  % number of photons emitted
end

figure
plot( numberOfPhotonsEmitted, 'x' )
axis( [ 1 100 0 130 ] )
xlabel( 'n' )
ylabel( 'Number of Photons' )
title( 'Number of Photons Emitted (N=100 siumulations)' )

Do you get how the simulation works? For each iteration, MATLAB generates a million uniformly distributed random numbers on the interval [0,1]. The less than operator returns a 1 for numbers smaller than 0.0001 and 0 for all the rest, so the probability of getting a 1 is 0.01%. The sum command adds up all the ones to determine the number of photons emitted.

Overview
Part 1: Noise in images
Part 2: Fluorescence microscopy
Part 3: Build an epi-illuminator for your microscope

Back to 20.309 Main Page

Difference between revisions of "Assignment 2 Part 1: Noise in images"

Revision as of 17:25, 17 September 2017

Overview

Probability review

Random variable distributions

Measures of central tendency and dispersion

Calculations with random variables

Normal distribution

Central Limit Theorem

Modeling photon emission

Navigation menu

Personal tools

Namespaces

Variants

Views

Actions

Search

Navigation

Tools