# Assignment 4 part 2: Measure resolution

20.309: Biological Instrumentation and Measurement

This is part 2 of Assignment 4.

## Measuring resolution

### Optical resolution overview

Synthetic image of tiny microspheres used for measuring resolution.

One of the most commonly used definitions of the resolution limit $R$ of an optical system is the distance between two point sources in the sample plane such that the peak of one source’s image falls on the first zero of the other source’s image. This particular definition is called the Rayleigh resolution.

The theoretical value of $R$ is given by the formula

$R=\frac{0.61 \lambda}{ \text{NA}}$,

where $\lambda$ is the wavelength of light that forms the image, and NA is the numerical aperture of the optical system. The definition suggests a procedure for measuring resolution: make an image of a point source, measure the peak-to-trough distance in the image plane, and divide by the magnification. In this part of the lab, you will use a procedure inspired by this simple idea to estimate the resolution of your microscope. Instead of measuring the spot sizes with a ruler, you will use nonlinear regression to find the parameters of a two-dimensional Gaussian function that best approximates the digital images of (near) point sources that you will make. You will then use these best-fit parameters to compute the resolution.
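
As a worked example with illustrative numbers (assumptions chosen for the arithmetic, not values specified for this lab), take $\lambda = 600$ nm and NA $= 0.65$:

$R=\frac{0.61 \times 600\ \text{nm}}{0.65} \approx 563\ \text{nm}$.

With an assumed magnification of 40, two points separated by $R$ in the sample plane land about $40 \times 563\ \text{nm} \approx 22.5\ \mu\text{m}$ apart in the image plane, which is why a distance measured in the image plane must be divided by the magnification.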

One practical problem with this method is that true point sources are difficult to come by. An astronomer testing a telescope has stars readily available in the night sky, and they are very good approximations of point sources. Since there is no natural microscopic sample that is equivalent to the night sky, microscopists have to prepare a synthetic sample suitable for measuring resolution. Perhaps the most common method is to use a microscope slide sprinkled with tiny fluorescent beads that have diameters in the range of 100-190 nm. These beads are small enough to be considered point sources. Unfortunately, beads small enough for this purpose are not very bright, so imaging them can be challenging. Your microscope must be very well aligned to get good results. (The images you make for this part of the lab will probably remind you of telescope images. If they don't, have an instructor take a look at your setup.)

Why fit a Gaussian instead of the Bessel-function profile of an Airy disk? Gaussians are more amenable to nonlinear regression because they are smoother and faster to evaluate than Bessel functions. In addition, the Gaussian is a very good approximation to the central lobe of the Airy pattern. It is straightforward to convert the Gaussian parameters to Rayleigh resolution. See Converting Gaussian fit to Rayleigh resolution for a discussion of the conversion.

In outline, the sequence of steps for the resolution measurement is:

1. get a (real or synthetic) image of point sources
2. find the pixels that correspond to each microsphere and associate them into connected regions
3. compute useful properties of the connected regions
4. eliminate regions that are likely not images of a single microsphere
5. use nonlinear regression to fit a Gaussian model function to the image of each microsphere
6. compute summary statistics
7. convert the Gaussian parameters to Rayleigh resolution.

### Identifying bright regions in an image and computing their properties

The first thing we want to do is get a (real or synthetic) image of point sources. Luckily, we've already generated a synthetic image of 90 nm radius spheres in part 1 of this assignment.

The next step is to identify pixels of interest. Fortunately, this is a very simple matter because we went to great lengths to make an image that has high contrast, little background, and high SNR. The interesting pixels are the bright ones. A simple, global threshold works well — all the pixels brighter than a certain threshold are (probably) interesting. You might want to think a bit about how to choose the best threshold value. The MATLAB Image Processing Toolbox includes a global threshold function called im2bw( I, level ) which applies a global threshold to image I and returns a binary image (also called a bilevel image or a mask). The binary image is a matrix the same size as I that contains only ones and zeros. You guessed it — there are ones in locations where the pixel value was greater than level and zeroes everywhere else.
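
As a small illustration of what a global threshold does, the sketch below builds a toy 5 x 5 image (made-up values, not real bead data) and thresholds it at an arbitrarily chosen level of 0.5. For an image of class double, im2bw marks the pixels whose value exceeds level.

```matlab
% Toy image: dim background (0.1) with a bright 2 x 2 spot (0.9).
% The pixel values and the 0.5 level are arbitrary choices for illustration.
I = 0.1 * ones( 5, 5 );
I(2:3, 3:4) = 0.9;

level = 0.5;
mask = im2bw( I, level );   % logical matrix, same size as I

disp( mask )
disp( nnz( mask ) )         % number of pixels above the threshold
```

Displaying the mask next to the original image with imshow is a quick way to judge whether a candidate threshold keeps the beads and rejects the background.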

The function regionprops operates on binary images. It can identify connected regions and compute properties of those regions. The function FindBrightObjectsInImage below uses im2bw and regionprops to segment an image and compute properties of each connected region of pixels. Region properties to compute (e.g. area, eccentricity) are specified in a cell array of strings. Here is a complete list of the properties regionprops can compute.

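function RegionProperties = FindBrightObjectsInImage( ...
        InputImage, GlobalThreshold, DilationRadius, PropertyList )
    % Argument order follows the calls later on this page;
    % GlobalThreshold is an assumed name for the im2bw level.
    mask = im2bw( InputImage, GlobalThreshold );               % apply global threshold
    mask = imclearborder( mask );                              % drop regions touching the edge
    mask = imdilate( mask, strel( 'disk', DilationRadius ) );  % grow each region slightly

    RegionProperties = regionprops( mask, InputImage, PropertyList );
end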


FindBrightObjectsInImage returns a struct array with all of the computed properties, that is, an array where each element is a structure. The structure will contain one field for each of the properties in the PropertyList argument. The fields have the same name as the property. For example, if you included 'Centroid' in the property list and there were ten objects in the image, RegionProperties(3).Centroid would return a 1 x 2 matrix with the x and y coordinates of the third region's centroid (regionprops lists the horizontal coordinate first). If you want to create an N x 2 matrix of all of the centroids, you can use MATLAB's : indexing syntax and a concatenation function: AllCentroids = vertcat( RegionProperties(:).Centroid );.
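
The struct-array indexing described above can be tried on a toy mask. The sketch below calls regionprops directly on a hand-made binary image with two rectangles (positions chosen arbitrarily) rather than going through FindBrightObjectsInImage. Note that each centroid comes back with the horizontal (column) coordinate first.

```matlab
% Toy mask with two rectangular regions (positions chosen arbitrarily).
mask = false( 10, 10 );
mask(2:3, 1:2) = true;    % rows 2-3, columns 1-2
mask(8:9, 4:7) = true;    % rows 8-9, columns 4-7

RegionProperties = regionprops( mask, { 'Centroid', 'Area' } );

% Stack the 1 x 2 centroids into an N x 2 matrix, one row per region.
AllCentroids = vertcat( RegionProperties(:).Centroid );

% Scalar properties can be gathered with square brackets instead.
AllAreas = [ RegionProperties(:).Area ];

disp( AllCentroids )
disp( AllAreas )
```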

Try running FindBrightObjectsInImage on your synthetic image and examine the results. Put a breakpoint on the first line of FindBrightObjectsInImage. Use imshow to see how the mask is affected by imclearborder and imdilate. What is a good value for DilationRadius?

## Using nonlinear regression to measure resolution

The task of measuring resolution would be super-simple if regionprops had a built-in property called RayleighResolution. Unfortunately, it doesn't, so we will have to write our own function. Nonlinear regression is a good tool for this sort of thing. As long as we have a mathematical model of what the images of beads look like, we can use nonlinear regression to find best-fit parameters for each of the PSF beads in an image.

### Review

Regression is a method for finding a relationship between a dependent quantity $O_n$ and one or more independent variables, in this case the spatial coordinates of the image $x_n$ and $y_n$. The relationship is described by a model function $f(\beta, x_n, y_n)$, where $\beta$ is a vector of model parameters. The dependent variable $O_n$ is measured in the presence of random noise, which is represented mathematically by a random variable $\epsilon_n$. In equation form:

$O_n=f(\beta, x_n, y_n)+\epsilon_n$.

The goal of regression is to determine a set of best-fit model parameters $\hat{\beta}$ so that $f(\hat{\beta}, x_n, y_n)$ is as close as possible to your data $O_n$. Because the dependent variable includes noise, $\hat{\beta}$ cannot be determined exactly from the data. Increasing the number of observations or decreasing the magnitude of the noise tends to produce a more reliable estimate of $\hat{\beta}$.
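
In ordinary least squares, "as close as possible" is conventionally taken to mean minimizing the sum of squared residuals. Written in the notation above (with $N$, the number of observations, introduced here for the sum):

$\hat{\beta}=\underset{\beta}{\arg\min} \sum_{n=1}^{N} \left[ O_n - f(\beta, x_n, y_n) \right]^2$.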

Linear and nonlinear regression are similar in some aspects, but the two techniques have a fundamental difference. Linear regression applies to model functions that are linear in the parameters $\beta$ (polynomials are a common example); nonlinear regression applies to essentially everything else. Nonlinear regression is obviously more flexible (since you can fit functions like sines, cosines, exponentials, and Gaussians), but it unfortunately cannot be reduced to a single closed-form formula like linear regression can. Finding the optimal solution to a nonlinear regression is an iterative process. Starting with an initial guess, each iteration produces a more refined estimate of $\beta$. The process stops when no better estimate can be found (or when something bad happens, such as the solution not converging).

Ordinary nonlinear least squares regression assumes that:

• the independent variables are known exactly, with zero noise,
• the error values are independent and identically distributed,
• the distribution of the error terms has a mean value of zero,
• the independent variable covers a range adequate to define all the model parameters, and
• the model function exactly relates $O$ to $x$ and $y$.

These assumptions are almost never perfectly met in practice. It is important to consider how badly the regression assumptions have been violated when assessing the results of a regression.

### The four things you need for nonlinear regression

As shown in the diagram below, you need four things to run a regression:

1. a matrix containing the values of the independent variable(s);
2. a vector containing the corresponding observed values of the dependent variable;
3. a model function; and
4. a vector of initial guesses for the model parameters.

Block diagram of nonlinear regression.

You must provide these four inputs when running a nonlinear fit. The MATLAB function nlinfit will then do the rest of the work for you. It takes your four inputs, calculates the sum of squared residuals, and iteratively adjusts the $\beta$ parameters until the sum of squared residuals reaches a minimum. The output of nlinfit is the set of best-fit parameters $\hat{\beta}$ that, when passed to your model function, most closely match your measured observations.
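
Before moving to the two-dimensional case, it may help to see the four inputs in a one-dimensional toy problem. The sketch below is an illustration only: the exponential model, the "true" parameters, and the noise level are all made up, and decayModel is a hypothetical name.

```matlab
rng( 0 );                                   % make the noise repeatable

% 1. independent variable
t = ( 0:0.25:10 )';

% 3. model function (takes parameters first, independent variable second)
decayModel = @( beta, t ) beta(1) * exp( -beta(2) * t );

% 2. observations: made-up true parameters plus a little Gaussian noise
trueBeta = [ 3, 0.7 ];
observations = decayModel( trueBeta, t ) + 0.01 * randn( size( t ) );

% 4. initial guesses
initialGuesses = [ 1, 1 ];

betaHat = nlinfit( t, observations, decayModel, initialGuesses );
disp( betaHat )     % should land close to trueBeta
```

Because the noise is small and the initial guesses are in the right neighborhood, the fitted parameters should land close to the values used to generate the data.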

Here is a MATLAB function that you may use to fit your resolution data:

function BestFitParameters = Fit2dGaussian( Values, Coordinates, PlotEnable )
    if( nargin < 3 )
        PlotEnable = false;
    end

    % estimate sigma from the number of pixels above half height
    pixelCountAboveHalf = sum( Values > ( ( min( Values ) + max( Values ) ) / 2 ) );
    sigmaInitialGuess = 0.8 * sqrt( pixelCountAboveHalf / 2 / pi / log(2) );

    initialGuesses = [ ...
        mean( Coordinates(:, 1) ), ... % yCenter
        mean( Coordinates(:, 2) ), ... % xCenter
        range( Values ), ... % amplitude
        sigmaInitialGuess, ... % sigma
        min( Values ) ]; % offset

    % optionally plot the model at the initial guesses together with the data
    if( PlotEnable )
        plot3( Coordinates(:,1), Coordinates(:,2), Gaussian2DFitFunction( initialGuesses, Coordinates ), 'x' )
        hold on
        plot3( Coordinates(:,1), Coordinates(:,2), Values, 'x' )
        drawnow;
    end

    BestFitParameters = nlinfit( Coordinates, Values, @Gaussian2DFitFunction, initialGuesses );
end



This code relies on the following model function:

function out = Gaussian2DFitFunction( Parameters, Coordinates )
    yCenter = Parameters(1);
    xCenter = Parameters(2);
    amplitude = Parameters(3);
    sigma = Parameters(4);
    offset = Parameters(5);

    out = amplitude * ...
        exp( -(( Coordinates(:, 1) - yCenter ).^2 + ( Coordinates(:, 2) - xCenter ).^2 ) ...
        ./ (2 * sigma .^ 2 )) + offset;

end


Whenever you run a nonlinear regression, it is critical that you provide it with appropriate inputs. Giving nlinfit bad inputs is sure to lead to obscure and confusing errors. So let's spend some time understanding what the function Fit2dGaussian does and how it handles each of the four required inputs.

#### 3. The model function

nlinfit requires that the regression model be expressed as a function that takes two arguments and returns a single vector of predicted values. The model function must have the form:

[ PredictedValues ] = ModelFunction( Beta, X )


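The first argument, Beta, is a vector of model parameters. The second argument, X, holds the independent variable values, one row per observation (here, an N x 2 matrix of pixel coordinates). The return value, PredictedValues, must be a vector with one element per row of X, the same size as the vector of observations.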

The MATLAB function Gaussian2DFitFunction defined above computes the two-dimensional Gaussian function that we will use to model the image of a PSF bead. Parameters is a 1x5 vector that contains the model parameters in this order: Y center, X center, amplitude, sigma, and offset.

It's a good idea to test the model function out before you use it. The plot below shows four sets of curves generated by Gaussian2DFitFunction with different parameters. It's comforting to see that the curves have the expected shape.

Model function evaluated with different parameters.

As an example of how to use the model function, let's make a grid from -10 to 10 in x and y, and pick some arbitrary parameters to plot:

figure;
x = -10:10;
y = -10:10;
[X,Y] = meshgrid(x,y);
parametersToTry = [0, 0, 2, 5, 1.3];
Z = Gaussian2DFitFunction(parametersToTry, [X(:),Y(:)]);
plot3(X(:),Y(:),Z,'x')


Try changing the parameters and seeing how the plotted Gaussian changes. Does it behave as you expect?

#### 1. and 2. Independent variable and observations

Surface plot of dataset for one PSF microsphere. The x and y coordinates are the independent variables and the pixel value is the dependent variable.

The Fit2dGaussian function has three inputs. Values is a vector of the dependent variable, Coordinates is a matrix of the 2-D independent variables, and PlotEnable is an optional input that turns plotting on or off (the default is false).

In the regression we are about to do, the independent variables are x and y, the coordinates of each pixel in the image, and the dependent variables are the corresponding pixel values (intensities) of each pixel. One important consideration is that the regression algorithm assumes that the independent variable has no (or very little) noise. Happily, the pixel coordinates are known with essentially zero error (the layout of pixels on the camera is extremely accurate), which means our regression will not violate this assumption. The pixel values, however, are subject to imaging noise. Considering all that, x and y are a good choice to use as independent variables and the pixel value will be the dependent variable.

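The regionprops function can compute two properties that will be useful for the regression: PixelList and PixelValues. These properties exactly correspond to the regression variables. PixelList is an N x 2 matrix that contains the x and y coordinates of each pixel in the region (regionprops lists the horizontal coordinate first). PixelValues is a vector of all the corresponding pixel values (listed in the same order as the coordinates). Fit2dGaussian labels its first coordinate column yCenter, but because the Gaussian model is circularly symmetric, which column is called x and which is called y has no effect on the fitted sigma.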

#### 4. Initial guesses

nlinfit requires an initial value for each of the five model parameters, contained in a 1x5 vector. (nlinfit infers the number of model parameters from the size of the initial-guess vector.) regionprops can calculate several properties that are useful for coming up with initial guesses. For example, Centroid is a good starting point for the center of the particle (Fit2dGaussian computes the equivalent mean of the pixel coordinates). nlinfit will refine the value as it works. The minimum pixel value is a good guess for offset, and the difference between the maximum and minimum pixel values (the range) is a good guess for amplitude. The number of pixels brighter than the midpoint between the minimum and maximum values is the basis for a guess at sigma.

Notice how the initial parameters are estimated in Fit2dGaussian (repeated below):

pixelCountAboveHalf = sum( Values > ( ( min( Values ) + max( Values ) ) / 2 ) );
sigmaInitialGuess = 0.8 * sqrt( pixelCountAboveHalf / 2 / pi / log(2) );

initialGuesses = [ ...
    mean( Coordinates(:, 1) ), ... % yCenter
    mean( Coordinates(:, 2) ), ... % xCenter
    range( Values ), ... % amplitude
    sigmaInitialGuess, ... % sigma
    min( Values ) ]; % offset
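
The expression for sigmaInitialGuess can be motivated as follows. This is a sketch of the reasoning, assuming the spot is a circular Gaussian sampled finely enough that counting pixels approximates an area. A Gaussian falls to half of its peak height at the radius $r_{1/2}$ where

$\exp\left(-\frac{r_{1/2}^2}{2\sigma^2}\right)=\frac{1}{2} \quad\Rightarrow\quad r_{1/2}^2=2\sigma^2\ln 2$.

The pixels above half height fill a disk of that radius, so their count $N_{1/2}$ (pixelCountAboveHalf in the code) is approximately

$N_{1/2}\approx\pi r_{1/2}^2=2\pi\sigma^2\ln 2 \quad\Rightarrow\quad \sigma\approx\sqrt{\frac{N_{1/2}}{2\pi\ln 2}}$,

which is the quantity under the square root in the code, with $\sigma$ in units of pixels. The code then scales this estimate by an additional factor of 0.8; the text does not say where that factor comes from.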


#### Testing the code

The first step of all regressions is to plot the observations and the model function evaluated with the initial guesses versus the independent variable on a single set of axes. Don't attempt to run nlinfit until you've done this plot. It is much easier to ensure that the arguments to nlinfit are plausible before you invoke it than to debug a screen full of cryptic, red text afterwards. Side effects of premature regression include confusion, waste of time, fatigue, irritability, alopecia, and feelings of frustration. Contact your professor if your regression lasts more than four hours. There is no chance that nlinfit will succeed if there is a problem with one of its arguments.

For example, you can test out the Fit2dGaussian function on the 2-D Gaussian that you just made above.

beta = Fit2dGaussian( Z(:), [X(:), Y(:)], true );

Pre-regression plot with observed values and model function evaluated with initial-guess parameters.

Notice that the data and the model evaluated at the initial guesses are pretty close to each other. If they didn't look remotely similar, you would need to adjust the initial parameters that you choose. nlinfit will output nonsense if you do not give it a good place to start.

You may also try testing our fitting function using one of the bright regions from your PSF image.

objectProperties = FindBrightObjectsInImage( psfImage, 0.5, 4, { 'Centroid', 'PixelList', 'PixelValues' } );
beta = Fit2dGaussian( objectProperties(1).PixelValues, objectProperties(1).PixelList, true );


Once you know that your initial guesses are good, you can be much more confident that nlinfit will output credible results.
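
One further check is to fit data generated from known parameters and confirm that the fit recovers them. The sketch below is self-contained and illustrative only: gauss2d is a hypothetical anonymous-function stand-in for Gaussian2DFitFunction, and the parameter values, noise level, and initial guesses are made up.

```matlab
rng( 1 );                                   % repeatable noise

% Stand-in for Gaussian2DFitFunction; parameters are
% [ center of column 1, center of column 2, amplitude, sigma, offset ].
gauss2d = @( p, c ) p(3) * exp( -( ( c(:,1) - p(1) ).^2 + ...
    ( c(:,2) - p(2) ).^2 ) ./ ( 2 * p(4)^2 ) ) + p(5);

[ X, Y ] = meshgrid( -10:10, -10:10 );
coordinates = [ X(:), Y(:) ];

trueParameters = [ 0.5, -1, 2, 3, 0.2 ];    % made-up values
values = gauss2d( trueParameters, coordinates ) + 0.02 * randn( size( X(:) ) );

initialGuesses = [ 0, 0, 1.5, 2, 0 ];       % deliberately a bit off
bestFit = nlinfit( coordinates, values, gauss2d, initialGuesses );

% Compare true and recovered parameters side by side.
disp( [ trueParameters(:), bestFit(:) ] )
```

If the recovered column differs noticeably from the true column, the model function or the initial guesses deserve a second look before the code is trusted on real images.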

## Repeat the process for each bright region in your image

Now that you deeply understand all that goes into nonlinear regression, repeat the process for each of the bright regions in your image. The final value of the measured resolution will be an average over all of the Airy disk sizes in the image. The function below is mostly fleshed out for you, but you may want to make a few changes and additions.

function [ Resolution, StandardError, BestFitData ] = MeasureResolutionFromPsfImage( ImageData )
    % TODO list:
    % 1. think of a good way to pick the threshold
    % 2. figure out how to eliminate images that are not single beads

    objectProperties = FindBrightObjectsInImage( ImageData, 0.5, 2, { 'Centroid', 'PixelList', 'PixelValues' } );

    figure(1);
    imshow( ImageData / max( ImageData(:) ) );
    LabelObjectsInImage( objectProperties );

    % INSERT CODE TO ELIMINATE BAD OBJECTS HERE

    BestFitData = zeros( numel(objectProperties), 5);

    figure(2);

    % use nlinfit to fit a Gaussian to each object
    for ii = 1:length(objectProperties)

        BestFitData(ii, :) = Fit2dGaussian( objectProperties(ii).PixelValues, objectProperties(ii).PixelList );

        % plot data, initial guess, and fit for each peak
        figure(2)
        clf

        % generate a triangle mesh from the best fit solution found by
        % nlinfit and plot it
        gd = delaunay( objectProperties(ii).PixelList(:,1), ...
            objectProperties(ii).PixelList(:,2) );
        trimesh( gd, objectProperties(ii).PixelList(:,1), ...
            objectProperties(ii).PixelList(:,2), ...
            Gaussian2DFitFunction(BestFitData(ii, :), ...
            objectProperties(ii).PixelList ) )
        hold on

        % plot image data
        plot3( objectProperties(ii).PixelList(:,1), ...
            objectProperties(ii).PixelList(:,2), ...
            objectProperties(ii).PixelValues, 'gx', 'LineWidth', 3)
        title(['Image data vs. Best Fit for Object Number ' num2str(ii)]);
        drawnow
    end

    % convert sigma (column 4, in pixels) to Rayleigh resolution (in pixels)
    Resolution = mean( BestFitData(:,4) ) ./ .336;
    StandardError = std( BestFitData(:,4) ./ .336 ) ./ sqrt( size( BestFitData, 1 ) );
end

function out = Gaussian2DFitFunction( Parameters, Coordinates )
    yCenter = Parameters(1);
    xCenter = Parameters(2);
    amplitude = Parameters(3);
    sigma = Parameters(4);
    offset = Parameters(5);

    out = amplitude * ...
        exp( -(( Coordinates(:, 1) - yCenter ).^2 + ( Coordinates(:, 2) - xCenter ).^2 ) ...
        ./ (2 * sigma .^ 2 )) + offset;

end

function LabelObjectsInImage( objectProperties )
    labelShift = -9;
    fontSize = 10;

    for ii = 1:length(objectProperties)
        unweightedCentroid = objectProperties(ii).Centroid;
        text(unweightedCentroid(1) + labelShift, unweightedCentroid(2), ...
            num2str(ii), 'FontSize', fontSize, 'HorizontalAlignment', ...
            'Right', 'Color', [0 1 0]);
    end

end


It's frequently the case that a few of the beads in a real PSF image are not good candidates for measuring resolution. For example, there are sometimes two beads that are too close together to separate. Sometimes, there are also aggregates of multiple beads in the picture. Identify some useful properties that regionprops computes for sorting out the bad regions and write code to eliminate them.

A bit of MATLAB syntax that you might find useful is this: you can remove an element from an array by assigning it to be the empty value []. Some examples:

RegionProperties(3) = []; % removes the third element of the struct array and reduces its size by 1
RegionProperties( logical( [ 0 0 0 1 0 1 0 0 1 0 ] ) ) = []; % removes the 4th, 6th, and 9th elements and reduces size by 3 (the mask must be logical, not numeric)
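
As one possible starting point for eliminating bad regions (a sketch, not the required solution), the example below builds a logical mask from the Area and Eccentricity properties and uses it to delete elements. The toy shapes and both limits, maxEccentricity and maxArea, are hypothetical and would need tuning on real images.

```matlab
% Toy mask: one compact blob (like a single bead) and one elongated blob
% (like two touching beads). Shapes are made up for illustration.
mask = false( 20, 20 );
mask(3:7, 3:7) = true;       % 5 x 5 square
mask(12:14, 8:16) = true;    % 3 x 9 rectangle

RegionProperties = regionprops( mask, { 'Area', 'Eccentricity' } );

% Hypothetical acceptance limits; tune them for your own images.
maxEccentricity = 0.5;
maxArea = 40;

% Comparisons return logical arrays, so isBad can index directly.
isBad = [ RegionProperties.Eccentricity ] > maxEccentricity | ...
        [ RegionProperties.Area ] > maxArea;
RegionProperties( isBad ) = [];

disp( numel( RegionProperties ) )   % regions that survived the filter
```

Comparing the labeled image before and after filtering is a straightforward way to confirm that only doubtful regions were removed.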


## Testing the code

Example image processing on PSF beads to determine microscope resolution.
Example Gaussian fit of a PSF bead fluorescence emission profile to estimate microscope resolution.

 Use the synthetic image code you developed in part 1 of this assignment to test the MeasureResolutionFromPsfImage function using synthetic images of fluorescent microspheres with a diameter of 180 nm over a range of numerical apertures from 0.1 to 1.0. Plot the results, measured resolution versus predicted resolution. Turn in your code and the plot.
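
The predicted values for the horizontal axis of that plot follow directly from the Rayleigh formula. A minimal sketch, assuming an emission wavelength of 600 nm (an assumption; use the wavelength from your own simulation):

```matlab
wavelength = 600e-9;                 % assumed wavelength in meters
numericalAperture = 0.1:0.1:1.0;     % range of NA values to test

% Rayleigh resolution R = 0.61 * lambda / NA for each NA value
predictedResolution = 0.61 * wavelength ./ numericalAperture;

% print NA next to the predicted resolution in nanometers
disp( [ numericalAperture(:), predictedResolution(:) * 1e9 ] )
```

The measured values from MeasureResolutionFromPsfImage come out in pixels, so they need to be converted to the same length units before the two are plotted against each other.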

## Measure the resolution of your microscope

1. Make an image of a sample of 170 nm fluorescent beads with the 40X objective. (Your image should capture several dozen to several hundred PSF spheres.)
• Use 12-bit mode on the camera and make sure to save the image in a format that preserves all 12 bits.
• Ensure that the image is exposed properly.
• Over-exposed images will give inaccurate results.
• Under-exposed images will be difficult to process and yield noisy results.
• This procedure is extremely sensitive to the focus adjustment.
• To minimize photobleaching, do not expose the beads to the light source any longer than necessary.
• Be sure to save the image and the histogram for your lab report.
2. Use image processing functions to locate non-overlapping, single beads in the image.
3. Use nonlinear regression to fit a Gaussian to each bead image.
4. Convert the Gaussian parameters to resolution.

 Report the resolution you measured and discuss sources of error in the measurement.
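
Because the fitted sigma values are in pixels, reporting a resolution in physical units requires the camera pixel size and the magnification. The sketch below shows one way to do the conversion. The sigma values are invented, and pixelSize is an assumed number that should be replaced with the value for your camera.

```matlab
% Hypothetical fitted sigmas in pixels (not real data).
sigmaPixels = [ 1.05; 1.10; 0.98; 1.07 ];

pixelSize = 7.4e-6;        % assumed camera pixel pitch in meters
magnification = 40;        % 40X objective, assuming the nominal value holds

% Same sigma-to-Rayleigh factor used in MeasureResolutionFromPsfImage.
resolutionPixels = sigmaPixels ./ 0.336;

% Image-plane distance divided by magnification gives sample-plane distance.
resolutionMeters = resolutionPixels * pixelSize / magnification;

Resolution = mean( resolutionMeters );
StandardError = std( resolutionMeters ) / sqrt( numel( resolutionMeters ) );
fprintf( 'R = %.0f nm +/- %.0f nm\n', Resolution * 1e9, StandardError * 1e9 );
```

If the actual magnification of your microscope differs from the nominal 40X, that difference is itself a source of error worth discussing.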