HW2
Due: 4/25 11:00am EEE Dropbox
Color-based object detection
In this assignment, you will make a pixel-based classifier for different body parts of a moving person. You will use the "color" images from the project video set. The classifier, constructed through Baye's rule, will classify a pixel "x" as foreground if P(x|foregound)P(foreground) > P(x|background)P(background). You will explore different likelihood models for P(x|foreground) and P(x|background) using multivariate Gaussians and histogram models. You will use these models to detect the torso, hair, and legs of the person moving in the video.
Overview: You will be given skeleton code here. The two high level scripts "hw2a.m" and "hw2b.m" contains wrapper scripts for color-based pixel detection, using Guassian vs histogram color models. Your job is to fill in all the referenced functions to allow the given wrapper scripts to execute correctly. You may need to make small modifications to these scripts during the following questions.
User interaction: The wrapper scripts require a user to mask out an object in a given frame, using matlab's "roipoly" function. I have included a mask for the torso of the figure in zipfile, saved as a matlab matfile.
There are three sections to this assignment.
Helper functions
- vectorize.m [10 pts]
This is a helper function that flattens an image into an array of RGB vectors. This will be useful for passing pixels into the fitting functions below that learn color models.
- fitPriors.m [10 pts]
This function will estimate the maximum likelihood estimate (MLE) of the prior probability of a foreground and background labels given a a vector of binary input labels.
Gaussian color models
- logGaussian.m [20 pts]
This function will evaluate the log probability of seeing a collection of pixels under a given Gaussian model (eg, a mean RGB vector and 3x3 covariance matrix). This will be a helper function called by "classifyGaussian.m". You are not allowed to use the matlab function "mvnpdf.m".
- fitGaussian.m [20 pts]
This function will learn a Gaussian color model (eg, a mean RGB vector and a 3x3 covariance matrix) given a collection of vectorized pixels. You are not allowed to use the matlab function "cov.m".
- classifyGaussian.m [20 pts]
This function will classify an image using a foreground and background Gaussian color model, and prior models for foreground and background labels.
Histogram color models
- quantizeIm.m [20 pts]
This function will quantize an image, so that each pixel is labeled with the bin it falls into. If quantizing into 8 bins per color channel, each pixel will be labeled with a number between 1 and 512. This helper function will be called by the bottom two functions.
- fitHistogram.m [20 pts]
This function will learn a discrete probability model given a collection of pixels by counting the fraction of times a given discrete value is observed. By default, this will assign zero probability to a discrete value that is not observed. You can fix this by adding a small value to the count of every value.
- classifyHistogram.m [20 pts]
This function will classify an image using a foreground and background histogram color model, and prior models for foreground and background labels.
What to hand in: Hand in all the completed functions above, complete with comments. Also hand in a pdf with the following figures. To create the figures, select three frames from the video. Your pdf should include figures of these images and their associated pixel classification results.
- Learn color models for the hair, torso, and leg of the person in the video from the first frame. Show results for hair, torso, and leg detection on three remaining images from the video. You will need to use "roipoly.m" to generate label masks for the body parts, as shown in the skeleton code. Show results using a Gaussian color model, a 8-bins-per-channel color histogram model, and a 16-bins-per-channel color histogram model. [10 pts]
- Modify "fitGaussian.m" to "fitGaussianReg.m", so that it regularizes the learned Gaussian models by adding a constant value to the diagonal of the covariance matrices. Pass this constant value as a parameter to this function. Similarly, create a "fitHistogramReg.m" that regularize the color histogram models by adding a constant "pseudocount" to each bin before normalization. Again, pass this constant value as a parameter. Find an optimal setting of these two parameters, and show the 3 images from above (for the Gaussian and two color histogram models). [20 pts]
- Use the first two images for training. Again, use "roipoly.m" to generate the additional labels. Show the above images again, and explain why the results look better. [10 pts]
- [Extra-credit] The binary classification masks will look noisy, with lots of "speckles". Use Matlab's morphology functions ("imopen.m" and "imclose.m") to clean up the classifications by exploiting the fact that nearby pixels should have similar labels. [10 pts]
Hints
- "bsxfun.m" is a MATLAB function for applying operations to each column or row of a matrix. This may prove useful when writing "logGaussian.m".
- "reshape.m" is a MATLAB function for reshaping arrays. This may prove useful when writing "vectorize.m".
- "inv.m" is a MATLAB function for inverting square matrices. This may prove useful when writing "logGaussian.m".
- In "fitHistogram.m", you will need to compute a histogram over pixels. While this can be done with a for loop, MATLAB gurus can try to use the "sparse.m" MATLAB function to do this without a for loop (see the help file).
- All the functions can be implemented with a few lines of MATLAB code. If you are writing lots of code, there is probably a simpler solution (though you will not be penalized for long code).