First type "load singleface" to load into MATLAB an image of a single face stored in the MATLAB file singleface.mat. This is a single monochrome image, as we have discussed in class. You can display this (or any other image) using the dispimg.m function, which is among the MATLAB files you downloaded. The dispimg.m function takes an image as input, performs some simple scaling (calling the scale.m function), sets the colormap to gray, and then displays the image. Try modifying this function so that it uses a different colormap (type "help colormap" to find out which colormaps are available) and see the effect. (You can change the function back to grayscale when you are done.)
Find the maximum, minimum, median, mean, and standard deviation of the pixel intensities in the image. Plot a histogram of the pixel intensities (use the "hist" function) and comment on what it tells you about the image. Keep in mind that, when given a matrix, all of these functions operate on each column separately, so you will need to reshape your image matrix into a single column vector first (use "reshape").
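The reshaping step can be sketched as follows. This is only an illustration on a synthetic matrix: the name of the variable that singleface.mat actually loads is not given here, so "img" below is a stand-in, not the real face image.

```matlab
% Synthetic stand-in for the face image (assumption: the real image is a
% 2-D matrix of pixel intensities, possibly stored as integers).
img = magic(8);                          % 8 x 8 matrix holding the values 1..64

% Flatten to one column so each statistic is taken over all pixels at once.
pix = reshape(double(img), [], 1);

stats = [max(pix), min(pix), median(pix), mean(pix), std(pix)];
fprintf('max=%g min=%g median=%g mean=%g std=%g\n', stats);

% With output arguments hist returns the bin counts instead of plotting;
% calling hist(pix, 16) with no outputs draws the histogram in a figure.
[counts, centers] = hist(pix, 16);
```

Without the reshape, a call such as max(img) would return one value per column rather than a single value for the whole image.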
Now write a simple function which will threshold this image to create a new image such that all pixels in the original image which are brighter than the threshold t are mapped to 1 and all pixels with intensities less than or equal to t are mapped to value 0.
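As a starting point, one possible sketch of the thresholding rule is shown below. The name "threshimg" is hypothetical, and an anonymous function is used only to keep the example in one piece; your own version should be a regular function in its own .m file.

```matlab
% Hypothetical helper: pixels strictly brighter than t map to 1,
% pixels less than or equal to t map to 0.
threshimg = @(img, t) double(img > t);

img = [10 200; 128 129];     % tiny made-up "image"
bw  = threshimg(img, 128);   % 128 itself is not brighter than t, so it maps to 0
disp(bw);
```

The comparison img > t is applied elementwise, so no loop over pixels is needed.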
---------------- your code goes here ------------------
» i2up
i2up =
20x4 struct array with fields:
directory
name
image
type
This tells us that "i2up" is a structure array of dimension 20 by 4. Each element of this array has 4 fields. The first dimension of the array varies from 1 to 20 and corresponds to 20 different people. The second dimension, varying from 1 to 4, corresponds to the specific type of expression (happy, sad, angry, neutral) recorded for each person. The i2up.type field is a string which records the name of the image type. You can ignore the directory field. The i2up.name field is a character string containing the name of the person. The i2up.image field is the actual image, a matrix of pixel values. These images are relatively low resolution, so you will not be able to see the person's expression very clearly. Thus we have 20 people and 4 images for each person, 80 images in total.
The "i2up" and the "i2straight" structure have exactly the same people in exactly the same order, with the same expressions, but in "i2up" each person has their head at an upward angle, while in "i2straight" each person has their head at a "straight head-on" angle to the camera.
First we can display individual images. If we call dispimg(i2straight(2,3).image) we will display the image for the 2nd person using the 3rd expression, in the "straight" angle to the camera. "i2straight(2,3).type" will tell us the type of expression for that image. These images are pretty low resolution (about 60 x 60 pixels) so you can actually see the pixel boundaries here. It is informative to grab the corner of the figure window and shrink the figure window on your screen - surprisingly you should be able to see the face "more clearly" as it shrinks. The pixel boundaries distract our brain when they are visible, but when we shrink the figure (and can't be distracted by the pixel boundaries) our brain can recognize more structure in the image (typically more features of the face).
Now we can display whole sets of images. The function dispset2d.m is a simple MATLAB function which takes a structure array of the form described above (e.g., "i2up") and displays a "mosaic" of the individual images on an image grid. If you call "dispset2d(i2straight(1,:));" you will display all of the images for the first person on the list. "dispset2d(i2straight(:,1));" will display the first expression for each person on the list. Note also that this function lays out the mosaic column by column (i.e., columnwise rather than row-wise). Note that there may be some blank images in the set: you will need to check that each image field contains data, i.e., that it is not just an empty matrix.
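The check for blank images can be sketched on a toy structure array. The array "s" below is made up for illustration and only mimics the field names of "i2up"; it is not the course data.

```matlab
% Toy 2 x 2 structure array with the same field names as i2up
% (all contents are invented for this example).
s = struct('directory', '', ...
           'name',  {'ann', 'ann'; 'bob', 'bob'}, ...
           'image', {ones(3), []; 2*ones(3), 3*ones(3)}, ...
           'type',  {'happy', 'sad'; 'happy', 'sad'});

% Logical mask: true where the image field actually contains pixel data.
hasdata = ~arrayfun(@(e) isempty(e.image), s);

% Row/column positions of the usable images, in column-major order.
[rows, cols] = find(hasdata);
disp([rows cols]);
```

Here s(1,2).image is empty, so hasdata(1,2) is false and that element is skipped.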
You are to write a MATLAB function which will take a structure array of images (with exactly the same fields as above, e.g., "i2up"), find the k nearest neighbor images to a specified "query image", and display both the original image and the k nearest neighbor images if "plotflag" is 1. The first image that shows up in the displayed result (the top leftmost image in the mosaic), and the first entry in the list of indices, should be the query image itself. The query image is specified by input arguments i and j, namely the (i,j)th element of the structure array. If you wish, you can provide some additional visual information to identify the query image (the first image displayed), e.g., by automatically drawing a red box around the image or some other visual cue.
The k nearest neighbor images are defined as the k images (including the query image itself) which are closest to this query image, where "closeness" is measured by Euclidean distance. Euclidean distance between two matrix images is defined the same way as for two vectors, i.e., the square root of the sum of the squared pixel-by-pixel differences. One way to do this is to convert the images to vectors and then just call your code from Assignment 2 for finding the k closest vectors to a given one. Or you can take the difference of 2 images directly by subtracting the 2 matrices corresponding to the 2 images, squaring the resulting differences, summing them up, and taking the square root of the sum. [Note: if you don't have working kNN code, or it's too slow, feel free to use the code examples provided in the slides in class. If that does not work, please feel free to email the TA or the instructor.]
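Both ways of computing the distance, and the ranking step, can be sketched on a few tiny made-up matrices. The variable names and the cell array "imgs" are assumptions for illustration; this is not the Assignment 2 code.

```matlab
A = [1 2; 3 4];
B = [1 0; 0 4];

% Direct matrix version: subtract, square, sum, square root.
D = A - B;
d_matrix = sqrt(sum(D(:).^2));

% Vector version: flatten both images and take the vector norm.
d_vector = norm(A(:) - B(:));

% Rank a small set of images by distance to a query image.
imgs = {A, B, A + 1, 10*ones(2)};
q    = 1;                              % index of the query image
dist = zeros(numel(imgs), 1);
for n = 1:numel(imgs)
    diffimg = imgs{n} - imgs{q};
    dist(n) = sqrt(sum(diffimg(:).^2));
end
[~, order] = sort(dist);               % ascending: closest first
k = 3;
nearest = order(1:k);                  % the query itself comes first (distance 0)
```

For a structure array, the linear indices in "nearest" could be turned back into (i,j) pairs with ind2sub(size(imageset), nearest).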
Finally, you need to call dispset2d.m from within this function to create your display (or adapt or replicate the necessary code from within dispset2d.m). To get this to work you may need to experiment a bit with the function dispset2d.m.
function [list] = knndispset(imageset,i,j,k,plotflag)
% function [list] = knndispset(imageset,i,j,k,plotflag)
%
% a brief description of what the function does
% ......
%
% Your Name, CS 175, date
%
% Inputs
%   imageset: an array structure of images (CS 175 format)
%   i, j: integers specifying that imageset(i,j).image is the query image
%   k: number of neighbors to find
%   plotflag: display the k nearest neighbors if plotflag = 1
%
% Outputs
%   list: a k x 2 matrix, where the first row contains the indices from imageset
%         of the nearest neighbor, the second row contains the indices of the
%         2nd nearest neighbor, and so forth.
---------------- your code goes here ------------------
To test this code, you can print out the indices (i,j) for each of the k neighbors that are found: for a query on individual i, you should find in almost all cases with this data that the 3 closest neighbors are also images from individual i.
As well as returning the specified results above, the code prints out the training accuracy and cross-validation accuracy for each example (to the screen) so that it is easy to see what is going on.
As a simple test of this code, if you use the data1 and data2 data sets (data from each of the 2 classes that are in the .mat file assignment4_simulated_data.mat), and call the function with k=1 nearest neighbors, you should get approximately the following results. There may be differences in how the random number generator works on different machines, so your cross-validation partitions might be different from those used in the runs below, and consequently the accuracies reported may be a little different, but they should be roughly similar.
>> test_classifiers(data1,data2,1,10,1234);
Training Data Results:
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 100.00
Cross Validation Results (v=10):
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 67.00
If you now try multiple different values for k, the results are as follows:
>> test_classifiers(data1,data2,[1 3 5 7 9 11 13 15 21 31 51],10,1234);
Training Data Results:
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 100.00
KNN, k=3, accuracy = 84.00
KNN, k=5, accuracy = 83.50
KNN, k=7, accuracy = 82.00
KNN, k=9, accuracy = 81.50
KNN, k=11, accuracy = 80.00
KNN, k=13, accuracy = 80.50
KNN, k=15, accuracy = 80.00
KNN, k=21, accuracy = 79.00
KNN, k=31, accuracy = 77.50
KNN, k=51, accuracy = 76.50
Cross Validation Results (v=10):
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 67.00
KNN, k=3, accuracy = 76.00
KNN, k=5, accuracy = 75.00
KNN, k=7, accuracy = 78.00
KNN, k=9, accuracy = 77.00
KNN, k=11, accuracy = 76.00
KNN, k=13, accuracy = 74.50
KNN, k=15, accuracy = 75.50
KNN, k=21, accuracy = 76.00
KNN, k=31, accuracy = 75.50
KNN, k=51, accuracy = 76.50
Note that the results will be sensitive to the random permutation of the data, i.e., different random permutations give different train/validation sets, and can give different accuracy estimates. Thus, if you want to get the same results each time you run the code, you can use the value rseed to reset the state of the pseudorandom number generator.
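The effect of resetting the generator can be sketched as follows. How test_classifiers itself uses rseed is not shown here, so this is an assumption about one reasonable approach; it uses rng, which exists in recent MATLAB releases (older releases used the legacy form rand('state', rseed) instead). The fold assignment is likewise only one possible way to partition the data.

```matlab
rseed = 1234;    % same seed value as in the example calls above
n = 20;          % number of data points (made up for this sketch)
v = 5;           % number of cross-validation folds

% Resetting the generator before each call reproduces the same permutation.
rng(rseed);
perm1 = randperm(n);
rng(rseed);
perm2 = randperm(n);

% Assign the permuted points to folds 1..v in round-robin fashion.
fold = zeros(1, n);
fold(perm1) = mod(0:n-1, v) + 1;

% Points in fold f are held out for validation; the rest are used for training.
f = 1;
valididx = find(fold == f);
trainidx = find(fold ~= f);
```

Because n is a multiple of v in this example, every fold receives exactly n/v points.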
Important: you should verify that this code is working (producing numbers similar to or the same as those above, and running in reasonable time, certainly in less than a minute for the example with multiple k values above) before you proceed to Part 4 below.
Note: some of the images are blank. You should not use these blank images for either training the classifier or testing the classifier. I suggest that you simply remove blank images when you do the image to matrix conversion. Note that if you do *not* remove the blank images, the minimum distance classifier in particular will typically give much worse performance (the nearest neighbor classifier is not affected as much in general): can you think why this would be the case for each of these 2 classifiers?
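Dropping blank images during the conversion can be sketched as below. This is not the course's image_to_matrix function, only a hedged illustration of the idea on a made-up structure array; the one-row-per-image layout of "X" is an assumption.

```matlab
% Toy structure array with one blank image at position (1,2).
s = struct('image', {[1 2; 3 4], []; [5 6; 7 8], [9 10; 11 12]});

% Keep only the elements whose image field contains data.
keep = ~arrayfun(@(e) isempty(e.image), s);
imgs = {s(keep).image};                  % column-major order over the array

% One row per image, one column per pixel (assumed layout).
X = zeros(numel(imgs), numel(imgs{1}));
for n = 1:numel(imgs)
    X(n, :) = reshape(double(imgs{n}), 1, []);
end
disp(size(X));
```

The resulting matrix has 3 rows rather than 4, because the blank image contributes no row.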
function [cvacc, trainacc] = test_imageclassifiers(imageset1,imageset2,plotflag,kvalues,v,rseed)
% function [cvacc, trainacc] = test_imageclassifiers(imageset1,imageset2,plotflag,kvalues,v,rseed)
%
% Learns a classifier to classify images in imageset1
% from images in imageset2, using minimum distance and knn classifiers,
% and returns the training and cross-validation accuracies.
%
% Your name, CS 175, date
%
% INPUTS:
%   imageset1, imageset2: arrays (of size m x n, and m2 x n2)
%       of structures, where imageset1(i,j).image is a matrix of
%       pixel (image) values of size nx by ny. It is assumed
%       that all images are of the same size in both imageset1
%       and imageset2.
%   plotflag: if plotflag=1, plot the mean image for each set,
%       and plot the difference of the means of the images in the two sets.
%   kvalues: a K x 1 vector of k values for the knn classifier
%   v: number of "folds" for v-fold cross-validation
%   rseed: value used to reset the state of the pseudorandom number generator
%
% OUTPUTS:
%   cvacc: K+1 x 1 vector of accuracies estimated using cross-validation
%   trainacc: K+1 x 1 vector of accuracies on the training data
%   where: (1) accuracy is expressed as a percentage, between 0 and 100%
%          (2) K is the number of different k values used for knn (i.e., length(kvalues))
%          (3) cvacc(i), trainacc(i), for i=1:K, is the accuracy for knn with k = kvalues(i)
%          (4) cvacc(K+1), trainacc(K+1), is the accuracy for the minimum distance classifier

% convert each imageset to a feature matrix form for classification learning
dvector1 = image_to_matrix(imageset1);
dvector2 = image_to_matrix(imageset2);

% now run the various classifiers and report the accuracy results
[cvacc, trainacc] = test_classifiers(dvector1,dvector2,kvalues,v,rseed);

% ------- end of MATLAB code --------------------
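The quantities that plotflag=1 is meant to display can be sketched as follows. The cell arrays "set1" and "set2" are made-up stand-ins for the non-blank images of the two image sets, and the display step is left as a comment since it depends on the course's dispimg.m.

```matlab
% Made-up 2 x 2 "images" standing in for the two image sets.
set1 = {[0 2; 4 6], [2 4; 6 8]};
set2 = {[1 1; 1 1], [3 3; 3 3]};

% Stack each set along a third dimension and average pixel by pixel.
mean1 = mean(cat(3, set1{:}), 3);
mean2 = mean(cat(3, set2{:}), 3);

% Pixel-by-pixel difference of the two mean images.
meandiff = mean1 - mean2;

% Each of mean1, mean2 and meandiff could then be shown in its own figure,
% e.g. with the course's dispimg function.
```

Large values in meandiff mark the pixels where the two sets differ most on average.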
To help you test this function, here is an example of the output produced by my function when called with these arguments.
>> test_imageclassifiers(rimages,simages,0,3,10,1234)
Training Data Results:
Minimum distance accuracy = 89.47
KNN, k=3, accuracy = 84.21