Assignment 4:  CS 175: Project in Artificial Intelligence

Due at 9:30am Tuesday October 30th 2006


Instructions for Assignment 4

In this assignment you will begin to work with images of faces and learn how to display them. You will use your knn classifier from Assignment 2 to classify faces into two classes. First download the zip file assignment4.zip which contains both MATLAB .m functions and MATLAB .mat files containing images.

Part 1: Displaying A Face Image

First type "load singleface" to load into MATLAB an image of a single face stored in the MATLAB file singleface.mat. This is a single monochrome image as we have discussed in class. You can display this (or any other image) using the dispimg.m function, which is among the MATLAB files you downloaded. The dispimg.m function takes as input an image, performs some simple scaling (calling the scale.m function), sets the colormap to gray, and then displays the image. Try modifying this function so that it uses a different colormap (type "help colormap" to find out what colormaps are available) and see the effect. (You can change the function back to grayscale when you are done.)

Find the maximum, minimum, median, mean, and standard deviation of the pixel intensities in the image. Plot a histogram of the pixel intensities (use the "hist" function) and comment on what it tells you about the image. Keep in mind that, when given a matrix, these functions operate on each column separately, so you will need to reshape your image matrix into a single column vector first (use "reshape.m").
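As a rough illustration of the reshaping step (a sketch only, not the required solution), the code below uses a small made-up matrix in place of the face image. The variable names img and pix are assumptions, since the name of the variable stored in singleface.mat is not given here.

```matlab
% Made-up 3 x 4 "image" standing in for the matrix loaded from singleface.mat
img = [0.1 0.4 0.7 0.2; 0.3 0.9 0.5 0.6; 0.8 0.0 0.2 0.4];

% Flatten to one column so each statistic is taken over all pixels at once
pix = reshape(img, numel(img), 1);    % same result as img(:)

fprintf('max = %.3f, min = %.3f\n', max(pix), min(pix));
fprintf('median = %.3f, mean = %.3f, std = %.3f\n', median(pix), mean(pix), std(pix));

% Histogram of the intensities using 10 bins
[counts, centers] = hist(pix, 10);
bar(centers, counts);
xlabel('pixel intensity');
ylabel('number of pixels');
```

Without the reshape, a call such as max(img) would return one value per column of the image rather than a single number.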

Now write a simple function that thresholds this image to create a new image in which all pixels of the original image that are brighter than the threshold t are mapped to 1 and all pixels with intensities less than or equal to t are mapped to 0.

Display thresholded images for 3 different values of t: (1) t = mean (average) pixel intensity of the original image, (2) t = 0.2, (3)  t=0.7. Briefly discuss the use of thresholding as a way to detect objects in images (e.g., what are the advantages and what are the potential limitations of this approach).
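The thresholding rule itself can be sketched in one line. This is only an illustration on a made-up matrix: the helper name threshold_image is hypothetical, and the example assumes pixel intensities lie between 0 and 1 (which the values t = 0.2 and t = 0.7 suggest, but you should check the range of your own image).

```matlab
% Made-up 2 x 3 "image" with intensities assumed to lie in [0, 1]
img = [0.1 0.4 0.7; 0.3 0.9 0.5];

% Hypothetical helper: pixels strictly brighter than t become 1, all others 0
threshold_image = @(im, t) double(im > t);

bw_fixed = threshold_image(img, 0.4);           % a pixel equal to t maps to 0
bw_mean  = threshold_image(img, mean(img(:)));  % threshold at the mean intensity

disp(bw_fixed);
```

Note the strict inequality: a pixel whose intensity equals t is mapped to 0, as the specification requires.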
 
 
 

Part 2: Displaying Sets of Faces

Now load the files i2up.mat and i2straight.mat into MATLAB, and you will get two arrays of data structures of size 20 x 4, one called i2up and one called i2straight.

>> i2up
i2up =
20x4 struct array with fields:
    directory
    name
    image
    type

This tells us that "i2up" is a structure array of dimension 20 by 4. Each element of this array has 4 fields. The first dimension of the array varies from 1 to 20 and corresponds to 20 different people. The second dimension, varying from 1 to 4, corresponds to the specific type of expression (happy, sad, angry, neutral) recorded for each person. The field i2up.type is a string which records the name of the image type. You can ignore the directory field. The i2up.name field is a character string containing the name of the person. The i2up.image field is the actual image, a matrix of pixel values. These images are relatively low resolution so you will not be able to see the person's expression very clearly. Thus we have 20 people and 4 images for each person, 80 images in total.

The "i2up" and the "i2straight" structure have exactly the same people in exactly the same order, with the same expressions, but in "i2up" each person has their head at an upward angle, while in "i2straight" each person has their head at a "straight head-on" angle to the camera.

First we can display individual images. If we call dispimg(i2straight(2,3).image) we will display the image for the 2nd person using the 3rd expression, in the "straight" angle to the camera.  "i2straight(2,3).type" will tell us the type of expression for that image. These images are pretty low resolution (about 60 x 60 pixels) so you can actually see the pixel boundaries here. It is informative to grab the corner of the figure window and shrink the figure window on your screen - surprisingly you should be able to see the face "more clearly" as it shrinks. The pixel boundaries distract our brain when they are visible, but when we shrink the figure (and can't be distracted by the pixel boundaries) our brain can recognize more structure in the image (typically more features of the  face).

Now we can display whole sets of images. The function dispset2d.m is a simple MATLAB function which takes an array structure of the form described above (e.g., "i2up") and displays a "mosaic" of the individual images on an image grid. If you call "dispset2d(i2straight(1,:));" you will display all of the images for the first person on the list. "dispset2d(i2straight(:,1));" will display the first expression for each person on the list. Note also that this function lays out the mosaic column by column (i.e., column-wise rather than row-wise). Note that there may be some blank images in the set: you will need to check that each image field contains data, i.e., that it is not just an empty matrix.
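To make the indexing and the blank-image check concrete, here is a small sketch on a made-up 2 x 2 structure array with the same four fields. The array name faces, the person names, and the random pixel values are all invented for illustration; the real arrays are 20 x 4.

```matlab
clear faces
names = {'personA', 'personB'};
types = {'happy', 'sad'};
for p = 1:2
    for e = 1:2
        faces(p, e).directory = '';
        faces(p, e).name = names{p};
        faces(p, e).image = rand(4, 4);
        faces(p, e).type = types{e};
    end
end
faces(2, 2).image = [];        % simulate one blank image

person1 = faces(1, :);         % all expressions for person 1 (1 x 2 array)
expr1   = faces(:, 1);         % first expression for every person (2 x 1 array)

% Logical mask with the same shape as faces, true where an image contains data
hasdata = ~arrayfun(@(s) isempty(s.image), faces);
disp(hasdata);
```

A mask like hasdata can then be used to skip blank entries before computing distances or building a display.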

You are to write a MATLAB function which takes a structure array of images (with exactly the same fields as above, e.g., "i2up") and finds the k nearest neighbor images to a specified "query image". If "plotflag" is 1, the function should also display the query image together with its nearest neighbors. The query image itself should be the first image in the displayed result (the top leftmost image in the mosaic) and the first entry in the list of indices. The query image is specified by input arguments i and j, namely the (i,j)th element of the structure array. If you wish, you can provide some additional visual information to identify the query image (the first image displayed), e.g., by automatically drawing a red box around the image or some other visual cue.

The k nearest neighbor images are defined as the k images (including the query image itself) which are closest to this query image, where "closeness" is measured by Euclidean distance. Euclidean distance between two matrix images is defined the same way as for two vectors, i.e., pixel-by-pixel differences are squared and summed, and the square root of the sum is taken. One way to do this is to convert the images to vectors and then just call your code from Assignment 2 for finding the k closest vectors to a given one. Or you can take the difference of 2 images directly by subtracting the 2 matrices corresponding to the 2 images, squaring the resulting differences, summing them up, and taking the square root of the sum. [Note: if you don't have working kNN code, or it's too slow, feel free to use the code examples provided in the slides in class. If that does not work, please feel free to email the TA or the instructor.]
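The two ways of computing the distance give the same number, as the following sketch on made-up 2 x 2 "images" shows. The second half illustrates ranking by distance with sort, under the assumption that each image has already been flattened into one row of a matrix X; the variable names are invented and this is not meant as the knndispset solution.

```matlab
A = [1 2; 3 4];
B = [1 0; 0 4];

% Matrix form: subtract, square, sum over all pixels, take the square root
sqdiff = (A - B).^2;
d_matrix = sqrt(sum(sqdiff(:)));

% Vector form: flatten both images and take the vector norm
d_vector = norm(A(:) - B(:));

% Ranking: each row of X is one flattened image, row q is the query
X = [0 0; 3 4; 1 0; 0 2];
q = 1;
delta = X - repmat(X(q, :), size(X, 1), 1);
dists = sqrt(sum(delta.^2, 2));       % distance from the query to every row
[sorted, order] = sort(dists);        % ascending, so the query (distance 0) is first
k = 3;
nearest = order(1:k);
disp(nearest');
```

Because the query is at distance 0 from itself, it appears first in the sorted order, which matches the requirement that the query be the first entry in the returned list.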

 Finally, you need to call dispset2d.m from within this function to create your display (or adapt or replicate the necessary code from within dispset2d.m). To get this to work you may need to experiment a bit with the function dispset2d.m.

function  [list] = knndispset(imageset,i,j,k,plotflag)
  % function  [list] = knndispset(imageset,i,j,k, plotflag)
  %
  %  a brief description of what the function does
  %  ......
  %                            Your Name, CS 175, date
  %
  %  Inputs
  %     imageset:  an array structure of images (CS 175 format)
  %     i, j:  integers specifying that imageset(i,j).image is the query image
  %     k: number of neighbors to find
  %     plotflag: display the k nearest neighbors if plotflag = 1;
  %
  %  Outputs
  %     list: a k x 2 matrix, where the first row contains the indices from imageset of the nearest neighbor,
  %           the second row contains the indices of the 2nd nearest neighbor, and so forth.

    ---------------- your  code goes here ------------------

To test this code, you can print out the indices (i,j) for each of the k neighbors that are found: for a query on individual i, you should find in almost all cases with this data that the 3 closest neighbors are also images from individual i.
 
 

Part 3: Comparing Classifiers using  Cross-Validation

In this part of the assignment we are giving you the code that you will need; you just need to run the code as described below and generate a table of results. As discussed in class, the training data accuracy need not be a good indicator of how a classifier will perform on new unseen data. To estimate this "generalization accuracy" we can use the technique of cross-validation as discussed in class. You will use a function called test_classifiers.m that will calculate both the cross-validation accuracy and the training accuracy for each of (a) a minimum distance classifier, and (b) K different knn classifiers, each using a different value of k (default value is K = 1, with k = 1).

The minimum distance classifier was described in class and is called minimum_distance.m, and you can find out how it is called by typing "help minimum_distance" in MATLAB.
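As a reminder of the idea only, here is a sketch that assumes the usual definition of a minimum distance classifier: a point is assigned to the class whose mean vector is closest in Euclidean distance. The data and variable names are made up, and this does not reproduce the interface of minimum_distance.m (use "help minimum_distance" for that).

```matlab
% Made-up 2-dimensional training data for two classes
data1 = [0 0; 1 0; 0 1];
data2 = [4 4; 5 4; 4 5];

% One mean vector per class (mean taken down the rows)
m1 = mean(data1, 1);
m2 = mean(data2, 1);

% Classify a new point by the nearer class mean
x = [1 1];
d1 = norm(x - m1);
d2 = norm(x - m2);
if d1 <= d2
    label = 1;
else
    label = 2;
end
fprintf('d1 = %.3f, d2 = %.3f, predicted class = %d\n', d1, d2, label);
```

Under this definition the classifier stores only the class means, which is why its training and cross-validation accuracies tend to be close to each other.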

The code below calls the function knn.m as specified for Assignment 2, with the same arguments - so you need to have a working version of knn.m for the code below to run.


The header for the function test_classifiers is as follows:


function [cvacc, trainacc] = test_classifiers(data1,data2,kvalues,v,rseed)
% function test_classifiers(data1,data2,kvalues,v,rseed)
% 
.........
%
% INPUTS:
%   data1: n1 x d feature data for class 1
%   data2: n2 x d feature data for class 2
%   kvalues: a K x 1 vector of kvalues for knn
%   v: for "v-fold" cross-validation
%   rseed: state of random seed before permuting the data order
%      (this is useful for debugging since it allows us to repeat a given run exactly).
%
% OUTPUT:
%  cvacc: K+1 x 1 vector of accuracies estimated using cross-validation
%  trainacc: K+1 x 1 vector of accuracies on the training data
%    where:  (1) accuracy is expressed as a percentage, between 0 and 100%
%                (2) K is the number of different k values used for knn (i.e., length(kvalues))
%                (3) cvacc(i), trainacc(i), for i=1:K, is the accuracy for knn with k = kvalues(i)
%                (4) cvacc(K+1), trainacc(K+1), is the accuracy for the minimum distance classifier

As well as returning the specified results above, the code prints out the training accuracy and cross-validation accuracy for each example (to the screen) so that it is easy to see what is going on.

As a simple test of this code, if you use the data1 and data2 data sets (data from each of the 2 classes that are in the .mat file assignment4_simulated_data.mat), and call the function with k=1 nearest-neighbors, you should get approximately the following results (there may be differences in how the random number generator works on different machines, so your cross-validation partitions might be different to those used in the runs below, and consequently the accuracies reported may be a little different - but they should be roughly similar).
>> test_classifiers(data1,data2,1,10,1234);
Training Data Results:
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 100.00

Cross Validation Results (v=10):
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 67.00


If you now try multiple different values for k, the results are as follows:

>> test_classifiers(data1,data2,[1 3 5 7 9 11 13 15 21 31 51],10,1234);
Training Data Results:
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 100.00
KNN, k=3, accuracy = 84.00
KNN, k=5, accuracy = 83.50
KNN, k=7, accuracy = 82.00
KNN, k=9, accuracy = 81.50
KNN, k=11, accuracy = 80.00
KNN, k=13, accuracy = 80.50
KNN, k=15, accuracy = 80.00
KNN, k=21, accuracy = 79.00
KNN, k=31, accuracy = 77.50
KNN, k=51, accuracy = 76.50

Cross Validation Results (v=10):
Minimum distance accuracy = 74.00
KNN, k=1, accuracy = 67.00
KNN, k=3, accuracy = 76.00
KNN, k=5, accuracy = 75.00
KNN, k=7, accuracy = 78.00
KNN, k=9, accuracy = 77.00
KNN, k=11, accuracy = 76.00
KNN, k=13, accuracy = 74.50
KNN, k=15, accuracy = 75.50
KNN, k=21, accuracy = 76.00
KNN, k=31, accuracy = 75.50
KNN, k=51, accuracy = 76.50

Note that the results will be sensitive to the random permutation of the data, i.e., different random permutations give different train/validation sets, and can give different accuracy estimates. Thus, if you want to get the same results each time you run the code, you can use the value rseed to reset the state of the pseudorandom number generator.
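The effect of the seed can be sketched as follows. This is an assumption about how such a partition might be built, not a copy of what test_classifiers.m does internally; the variable names are invented, and rng is the seeding call in recent MATLAB versions (older versions used rand('state', rseed) instead).

```matlab
rseed = 1234;
n = 20;       % number of data points
v = 5;        % number of folds

% Resetting the generator to the same seed reproduces the same permutation
rng(rseed);
perm1 = randperm(n);
rng(rseed);
perm2 = randperm(n);
fprintf('same permutation on both runs: %d\n', isequal(perm1, perm2));

% Assign positions in the permuted order to folds 1..v in turn
fold = mod(0:n-1, v) + 1;
test_idx  = perm1(fold == 1);    % held-out points for the first fold
train_idx = perm1(fold ~= 1);    % the remaining points are used for training
fprintf('fold 1: %d test points, %d training points\n', numel(test_idx), numel(train_idx));
```

With a different seed the permutation changes, so different points land in each fold and the accuracy estimates change accordingly.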

Important: you should verify that this code is working (producing numbers similar or the same as those above and running in reasonable time (certainly in less than a minute for the example with multiple k values above)) before you proceed to Part 4 below.
 

Part 4: Classifying  Images

You will find in the MATLAB file imagedata.mat the following sets of images: (1) rimages, an array structure with 20 images (e.g., rimages(1).image) of faces facing to the right, where each image has been shrunk from its original 120 x 128 size to 30 x 32 (this was done to save on computation time), (2) simages, same as rimages, but faces facing straight at the camera, (3) uimages, same as (1) but faces pointed upwards. You are to write a general function (that uses the functions above) that can take any 2 sets of image structures (e.g., the right set and the straight set) and compare the performance (training accuracy and cross-validation accuracy) of the minimum distance classifier and the k-nearest-neighbor classifier (as in Part 3, but you will now need to modify the test_classifiers.m function so that it can handle image structures as input). The function is defined below for you.

Note: some of the images are blank. You should not use these blank images for either training the classifier or testing the classifier.  I suggest that you simply remove blank images when you do the image to matrix conversion. Note that if you do *not* remove the blank images, the minimum distance classifier in particular will typically give much worse performance (the nearest neighbor classifier is not affected as much in general): can you think why this would be the case for each of these 2 classifiers? 
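One possible shape for the image-to-matrix conversion, with blank images dropped, is sketched below on a made-up structure array. It assumes that "blank" means the image field is an empty matrix and that all remaining images have the same size; the variable names are invented for illustration.

```matlab
% Made-up image set: two 2 x 2 images and one blank entry
imageset = struct('image', {[1 2; 3 4], [], [5 6; 7 8]});

% Keep only the entries whose image field contains data
keep = ~arrayfun(@(s) isempty(s.image), imageset);
valid = imageset(keep);

% One row per image, one column per pixel
d = numel(valid(1).image);
X = zeros(numel(valid), d);
for r = 1:numel(valid)
    im = valid(r).image;
    X(r, :) = im(:)';      % flatten column by column into a row vector
end
disp(X);
```

Each row of X can then be treated as one feature vector of the kind that test_classifiers expects in data1 and data2.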

function [cvacc, trainacc] = test_imageclassifiers(imageset1,imageset2,plotflag,kvalues,v,rseed)
% function  [cvacc, trainacc] = test_imageclassifiers(imageset1,imageset2,plotflag,kvalues,v,rseed)
%
% Learns a classifier to classify images in imageset1
% from images in imageset2, using minimum distance and knn classifiers,
% and returns the training and cross-validation accuracies.
%
%                                                          Your name, CS 175, date
%
% INPUTS:
%   imageset1, imageset2: arrays (of size m x n, and m2 x n2)
%       of structures, where imageset1(i,j).image is a matrix of
%       pixel (image) values of size nx by ny. It is assumed
%       that all images are of the same size in both imageset1
%       and imageset2.
%   plotflag: if plotflag=1, plot the mean image for each set,
%   and plot the difference of the means of the images in the two sets.
%   kvalues: a K x 1 vector of k values for the knn classifier
%   v: number of "folds" for v-fold cross-validation
%   rseed: state of random seed before permuting the data order
%
% OUTPUTS:
%  cvacc: K+1 x 1 vector of accuracies estimated using cross-validation
%  trainacc: K+1 x 1 vector of accuracies on the training data
%    where: (1) accuracy is expressed as a percentage, between 0 and 100%
%           (2) K is the number of different k values used for knn (i.e., length(kvalues))
%           (3) cvacc(i), trainacc(i), for i=1:K, is the accuracy for knn with k = kvalues(i)
%           (4) cvacc(K+1), trainacc(K+1), is the accuracy for the minimum distance classifier

% convert each imageset to a feature matrix form for classification learning
dvector1 = image_to_matrix(imageset1);
dvector2 = image_to_matrix(imageset2);

% now run the various classifiers and report the accuracy results
[cvacc, trainacc] = test_classifiers(dvector1,dvector2,kvalues,v,rseed);

%            ------- end of MATLAB code --------------------
 

To help you test this function, here is an example of the output produced by my function when called with these arguments.
>> test_imageclassifiers(rimages,simages,0,3,10,1234)

Training Data Results:
Minimum distance accuracy = 89.47
KNN, k=3, accuracy = 84.21

Cross Validation Results (v=10):
Minimum distance accuracy = 86.67
KNN, k=3, accuracy = 76.67

Note: due to variability with the way the random number generator and seeding works, the numbers you obtain may be slightly different, but should nonetheless be in the 70 and 80 percent ranges.

   


What to Turn In  (EEE submission by 9:30am on Tuesday)