HW4
Due: 5/18 12:30pm EEE Dropbox
Globally optimal tracking with dynamic programming
In this assignment, you will implement a globally optimal tracker using dynamic programming (DP), building on the code you developed in HW3. You will use the "face" images from the project video set, but only the first 150 images of that sequence. You will implement efficient tracking algorithms using dynamic programming and explore different constraints, such as fixing the track in the first and last frames. You will also explore strategies that iterate between learning a template and estimating a track.
Overview: You will be given skeleton code here. The high-level script "hw4.m" is a wrapper similar to "hw3.m" from the previous assignment. You will need to implement the basic functions listed below to run this wrapper. You will implement a fixed-scale tracker that uses a bounded-velocity motion model: a track may shift by at most a fixed number of pixels in x and in y between consecutive frames, and larger shifts within that bound incur greater costs when scoring a track. Formally, velocity_cost(dx,dy) = velocity_cost(dx) + velocity_cost(dy). The questions will ask you to explore additional extensions.
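The separable cost above can be sketched as follows. This is an illustrative Python sketch, not the MATLAB code the assignment asks for. The quadratic penalty, the "weight" parameter, and the convention of returning infinity for shifts beyond the bound are assumptions, not part of the skeleton code.

```python
import math

def velocity_cost_1d(d, max_shift, weight=1.0):
    """Cost of shifting d pixels along one axis (assumed quadratic penalty)."""
    if abs(d) > max_shift:
        return math.inf  # shifts beyond the bound are disallowed
    return weight * d * d

def velocity_cost(dx, dy, max_shift, weight=1.0):
    """Separable cost: the x and y shifts are penalized independently."""
    return (velocity_cost_1d(dx, max_shift, weight)
            + velocity_cost_1d(dy, max_shift, weight))
```

Swapping the body of velocity_cost_1d for an absolute-value penalty, or passing different weights for x and y, gives the other motion models the questions ask about.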
User interaction: The wrapper script requires a user to draw a rectangle in the first frame using MATLAB's "getrect" function. I have included a rectangle saved as a MATLAB MAT-file. For reference, a rectangle will be encoded as a 4-element array [x1 y1 x2 y2] capturing the top-left and bottom-right corners.
Helper functions
- showTrack.m [10 pts]
This function will display a movie of a track given a list of images and a list of rectangles. This function should call "showBox.m", written in the previous assignment.
Basic DP tracker
- DPTrack.m [30 pts]
This main function will estimate an optimal track using dynamic programming, given a list of images, a template, and a bounded-velocity motion model. This function will use "NCCIm.m" to compute local scores (implemented in the last homework). It will call "localmin.m" during a forward pass from the first to the last frame, iteratively estimating the cost of the best track. After reaching the last frame, this function will call "backTrack.m" to compute the best track using pointers to previous frames.
- localmin.m [20 pts]
This function computes the best matching rectangle in the previous frame, for every possible rectangle in the current frame. To do so, it needs to be passed the costs of the rectangles in the previous frame, as well as the motion model. It returns both the cost of the best matching rectangle and a pointer to that rectangle. Although this function can be implemented with fancy MATLAB tricks to avoid for-loops, I recommend using for-loops for clarity here.
- backTrack.m [20 pts]
This function implements the backtracking step of dynamic programming. Given a collection of pointers for every rectangle in every frame, and the costs of the rectangles in the final frame, this function returns the best track.
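To make the division of labour between the three functions concrete, here is a toy one-dimensional version in Python. It is a sketch under stated assumptions, not the MATLAB code to hand in: positions are indices along a line instead of rectangles in an image, the local costs are supplied directly (with NCC, which is a similarity, one option is to use its negative as the cost), and the quadratic velocity penalty and the names local_min, back_track, and dp_track are illustrative.

```python
import math

def local_min(prev_cost, max_shift, weight=1.0):
    """For each current position, find the cheapest predecessor within max_shift."""
    n = len(prev_cost)
    best_cost = [math.inf] * n
    pointer = [0] * n
    for x in range(n):
        for px in range(max(0, x - max_shift), min(n, x + max_shift + 1)):
            c = prev_cost[px] + weight * (x - px) ** 2
            if c < best_cost[x]:
                best_cost[x], pointer[x] = c, px
    return best_cost, pointer

def back_track(pointers, final_cost):
    """Follow pointers from the cheapest final position back to the first frame."""
    x = min(range(len(final_cost)), key=final_cost.__getitem__)
    track = [x]
    for ptr in reversed(pointers):
        x = ptr[x]
        track.append(x)
    return track[::-1]

def dp_track(local_costs, max_shift, weight=1.0):
    """local_costs[t][x]: matching cost of position x in frame t (lower is better)."""
    cost = list(local_costs[0])
    pointers = []
    for frame in local_costs[1:]:
        best, ptr = local_min(cost, max_shift, weight)
        cost = [b + c for b, c in zip(best, frame)]  # add this frame's local cost
        pointers.append(ptr)
    return back_track(pointers, cost), min(cost)
```

For example, dp_track([[0, 5, 5, 5], [5, 0, 5, 5], [5, 5, 0, 5]], max_shift=1) follows the low-cost diagonal and returns the track [0, 1, 2] with total cost 2.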
What to hand in: Hand in all the completed functions above, complete with comments. Also hand in figures illustrating the behaviour of various extensions of the basic tracker, as specified below. For a given strategy, show three illustrative frames - one frame of successful tracking, the frame where the tracker starts to fail, and one frame where the tracker has completely lost track. Also specify the frame at which the tracker lost track - this "time-to-failure" statistic is a common way to evaluate trackers. Note that the following extensions are not trivial, and require significantly more work beyond implementing the basic tracker above.
Q1. Implement the basic DP tracker. Experiment with different bounded-velocity motion models - e.g., using squared velocity, absolute velocity, or different costs for shifts in the x and y directions. Also experiment with different bounds on the velocity - allow the track to shift at most 0, 5, or 10 pixels between frames. Which achieves the best performance? Show three example frames and the time-to-failure statistic, as specified above. [20 pts answer to question]
Q2 (Efficient tracking). The tracker you have implemented will be relatively slow - roughly O(TNKK), where T is the number of frames, N is the number of pixels in the image, and K is the maximum allowed velocity in the x and y directions. Implement a "localmin_fast.m" that exploits the fact that the velocity cost is separable along the x and y directions (i.e., velocity_cost(dx,dy) = velocity_cost(dx) + velocity_cost(dy)). This should reduce computation to O(TNK), making the tracker noticeably faster. [20 pts code]
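The reason separability helps is that the 2D minimization factors into two nested 1D minimizations: min over (dx,dy) of [prev(x+dx, y+dy) + velocity_cost(dx) + velocity_cost(dy)] equals min over dy of [velocity_cost(dy) + min over dx of (prev(x+dx, y+dy) + velocity_cost(dx))]. The Python sketch below illustrates this with one pass along rows followed by one pass along columns, and includes a brute-force version for checking. It assumes a quadratic penalty, uses hypothetical function names, and omits the pointer bookkeeping that "localmin_fast.m" would also need.

```python
import math

def min_conv_1d(values, max_shift, weight=1.0):
    """1D pass: out[i] = min over |d| <= max_shift of values[i + d] + weight * d^2."""
    n = len(values)
    out = [math.inf] * n
    for i in range(n):
        for j in range(max(0, i - max_shift), min(n, i + max_shift + 1)):
            out[i] = min(out[i], values[j] + weight * (i - j) ** 2)
    return out

def local_min_separable(prev_cost, max_shift, weight=1.0):
    """Two 1D passes (rows, then columns) instead of one 2D window search."""
    rows = [min_conv_1d(row, max_shift, weight) for row in prev_cost]
    cols = [min_conv_1d(list(col), max_shift, weight) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major

def local_min_brute(prev_cost, max_shift, weight=1.0):
    """Reference O(N K^2) version, used only to check the separable one."""
    h, w = len(prev_cost), len(prev_cost[0])
    out = [[math.inf] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for py in range(max(0, y - max_shift), min(h, y + max_shift + 1)):
                for px in range(max(0, x - max_shift), min(w, x + max_shift + 1)):
                    c = prev_cost[py][px] + weight * ((x - px) ** 2 + (y - py) ** 2)
                    out[y][x] = min(out[y][x], c)
    return out
```

Comparing the two functions on a small random cost map is a cheap way to validate a fast implementation before running it on the full sequence.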
Q3 (Model estimation and tracking). Given the optimal track, learn a new template by averaging together the patches extracted from the tracked rectangles. Re-track with the new averaged template. Does this help? Show three frames and the time-to-failure. [10 pts code and 10 pts answer to question]
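The averaging step can be sketched as follows. This is an illustrative Python version on nested lists, assuming grayscale images, 0-based inclusive rectangle corners, and rectangles of identical size in every frame (which holds for a fixed-scale tracker). The names crop and average_template are hypothetical.

```python
def crop(image, rect):
    """Extract the patch inside rect = [x1, y1, x2, y2] (0-based, inclusive corners assumed)."""
    x1, y1, x2, y2 = rect
    return [row[x1:x2 + 1] for row in image[y1:y2 + 1]]

def average_template(images, rects):
    """Pixel-wise mean of the patches cut from each frame along the track."""
    patches = [crop(im, r) for im, r in zip(images, rects)]
    h, w = len(patches[0]), len(patches[0][0])
    return [[sum(p[y][x] for p in patches) / len(patches) for x in range(w)]
            for y in range(h)]
```

Frames in which the tracker has already drifted contribute background pixels to the average, so it may be worth comparing an average over the whole track with one over only the early frames.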
Q4 (Constrained tracking). Use "getrect.m" to obtain the true location of the rectangle in the last frame. Implement the constraint that the locations of the rectangle in the first and last frames are fixed. Find the optimal track (using DP) subject to these constraints. Does this help? Show example frames where errors from the basic tracker are fixed, and specify the time-to-failure if the tracker fails. [10 pts code and 10 pts answer to question]
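One possible way to impose the endpoint constraints without changing the DP itself is to make every other location infeasible in those two frames, by replacing its local cost with infinity. The Python sketch below shows only that masking step; the function name and the (x, y) argument convention are assumptions, and other ways of enforcing the constraint are equally valid.

```python
import math

def pin_rectangle(cost_map, fixed_xy):
    """Return a copy of cost_map in which only location fixed_xy stays feasible."""
    fx, fy = fixed_xy
    return [[c if (x, y) == (fx, fy) else math.inf
             for x, c in enumerate(row)]
            for y, row in enumerate(cost_map)]
```

Applying this mask to the local costs of the first and last frames and then running the unmodified forward pass and backtracking yields the cheapest track among those that start and end at the given locations.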
Q5 [EXTRA-CREDIT]. Implement an O(TN) tracker using a motion model that penalizes displacements using L1 distance. To do so, you will need to implement the 2D distance-transform algorithm for L1 distance. Tune the scale factors for the x and y displacements to achieve the best results. Show example frames and specify the time-to-failure. [10 pts code and 10 pts answer to question]
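For orientation, the standard two-pass formulation of the L1 distance transform can be sketched in Python as below. In 1D, a forward sweep and a backward sweep compute min over j of values[j] + scale * |i - j| in linear time, independent of any velocity bound. Because L1 distance is separable, the 2D transform applies the 1D one along rows and then along columns. The function names and scale parameters are illustrative, and the sketch returns only the transformed costs, not the pointers a tracker would also need.

```python
def dt_l1_1d(values, scale=1.0):
    """out[i] = min over j of values[j] + scale * |i - j|, in two linear passes."""
    out = list(values)
    for i in range(1, len(out)):            # forward pass: best source to the left
        out[i] = min(out[i], out[i - 1] + scale)
    for i in range(len(out) - 2, -1, -1):   # backward pass: best source to the right
        out[i] = min(out[i], out[i + 1] + scale)
    return out

def dt_l1_2d(cost, scale_x=1.0, scale_y=1.0):
    """2D transform: 1D transform along each row, then along each column."""
    rows = [dt_l1_1d(row, scale_x) for row in cost]
    cols = [dt_l1_1d(list(col), scale_y) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major
```

As with the separable search, checking the output against a brute-force minimum on a tiny cost map is a quick way to catch indexing mistakes.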
Hints
- I suggest working with half-resized images, so as to speed up the overall tracking code.
- For debugging purposes, experiment with the first 5 frames of the sequence.
- The writeup requires lots of plots. Use "subplot.m" to generate figures with multiple plots.