Object detection has seen huge progress in recent years, thanks in large part to the heavily engineered Histograms of Oriented Gradients (HOG) features. Can we go beyond gradients and do better than HOG? We provide an affirmative answer by proposing and investigating a sparse representation for object detection, Histograms of Sparse Codes (HSC). We compute sparse codes with dictionaries learned from data using K-SVD, and aggregate per-pixel sparse codes to form local histograms. We intentionally stay true to the sliding window framework (with mixtures and parts) and change only the underlying features. To keep training (and testing) efficient, we apply dimension reduction by computing SVD on learned models, and adopt supervised training where latent positions of roots and parts are given externally, e.g., from a HOG-based detector. By learning and using local representations that are much more expressive than gradients, we demonstrate large improvements over the state of the art on the PASCAL benchmark for both root-only and part-based models.
X. Ren, D. Ramanan. "Histograms of Sparse Codes for Object Detection." Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013.
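The feature computation summarized in the abstract (per-pixel sparse codes aggregated into local histograms) can be illustrated with a minimal NumPy sketch. This is not the released code. It makes several assumptions for brevity: a random unit-norm dictionary stands in for one learned with K-SVD, a small greedy orthogonal matching pursuit routine is used to compute the codes, and pixels are hard-assigned to non-overlapping cells. The names `omp` and `hsc_features`, and the patch size, cell size, and sparsity level, are all hypothetical choices.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: approximate x with at most k atoms of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def hsc_features(image, D, patch=5, cell=8, k=2):
    """Sum absolute per-pixel sparse codes over cells, then L2-normalize each cell."""
    h, w = image.shape
    r = patch // 2
    n_atoms = D.shape[1]
    padded = np.pad(image, r, mode="edge")
    codes = np.zeros((h, w, n_atoms))
    for y in range(h):
        for x in range(w):
            # Mean-removed patch centered on pixel (y, x).
            p = padded[y:y + patch, x:x + patch].ravel()
            codes[y, x] = np.abs(omp(D, p - p.mean(), k))
    # Hard-assign pixels to non-overlapping cell x cell blocks (a simplification).
    ch, cw = h // cell, w // cell
    blocks = codes[:ch * cell, :cw * cell].reshape(ch, cell, cw, cell, n_atoms)
    feats = blocks.sum(axis=(1, 3))
    norms = np.linalg.norm(feats, axis=2, keepdims=True)
    return feats / np.maximum(norms, 1e-8)

# Assumption: a random unit-norm dictionary in place of a K-SVD-learned one.
rng = np.random.default_rng(0)
D = rng.standard_normal((25, 32))
D /= np.linalg.norm(D, axis=0)
image = rng.random((16, 16))
feats = hsc_features(image, D)
print(feats.shape)  # (2, 2, 32)
```

The resulting array has one histogram per cell with one bin per dictionary atom, so it could in principle replace a HOG feature map inside a sliding window detector. A real dictionary would be trained on image patches before this step.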
| File | Description | Size |
|---|---|---|
| README | Description of contents. | 2 KB |
| cvpr13_detection_final.zip | Core HSC code. | 22 MB |
| INRIA.zip | INRIA person dataset in JPEG format. | 298 MB |
| PASCAL.zip | Pretrained PASCAL 2007 model and results. | 242 MB |