Articulated Pose Estimation with Flexible Mixtures of Parts

We describe a method for detecting articulated people and estimating their pose from static images, based on a new representation of deformable part models. Rather than modeling articulation using a family of warped (rotated and foreshortened) templates, we use a mixture of small, non-oriented parts. We describe a general, flexible mixture model that jointly captures spatial relations between part locations and co-occurrence relations between part mixtures, augmenting standard pictorial structure models that encode only spatial relations. Our models have several notable properties: (1) they efficiently model articulation by sharing computation across similar warps; (2) they efficiently model an exponentially large set of global mixtures through composition of local mixtures; and (3) they capture the dependence of global geometry on local appearance (parts can look different at different spatial locations). We learn all parameters, including local appearances, spatial relations, and co-occurrence relations (which encode local rigidity), with a structured SVM solver. We introduce novel criteria for evaluating articulated human detection and pose estimation, both separately and jointly. We present experimental results on standard benchmarks that suggest our approach is the state-of-the-art system for pose estimation, improving on past work on the challenging Parse and Buffy datasets while being orders of magnitude faster.
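To make the three ingredients named above concrete (local appearance, co-occurrence of part mixtures, and spatial relations), the following is a minimal illustrative sketch of how one pose hypothesis could be scored. It is not the released code. The function names, the data layout, and the quadratic form of the spring features are assumptions made for illustration only, and the appearance scores are assumed to be precomputed template responses at each part's chosen location.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (parent part index, child part index); hypothetical layout


def deformation_features(dx: float, dy: float) -> List[float]:
    """Quadratic spring features for a relative offset (assumed form)."""
    return [dx, dx * dx, dy, dy * dy]


def configuration_score(
    locations: List[Tuple[float, float]],
    types: List[int],
    appearance: List[List[float]],
    cooccurrence: Dict[Edge, List[List[float]]],
    springs: Dict[Edge, List[List[List[float]]]],
) -> float:
    """Score one pose hypothesis: a location and a mixture type per part.

    appearance[i][t]        : template response of part i with type t at its
                              chosen location (assumed precomputed)
    cooccurrence[(i, j)]    : table indexed [type_i][type_j], favoring
                              compatible mixture pairs
    springs[(i, j)][ti][tj] : four weights applied to the spring features,
                              so the spatial term depends on the chosen types
    """
    score = 0.0
    # Local appearance: one term per part.
    for part, t in enumerate(types):
        score += appearance[part][t]
    # Pairwise terms: one co-occurrence and one spatial term per edge.
    for (i, j), bias in cooccurrence.items():
        ti, tj = types[i], types[j]
        score += bias[ti][tj]
        dx = locations[j][0] - locations[i][0]
        dy = locations[j][1] - locations[i][1]
        weights = springs[(i, j)][ti][tj]
        score += sum(w * f for w, f in zip(weights, deformation_features(dx, dy)))
    return score
```

Because the spring weights are indexed by the mixture types of both parts, the preferred geometry changes with local appearance, which is the dependence described in property (3). When the edges form a tree, a score of this shape can be maximized with dynamic programming rather than by enumerating every combination of types and locations.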

Y. Yang, D. Ramanan. "Articulated Human Detection with Flexible Mixtures of Parts." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). To appear.

Y. Yang, D. Ramanan. "Articulated Pose Estimation using Flexible Mixtures of Parts." Computer Vision and Pattern Recognition (CVPR), Colorado Springs, Colorado, June 2011.


README (2 KB): Description of contents.
pose-release1.2-basic.tgz (1 MB): Basic code for detection and pose estimation with pre-trained full-body and upper-body models.
pose-release1.2-full.tgz (89 MB): Full code for training and testing, including BUFFY, PARSE, and INRIA image benchmarks.
pose-release1.3-full.tgz (89 MB): Full code for training and testing, including image benchmarks and new evaluation routines.