Combinatorial Inference and Learning for Fusing Recognition and Perceptual Grouping

The overarching goal of this NSF-funded project (IIS-1253538) is to develop integrated models for fusing recognition and perceptual grouping.

When presented with a novel image, humans typically have little problem providing a consistent interpretation of the scene in terms of contours, surfaces, junctions, and the relations between them. This process of perceptual organization is closely coupled with recognition of familiar shapes and materials. Perceptual organization can aid recognition by reducing the complexity of a cluttered scene to a small number of candidate surfaces while recognition can help resolve ambiguities in grouping based on local image cues. This project is developing a computational framework that fuses top-down information provided by recognition with bottom-up perceptual organization in order to automatically produce a coherent scene interpretation. This research includes (1) identifying local image features that provide cues to grouping and figure-ground, (2) developing libraries of composable detectors that capture the appearance of objects, parts and their spatial relations, and (3) designing models and efficient inference routines that explicitly reason about occlusion and the binding of image regions and contours into object shapes.

Integrated models of grouping and recognition are directly relevant to expanding the computer vision capabilities of robotics and assistive technologies that must operate in complex, cluttered environments. The framework being developed also has applications in automating biological image analysis, where top-down shape information is useful in resolving noisy local measurements.


S. Kong, C. Fowlkes, "Low-rank Bilinear Pooling for Fine-Grained Classification", to appear, CVPR, (July 2017). arXiv:1611.05109 [pdf]

S. Wang, S. Wolf, C. Fowlkes, J. Yarkony, "Tracking Objects with Higher Order Interactions using Delayed Column Generation", to appear, AISTATS, (April 2017). arXiv:1512.02413 [pdf]

S. Wang, C. Fowlkes, "Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions", IJCV, to appear, DOI 10.1007/s11263-016-0960-z arXiv:1610.01394 [pdf]

G. Ghiasi, C. Fowlkes, "Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation", ECCV, Amsterdam, (Oct. 2016). arXiv:1605.02264 [pdf] [code]

P. Nguyen, G. Rogez, C. Fowlkes, D. Ramanan, "The Open World of Micro-Videos", Technical Report, March 2016 arXiv:1603.09439 [pdf]

R. Díaz, M. Lee, J. Schubert, C. Fowlkes, "Lifting GIS Maps into Strong Geometric Context", WACV 2016. arXiv:1507.03698 [pdf]

J. Yarkony, C. Fowlkes, "Planar Ultrametrics for Image Segmentation", Proc. of NIPS, Dec. 2015. arXiv:1507.02407 [pdf]

S. Wang, C. Fowlkes, "Learning Optimal Parameters for Multi-target Tracking", BMVC 2015 [pdf]

G. Ghiasi, C. Fowlkes, "Using segmentation to predict the absence of occluded parts", BMVC 2015. [pdf] [data]

G. Ghiasi, C. Fowlkes, "Occlusion Coherence: Detecting and Localizing Occluded Faces", Technical Report, June 2015 arXiv:1506.08347 [pdf] [code] [dataset]

X. Zhu, C. Vondrick, C. Fowlkes, D. Ramanan, "Do we need more training data?", IJCV, DOI 10.1007/s11263-015-0812-2, March 2015 arXiv:1503.01508 [pdf]

S. Hallman, C. Fowlkes, "Oriented Edge Forests for Boundary Detection", CVPR, Boston, MA, (June 2015). arXiv:1412.2066 [pdf] [code]

S. Wang, C. Fowlkes, "Learning Multi-target Tracking with Quadratic Object Interactions", Technical report, arXiv:1412.2066 (Dec. 2014) [pdf]

G. Ghiasi, C. Fowlkes, "Occlusion Coherence: Localizing Occluded Faces with a Hierarchical Deformable Part Model", CVPR, Columbus, OH, (June 2014). [pdf]

G. Ghiasi, Y. Yang, D. Ramanan, C. Fowlkes, "Parsing Occluded People", CVPR, Columbus, OH, (June 2014). [pdf]

R. Díaz, S. Hallman, C. Fowlkes, "Detecting Dynamic Objects with Multi-View Background Subtraction", ICCV, Sydney, Australia (December 2013). [pdf]

R. Díaz, S. Hallman, C. Fowlkes, "Multi-View Background Subtraction for Object Detection", Scene Understanding Workshop, Portland, OR, (June 2013).

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.