Evaluating information contributions of bottom-up and top-down processes

This paper presents a method to quantitatively evaluate the information contributions of individual bottom-up and top-down computing processes in object recognition. Our objective is to take a first step toward understanding how to schedule bottom-up and top-down processes. (1) We identify two bottom-up processes and one top-down process in hierarchical models, termed the α, β and γ channels respectively; (2) we formulate the three channels under a unified Bayesian framework; (3) we use a blocking control strategy to isolate the three channels, so that each can be trained separately and its information contribution measured individually in typical recognition tasks; (4) based on the measured contributions, we integrate the three channels for object detection and obtain improved performance. Our experiments cover both low- and mid-level tasks, such as detecting edges/bars and junctions, and high-level tasks, such as detecting human faces and cars, together with a human study designed to compare computer and human perception.
