Multi-modal Sequential Monte Carlo for On-Line Hierarchical Graph Structure Estimation in Model-based Scene Interpretation

We present a computationally efficient, on-line graph structure estimation method for model-based scene interpretation. Different scenes have different hierarchical graphical models composed of places, objects, and parts. In general, estimating such dynamic graph structures is difficult and time-consuming. The key idea is to represent hypothesized graph structures as multi-modal particles instead of a joint particle representation. This Monte Carlo representation makes on-line hierarchical graph structure estimation feasible. The proposed method is supported by a neurobiological inference model. Large-scale experimental results in an indoor environment (12 places, 112 3D objects) validate the feasibility of the proposed inference method.
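
The contrast between a joint particle representation and separate per-node particle sets can be illustrated with a toy sketch. This is not the paper's algorithm: the function names, the node labels, and the hard 0/1 likelihoods are all assumptions made for illustration, and interactions between levels of the hierarchy are ignored. In this sketch each node (place, object) keeps its own particle set, which is reweighted and resampled independently, so the number of label hypotheses to cover grows with the sum rather than the product of the per-node label counts.

```python
import random
from collections import Counter


def resample(particles, weights, rng):
    """Multinomial resampling: draw len(particles) samples in proportion to weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def update_node(particles, likelihood, rng):
    """One importance-weighting and resampling step for a single node's particle set."""
    weights = [likelihood(p) for p in particles]
    if sum(weights) == 0:
        # No hypothesis is supported by the evidence; keep the set unchanged.
        return list(particles)
    return resample(particles, weights, rng)


def update_structure(node_particles, node_likelihoods, rng):
    """Update every node's particle set separately (hypothetical per-node scheme)."""
    return {
        node: update_node(parts, node_likelihoods[node], rng)
        for node, parts in node_particles.items()
    }


def most_likely_labels(node_particles):
    """Pick the most frequent label in each node's particle set."""
    return {
        node: Counter(parts).most_common(1)[0][0]
        for node, parts in node_particles.items()
    }


# Toy labels, assumed for illustration only.
rng = random.Random(0)
node_particles = {
    "place": ["kitchen", "office", "corridor"] * 10,
    "object": ["cup", "monitor"] * 10,
}
# Hard 0/1 likelihoods stand in for real image evidence (an assumption).
node_likelihoods = {
    "place": lambda label: 1.0 if label == "office" else 0.0,
    "object": lambda label: 1.0 if label == "monitor" else 0.0,
}

updated = update_structure(node_particles, node_likelihoods, rng)
estimate = most_likely_labels(updated)

# A joint representation would need to cover 3 * 2 label combinations,
# while the per-node sets cover 3 + 2 labels.
joint_hypotheses = 3 * 2
per_node_hypotheses = 3 + 2
```

With soft likelihoods in place of the 0/1 ones, several labels would survive resampling in each node, which is what lets a particle set keep more than one mode alive between updates.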
