Paper information - Bottom-up/top-down image parsing by attribute graph grammar

Bottom-up/top-down image parsing by attribute graph grammar

In this paper, we present an attribute graph grammar for image parsing on scenes with man-made objects, such as buildings, hallways, kitchens, and living rooms. We choose one class of primitives, 3D planar rectangles projected on images, and six graph grammar production rules. Each production rule not only expands a node into its components, but also includes a number of equations that constrain the attributes of a parent node and those of its children. Thus our graph grammar is context sensitive. The grammar rules are used recursively to produce a large number of objects and patterns in images, and thus the whole graph grammar is a type of generative model. The inference algorithm integrates bottom-up rectangle detection, which activates top-down prediction using the grammar rules. The final results are validated in a Bayesian framework. The output of the inference is a hierarchical parsing graph with objects, surfaces, rectangles, and their spatial relations. In the inference, the acceptance of a grammar rule means recognition of an object, and actions are taken to pass the attributes between a node and its parent through the constraint equations associated with this production rule. When an attribute is passed from a child node to a parent node, it is called bottom-up, and the opposite is called top-down.
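The idea of a production rule that carries constraint equations, with attributes flowing both bottom-up and top-down, can be illustrated with a small sketch. This is not the paper's implementation: the `Row` rule, the 2D `Rect` attributes, the averaging in `bottom_up`, and the tolerance in `satisfies` are all hypothetical simplifications chosen for illustration (the paper works with 3D planar rectangles and a Bayesian validation step, neither of which is modeled here).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rect:
    """Hypothetical 2D stand-in for a rectangle primitive."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


@dataclass
class Row:
    """Parent node of an assumed 'row' rule: equally sized, evenly spaced rectangles."""
    x0: float    # left edge of the first child
    y: float     # shared top edge
    w: float     # shared width
    h: float     # shared height
    step: float  # horizontal distance between consecutive children
    n: int       # number of children observed so far


def bottom_up(children):
    """Child -> parent: estimate the parent's attributes from detected rectangles.

    Needs at least two children so that a spacing can be estimated.
    """
    kids = sorted(children, key=lambda r: r.x)
    steps = [b.x - a.x for a, b in zip(kids, kids[1:])]
    return Row(
        x0=kids[0].x,
        y=mean(r.y for r in kids),
        w=mean(r.w for r in kids),
        h=mean(r.h for r in kids),
        step=mean(steps),
        n=len(kids),
    )


def top_down(row, index):
    """Parent -> child: predict the rectangle expected at slot `index`."""
    return Rect(row.x0 + index * row.step, row.y, row.w, row.h)


def satisfies(row, children, tol=1.0):
    """Check the rule's constraint equations for every child, within `tol`."""
    kids = sorted(children, key=lambda r: r.x)
    for i, kid in enumerate(kids):
        expected = top_down(row, i)
        if (abs(kid.x - expected.x) > tol or abs(kid.y - expected.y) > tol
                or abs(kid.w - expected.w) > tol or abs(kid.h - expected.h) > tol):
            return False
    return True


# Three detected rectangles propose a Row (bottom-up); the Row then predicts
# where a fourth rectangle should be searched for (top-down).
detected = [Rect(0.0, 10.0, 4.0, 6.0), Rect(10.0, 10.0, 4.0, 6.0), Rect(20.0, 10.0, 4.0, 6.0)]
row = bottom_up(detected)
prediction = top_down(row, 3)
```

In this toy setting, accepting the `Row` rule corresponds to recognizing a pattern, and `prediction` is the kind of hypothesis a top-down step would hand back to the detector for verification.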

Feng Han | Song-Chun Zhu
