Redundant Multi-Modal Integration of Machine Vision and Programmable Mechanical Manipulation for Scene Segmentation

The main idea in this paper is that the part-whole relationships of three-dimensional objects cannot be discerned passively without a great deal of a priori information. Perceptual activity is exploratory, probing, and searching. Physical scene segmentation is the first step in active perception, and the task of perception is greatly simplified if one has to deal with only one object at a time. This work adapts the nondeterministic Turing machine model and develops strategies to control the interaction between sensors and actions for physical segmentation. Scene segmentation is formulated in graph-theoretic terms as a graph generation/decomposition problem, and an isomorphism between manipulation actions and graph decomposition operations is defined. The non-contact sensors generate directed graphs representing the spatial relations among surface regions; the manipulator decomposes these graphs under contact-sensor supervision. Assuming a finite number of sensors and actions and a goal state that is reachable and measurable with the available sensors, the control strategies converge. This was experimentally verified in a real, noisy, and dynamic environment.

Comments

University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-88-41. This technical report is available at ScholarlyCommons: http://repository.upenn.edu/cis_reports/596

REDUNDANT MULTI-MODAL INTEGRATION OF MACHINE VISION AND PROGRAMMABLE MECHANICAL MANIPULATION FOR SCENE SEGMENTATION

Constantine J. Tsikos
Ruzena K. Bajcsy

MS-CIS-88-41
GRASP LAB 144

Department of Computer and Information Science
School of Engineering and Applied Science
University of Pennsylvania
Philadelphia, PA 19104
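As an illustration of the graph-theoretic formulation summarized in the abstract, the sketch below models surface regions as nodes of a directed graph whose edges encode spatial relations, and removing a node stands in for the manipulator lifting one object out of the pile; the connected components that remain are the decomposed sub-scenes. The scene, relation names, and data structures are illustrative assumptions for exposition, not the report's implementation.

```python
# Illustrative sketch (not from the report): scene segmentation as graph decomposition.
# Surface regions are nodes; directed edges encode spatial relations such as "rests on".
# Removing a node models a manipulation action that extracts that object; the weakly
# connected components left over are the sub-scenes still awaiting segmentation.

from collections import defaultdict

def weakly_connected_components(edges, nodes):
    """Group nodes into components, ignoring edge direction."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            comp.add(n)
            stack.extend(adj[n] - seen)
        components.append(comp)
    return components

def remove_object(edges, nodes, obj):
    """Simulate a manipulation action: delete the object and its incident relations."""
    nodes = [n for n in nodes if n != obj]
    edges = [(u, v) for u, v in edges if obj not in (u, v)]
    return edges, nodes

# Hypothetical scene: a stack of three blocks on a table, plus one block lying beside it.
nodes = ["table", "block_A", "block_B", "block_C", "block_D"]
edges = [("block_A", "table"),    # A rests on the table
         ("block_B", "block_A"),  # B rests on A
         ("block_C", "block_B"),  # C rests on B
         ("block_D", "table")]    # D rests on the table

# Picking up block_B splits the stack: block_C becomes an isolated sub-scene.
edges, nodes = remove_object(edges, nodes, "block_B")
print(weakly_connected_components(edges, nodes))
# e.g. [{'table', 'block_A', 'block_D'}, {'block_C'}]
```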