Structure and Process: Learning of Visual Models and Construction Plans for Complex Objects

Supervising the robotic assembly of multi-functional objects with a computer vision system requires two components: one that identifies assembly operations and one that recognizes feasible assemblies of single objects. Both the structure of complex objects and their construction processes are therefore of interest. For the results of the two components to be consistent, they must share common models that provide knowledge about the intended application. However, if the assembly system is to handle more than exactly specified tasks, it is practically impossible to model every possible assembly or action explicitly. Fusing a flexible, dynamic model for assemblies with a monitor for the construction process enables reliable and efficient learning and supervision. The construction of objects by aggregating wooden toy pieces serves as an example. The system also integrates a natural speech dialog module, which provides the overall communication strategy and additionally supports decisions in cases of ambiguity and uncertainty.
