Learning 3D-shape perception with local linear maps

The authors consider the task of learning to extract 3D shape information about complex objects from monocular gray-level pixel images. They show that a network architecture of local linear maps solves this task efficiently. Very little preprocessing is necessary, and no prior identification of salient object features or their image coordinates is required. The approach is demonstrated by training a network to identify the posture of a simulated robot hand with 10 joints from its image. Results show how the achieved accuracy depends on network size and on the number of available training examples. Experiments on combining several networks are also reported, and the robustness of the recognition process is discussed.
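
To illustrate what a network of local linear maps computes, the following is a minimal sketch of a forward pass. It assumes a common formulation in which each unit stores an input prototype, an output offset, and a local matrix, and the unit whose prototype lies nearest to the input produces the output. The abstract does not specify these details, so the function name `llm_predict`, the unit layout, and the toy numbers are hypothetical and are not taken from the authors' implementation; training is omitted.

```python
# Hypothetical sketch of a local linear map (LLM) network forward pass.
# Unit layout, names, and toy values are illustrative assumptions,
# not the authors' code.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def llm_predict(units, x):
    """Return the output of the unit whose input prototype is nearest to x.

    Each unit is a dict with:
      'w_in'  : input prototype, list of d_in floats
      'w_out' : output offset, list of d_out floats
      'A'     : local linear map, d_out rows of d_in floats
    The winning unit computes  y = w_out + A (x - w_in).
    """
    best = min(units, key=lambda u: sq_dist(u['w_in'], x))
    dx = [xi - wi for xi, wi in zip(x, best['w_in'])]
    return [wo + sum(a * d for a, d in zip(row, dx))
            for wo, row in zip(best['w_out'], best['A'])]


# Toy network: two units mapping a 2-D input to a 1-D output.
units = [
    {'w_in': [0.0, 0.0], 'w_out': [1.0], 'A': [[2.0, 0.0]]},
    {'w_in': [10.0, 10.0], 'w_out': [5.0], 'A': [[0.0, 1.0]]},
]

y_near_first = llm_predict(units, [1.0, 0.5])    # unit 0 wins
y_near_second = llm_predict(units, [9.0, 12.0])  # unit 1 wins
```

In the setting described above, the input would be a feature vector derived from the image and the output a vector of joint angles; both dimensions are much larger than in this toy example.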