Self-Supervised Learning for Visual Tracking and Recognition of Human Hand

In Proc. of AAAI 2000, pp. 243-248, Austin, Texas, July 2000.

Due to the large variation and richness of visual inputs, statistical learning is attracting growing attention in visual processing tasks such as visual tracking and recognition. Statistical models can be trained from a large set of training data. In many cases, however, obtaining a large, labeled, and representative training set is not trivial, which makes satisfactory generalization difficult to achieve. Another difficulty is how to automatically select good features for representation. By combining labeled and unlabeled training data, this paper proposes a new learning paradigm, self-supervised learning, to investigate the issues of learning bootstrapping and model transduction. Inductive learning and transductive learning are the two main cases of self-supervised learning, and the proposed algorithm, Discriminant-EM (D-EM), is a specific learning technique within this paradigm. A vision-based gesture interface is employed as a testbed in our research.
