Localization based object recognition for smart home environments

In this paper we present a novel approach to object recognition based on image localization and registration, applied to the problem of multi-modal interaction in smart-home environments. Such environments typically contain multiple small devices that need to be controlled from a distance. A major problem in recognizing a specific object is therefore its small size in the image, compounded by typically cluttered backgrounds. To address this, we recognize an intended object by first registering the acquired image within a panorama of the environment. An environment map is then used to identify candidate objects within the user's field of view. Experimental results from running such a multi-modal input system in a live smart-home environment are presented, demonstrating the benefits of combining visual and verbal inputs.
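The lookup step described above (register the image, then consult an environment map for objects in view) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that registration has already produced a 3x3 homography `H` mapping image pixels to panorama coordinates, and that the environment map is a plain dictionary from device name to a panorama position. The names `project`, `inside_convex`, `objects_in_view`, and `env_map` are hypothetical.

```python
def project(H, x, y):
    """Apply a 3x3 homography H (nested lists) to the image point (x, y)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def inside_convex(quad, p):
    """Return True if point p lies inside or on the convex polygon quad."""
    sign = 0
    n = len(quad)
    for i in range(n):
        ax, ay = quad[i]
        bx, by = quad[(i + 1) % n]
        # The cross product tells which side of edge a->b the point is on.
        cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False
    return True


def objects_in_view(H, width, height, env_map):
    """List mapped objects inside the registered view, nearest the centre first.

    Assumption: env_map maps a device name to its (x, y) panorama position.
    """
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    quad = [project(H, x, y) for x, y in corners]
    cx, cy = project(H, width / 2, height / 2)
    hits = [(name, pos) for name, pos in env_map.items()
            if inside_convex(quad, pos)]
    # Objects closer to the view centre are more likely the intended target.
    hits.sort(key=lambda item: (item[1][0] - cx) ** 2 + (item[1][1] - cy) ** 2)
    return [name for name, _ in hits]


# Hypothetical example: the image is registered as a pure translation.
H = [[1, 0, 100], [0, 1, 50], [0, 0, 1]]
env_map = {"lamp": (400, 300), "tv": (700, 100), "radio": (900, 300)}
print(objects_in_view(H, 640, 480, env_map))
```

Ranking candidates by distance from the view centre is one simple heuristic for guessing the intended device. A verbal command could then disambiguate among the remaining candidates.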