Object Recognition for the Internet of Things

We present a system that lets users request information about physical objects by taking a picture of them. Using a mobile phone with an integrated camera, users can thus interact with objects, or "things", in a very simple manner. A further advantage is that the objects themselves do not have to be tagged with any kind of marker. At the core of our system lies an object recognition method that identifies an object from a query image through multiple recognition stages, using local visual features, global geometry, and optionally metadata such as GPS location. We present two applications of our system: a slide tagging application for presentation screens in smart meeting rooms and a city guide on a mobile phone. Both are fully functional and include a mobile phone application that allows simple point-and-shoot interaction with objects. Experiments evaluate the performance of our approach in both application scenarios and show good recognition results under challenging conditions.
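
The staged recognition described above (optional metadata filter, local feature matching, geometric check) could be organised roughly as in the following Python sketch. This is an illustration under stated assumptions, not the authors' implementation: all function names, thresholds, and the toy database are hypothetical, descriptors are plain tuples rather than real image features, and the translation-voting step is a simplified stand-in for a full geometric verification.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_features(query, model, ratio=0.8):
    """Nearest-neighbour matching with a ratio test.

    Each feature is ((x, y), descriptor_tuple). A match is kept only if the
    best distance is clearly smaller than the second-best distance.
    """
    matches = []
    for q_xy, q_desc in query:
        dists = sorted((math.dist(q_desc, m_desc), m_xy) for m_xy, m_desc in model)
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((q_xy, dists[0][1]))
    return matches

def count_consistent(matches, tol=5.0):
    """Simplified geometry check (assumption): size of the largest group of
    matches that agree on a single image translation, within tol pixels."""
    best = 0
    for (qx, qy), (mx, my) in matches:
        dx, dy = mx - qx, my - qy
        inliers = sum(
            1
            for (qx2, qy2), (mx2, my2) in matches
            if abs((mx2 - qx2) - dx) <= tol and abs((my2 - qy2) - dy) <= tol
        )
        best = max(best, inliers)
    return best

def recognise(query_feats, database, gps=None, radius_m=200.0, min_inliers=3):
    """Return (name, score) of the best database object, or (None, 0)."""
    best_name, best_score = None, 0
    for name, entry in database.items():
        # Stage 1 (optional): skip objects known to be far from the query GPS fix.
        if gps is not None and entry.get("gps") is not None:
            if haversine_m(*gps, *entry["gps"]) > radius_m:
                continue
        # Stage 2: local feature matching. Stage 3: geometric consistency.
        score = count_consistent(match_features(query_feats, entry["features"]))
        if score >= min_inliers and score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# Toy data (hypothetical): two objects share descriptors but differ in layout.
DESCS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0)]
db = {
    "tower": {
        "gps": (47.3769, 8.5417),
        "features": list(zip([(10, 10), (20, 10), (10, 30), (40, 40)], DESCS)),
    },
    "church": {
        "gps": None,  # no location stored, so the GPS stage never filters it
        "features": list(zip([(0, 0), (100, 0), (0, 100), (50, 50)], DESCS)),
    },
}
# Query: the "tower" layout shifted by (5, 5) pixels.
query = list(zip([(15, 15), (25, 15), (15, 35), (45, 45)], DESCS))

if __name__ == "__main__":
    print(recognise(query, db))
    print(recognise(query, db, gps=(47.3770, 8.5418)))
```

In this toy setup both objects pass descriptor matching, and only the geometric stage separates them, which illustrates why a multi-stage design is useful. A practical system would use real local features and a more general geometric model.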
