CellMate: A Responsive and Accurate Vision-based Appliance Identification System

Identifying and interacting with smart appliances has been challenging in the burgeoning smart building era. Existing identification methods require either cumbersome query statements or the deployment of additional infrastructure. No platform yet abstracts sophisticated computer vision technologies into an easy visual identification interface, even though vision is the most intuitive modality for a human. We introduce CellMate, a responsive and accurate vision-based appliance identification system that uses smartphone cameras. We optimize and combine the strengths of several recent computer vision techniques to meet our constraints on accuracy, latency, and scalability. To evaluate CellMate, we collected 4008 images from 39 room-size areas across five campus buildings, a dataset one order of magnitude larger than in prior work. We also collected 1526 human-labeled images and tested them on different groups of areas. With existing indoor localization technologies, we can easily narrow the location down to ten candidate areas and achieve a success rate of more than 96% with less than 60 ms of server processing time. We optimized average local network latency to 84 ms and therefore expect around 144 ms total identification time on the smartphone end.
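To make the two-stage flow concrete, the sketch below illustrates the idea the abstract describes: coarse indoor localization first narrows the search to roughly ten candidate areas, and the query photo is then feature-matched against each candidate area's image model. This is a minimal illustrative sketch only; the class and function names (`AreaModel`, `identify`) and the choice of SIFT features with a FLANN matcher in OpenCV are assumptions for exposition, not the authors' exact implementation.

```python
# Hedged sketch of a two-stage vision-based identification flow:
# (1) indoor localization supplies ~10 candidate areas,
# (2) the query photo is matched against each area's precomputed descriptors.
# AreaModel/identify and the SIFT+FLANN choice are illustrative assumptions.
import cv2

FLANN_INDEX_KDTREE = 1

detector = cv2.SIFT_create()
matcher = cv2.FlannBasedMatcher(
    dict(algorithm=FLANN_INDEX_KDTREE, trees=5), dict(checks=50))


class AreaModel:
    """Precomputed feature descriptors for one room-size area."""

    def __init__(self, area_id, images):
        self.area_id = area_id
        self.descriptors = []
        for img in images:
            _, des = detector.detectAndCompute(img, None)
            if des is not None:
                self.descriptors.append(des)


def match_score(query_des, area):
    """Best ratio-test inlier count between the query and an area's images."""
    best = 0
    for des in area.descriptors:
        good = 0
        for pair in matcher.knnMatch(query_des, des, k=2):
            if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
                good += 1
        best = max(best, good)
    return best


def identify(query_img, candidate_areas):
    """Return the candidate area whose model best matches the query photo."""
    _, query_des = detector.detectAndCompute(query_img, None)
    if query_des is None or not candidate_areas:
        return None
    return max(candidate_areas, key=lambda a: match_score(query_des, a))
```

In practice the area models would be built offline on the server, and the candidate list would come from whatever indoor localization signal is available, which is what keeps server-side processing within the tens-of-milliseconds budget reported above.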
