VADS: Visual attention detection with a smartphone

Identifying the object that attracts a user's visual attention is an essential capability for automatic services in smart environments. However, existing solutions can only compute the gaze direction, without providing the distance to the target; moreover, most of them rely on special devices or infrastructure support. This paper explores the possibility of using a smartphone to detect the visual attention of a user. With the proposed VADS system, acquiring the location of the intended object requires only one simple action: gazing at the object while holding up the smartphone so that the object and the user's face are simultaneously captured by the rear and front cameras. We build on recent advances in computer vision to develop efficient algorithms that obtain the distance between the camera and the user, the user's gaze direction, and the object's direction from the camera. The object's location can then be computed by solving a trigonometric problem. VADS has been prototyped on commercial off-the-shelf (COTS) devices. Extensive evaluation shows that VADS achieves low error (about 1.5° in angle and 0.15 m in distance for objects within 12 m) as well as short latency. We believe VADS enables a large variety of applications in smart environments.
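The trigonometric step described above can be made concrete: given the eye-to-camera distance recovered from the front camera, the user's gaze direction, and the object's bearing from the rear camera, the object lies at the intersection of two rays, one from the eye along the gaze and one from the camera toward the object. The Python sketch below illustrates this in 2D; the function name, coordinate frame, and all numeric values are hypothetical assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def locate_object(cam_pos, eye_pos, gaze_dir, obj_dir):
    """Estimate the object's 2D position by intersecting two rays:
    the gaze ray from the eye and the rear camera's ray toward the
    detected object. Hypothetical helper, not the authors' code."""
    g = gaze_dir / np.linalg.norm(gaze_dir)   # unit gaze direction
    o = obj_dir / np.linalg.norm(obj_dir)     # unit object bearing
    # Solve eye_pos + t*g = cam_pos + s*o for (t, s): a 2x2 linear system.
    A = np.column_stack((g, -o))
    b = cam_pos - eye_pos
    t, s = np.linalg.solve(A, b)              # singular if the rays are parallel
    return eye_pos + t * g

# Assumed example: phone at the origin, user's eye 0.4 m behind it
# (from the front-camera distance estimate), object roughly ahead.
cam = np.array([0.0, 0.0])
eye = np.array([0.0, -0.4])
gaze = np.array([0.2, 1.0])                   # estimated gaze direction
obj_ray = np.array([0.27, 1.0])               # object bearing from rear camera
print(locate_object(cam, eye, gaze, obj_ray))
```

In practice the two rays rarely intersect exactly in 3D, so a real system would instead take the midpoint of the shortest segment between them; the 2D intersection above conveys the geometry of the triangle formed by eye, camera, and object.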
