3D from looking: using wearable gaze tracking for hands-free and feedback-free object modelling

This paper presents a method for estimating the 3D shape of an object under observation using wearable gaze tracking. Starting from a sparse environment map generated by a simultaneous localization and mapping (SLAM) algorithm, we use the gaze direction positioned in 3D to extract a model of the object being observed. By letting the user simply look at the object of interest, and without providing any feedback, the method determines 3D points of regard by back-projecting the user's gaze rays into the map. The 3D points of regard are then used as seed points for segmenting the object in the captured images, and the resulting silhouettes are used to estimate the object's 3D shape. We explore methods both for removing outlier gaze points caused by the user saccading to non-object points and for reducing the error in the shape estimate. Exploiting gaze information in this way enables users of wearable gaze trackers to perform tasks as complex as object modelling in a hands-free and even feedback-free manner.
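The core geometric step, back-projecting a gaze ray into the sparse SLAM map to obtain a 3D point of regard, can be sketched as a nearest-neighbour association between the ray and the map points. This is a minimal illustrative sketch, not the paper's exact association rule: the function name, the `max_dist` gating threshold, and the nearest-point criterion are assumptions for illustration.

```python
import numpy as np

def point_of_regard(origin, direction, map_points, max_dist=0.05):
    """Associate a gaze ray with a sparse SLAM map.

    origin     : (3,) eye position in world coordinates
    direction  : (3,) gaze direction in world coordinates
    map_points : (N, 3) sparse 3D map points from SLAM
    max_dist   : hypothetical gating threshold (metres); rays passing no
                 map point closer than this yield no point of regard

    Returns (point, distance) for the map point closest to the ray,
    or (None, distance) if the gate rejects it.
    """
    d = direction / np.linalg.norm(direction)
    v = map_points - origin            # vectors from the eye to each map point
    t = np.clip(v @ d, 0.0, None)      # projection lengths; ignore points behind the eye
    closest = origin + np.outer(t, d)  # closest point on the ray to each map point
    dist = np.linalg.norm(map_points - closest, axis=1)
    i = int(np.argmin(dist))
    return (map_points[i], dist[i]) if dist[i] <= max_dist else (None, dist[i])
```

Gating on `max_dist` gives a simple way to discard gaze rays that graze no map point, e.g. saccades past the object; the paper's outlier-removal methods are more involved, but this shows where such filtering would plug in.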
