Graph-based joint clustering of fixations and visual entities

We present a method that extracts groups of fixations and image regions for the purpose of gaze analysis and image understanding. Since the attentional relationship between visual entities conveys rich information, automatically determining the relationship provides us a semantic representation of images. We show that, by jointly clustering human gaze and visual entities, it is possible to build meaningful and comprehensive metadata that offer an interpretation about how people see images. To achieve this, we developed a clustering method that uses a joint graph structure between fixation points and over-segmented image regions to ensure a cross-domain smoothness constraint. We show that the proposed clustering method achieves better performance in relating attention to visual entities in comparison with standard clustering techniques.

[1]  Douglas DeCarlo,et al.  Robust clustering of eye movement recordings for quantification of visual interest , 2004, ETRA.

[2]  William M. Rand,et al.  Objective Criteria for the Evaluation of Clustering Methods , 1971 .

[3]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[4]  Mark de Berg,et al.  Computational geometry: algorithms and applications , 1997 .

[5]  Nisheeth K. Vishnoi,et al.  Biased normalized cuts , 2011, CVPR 2011.

[6]  M. Cugmas,et al.  On comparing partitions , 2015 .

[7]  Kurt Keutzer,et al.  Efficient, high-quality image contour detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[8]  Subramanian Ramanathan,et al.  Can computers learn from humans to see better?: inferring scene semantics from viewers' eye movements , 2011, ACM Multimedia.

[9]  Kenneth Holmqvist,et al.  Eye tracking: a comprehensive guide to methods and measures , 2011 .

[10]  Fernand Meyer,et al.  Topographic distance and watershed lines , 1994, Signal Process..

[11]  Pablo Andrés Arbeláez,et al.  Boundary Extraction in Natural Images Using Ultrametric Contour Maps , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[12]  Ralf Engbert Microsaccades: A microcosm for research on oculomotor control, attention, and visual perception. , 2006, Progress in brain research.

[13]  Malcolm Slaney,et al.  Web-Scale Multimedia Analysis: Does Content Matter? , 2011, IEEE MultiMedia.

[14]  Anton Osokin,et al.  Fast Approximate Energy Minimization with Label Costs , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Jitendra Malik,et al.  Using contours to detect and localize junctions in natural images , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Sébastien Miellet,et al.  iMap: a novel method for statistical fixation mapping of eye movement data , 2011, Behavior research methods.

[17]  Frédo Durand,et al.  Learning to predict where humans look , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[18]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Harish Katti,et al.  An Eye Fixation Database for Saliency Detection in Images , 2010, ECCV.

[20]  MatsushitaYasuyuki,et al.  Graph-based joint clustering of fixations and visual entities , 2013, TAP 2013.

[21]  Julia Hirschberg,et al.  V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure , 2007, EMNLP.

[22]  Charless C. Fowlkes,et al.  Contour Detection and Hierarchical Image Segmentation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Andrew Blake,et al.  "GrabCut" , 2004, ACM Trans. Graph..

[24]  Qiang Ji,et al.  In the Eye of the Beholder: A Survey of Models for Eyes and Gaze , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  J. Henderson,et al.  Object-based attentional selection in scene viewing. , 2010, Journal of vision.

[26]  Marie-Pierre Jolly,et al.  Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[27]  Loong Fah Cheong,et al.  Active segmentation with fixation , 2009, 2009 IEEE 12th International Conference on Computer Vision.