A classification method of cooking operations based on eye movement patterns

We are developing a cooking support system that coaches beginners. In this work, we focus on eye movement patterns during cooking, because gaze dynamics carry important information for understanding human behavior. As a first step, the system needs to classify typical cooking operations. In this paper, we propose a gaze-based classification method and evaluate whether eye movement patterns have the potential to classify cooking operations. We extend the conventional N-gram model of eye movement patterns, which was designed for recognizing office work. Conventionally, only the relative movement from the previous frame was used as a feature. However, because users attend to ingredients and equipment while cooking, we add fixation as a component of the N-gram. We also include eye blinks, which are related to the cognitive state. Unlike the conventional method, which focuses on statistical features, we consider the ordinal relations among fixations, blinks, and relative movements. The proposed method estimates the likelihood of each cooking operation by Support Vector Regression (SVR), using frequency histograms of N-grams as explanatory variables.
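
The feature construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the symbol alphabet (`F` for fixation, `B` for blink, `L`/`R`/`U`/`D` for the dominant direction of relative movement), the fixation threshold `fix_thresh`, the input format, and the function names `encode` and `ngram_histogram` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical symbol alphabet (an assumption, not the paper's encoding):
# F = fixation, B = blink, L/R/U/D = dominant direction of relative movement.
ALPHABET = "FBLRUD"


def encode(samples, fix_thresh=1.0):
    """Map per-frame gaze samples to a symbol string.

    Each sample is an (x, y) tuple, or None when the eye is closed (blink).
    A frame whose displacement from the previous frame is below fix_thresh
    on both axes is treated as a fixation (assumed rule).
    """
    symbols = []
    prev = None
    for s in samples:
        if s is None:
            symbols.append("B")
            prev = None  # gaze position is unknown across a blink
            continue
        if prev is not None:
            dx, dy = s[0] - prev[0], s[1] - prev[1]
            if abs(dx) < fix_thresh and abs(dy) < fix_thresh:
                symbols.append("F")
            elif abs(dx) >= abs(dy):
                symbols.append("R" if dx > 0 else "L")
            else:
                symbols.append("D" if dy > 0 else "U")
        prev = s
    return "".join(symbols)


def ngram_histogram(symbols, n=2):
    """Return the normalized frequency of every possible n-gram, in a fixed order."""
    counts = Counter(symbols[i:i + n] for i in range(len(symbols) - n + 1))
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    return [counts[k] / total for k in keys]


# Toy gaze trace: fixation, rightward move, fixation, blink, downward move.
samples = [(0, 0), (0.2, 0.1), (5, 0.1), (5.1, 0.2), None, (5, 5), (5, 9)]
sequence = encode(samples)
features = ngram_histogram(sequence, n=2)
```

Because the N-grams are built over the ordered symbol string, the histogram reflects the order in which fixations, blinks, and movements occur rather than only their overall statistics. Each fixed-length vector could then serve as the explanatory variables of a support vector regressor (for example `sklearn.svm.SVR` in scikit-learn, one regressor per cooking operation), which is again an assumption about tooling rather than something the abstract specifies.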
