Help by Predicting What to Do

Robots that assist humans with specific tasks have been demonstrated on several occasions. A more challenging idea is to anticipate human needs by inferring future demand from a prediction of the next action. To trigger this anticipation mechanism, a robot has to recognize what the human is doing now, foresee what the human will do next, and, from the connection between the two, infer what to do to help. We propose a deep network that combines the essential components of this process, enabling the robot to foresee the help it can provide in human-robot collaboration.
