Bio-inspired robot perception coupled with robot-modeled human perception

Research statement – part 1: My overarching research goal is to provide robots with perceptual abilities that allow them to interact with humans in a human-like manner. To develop these perceptual abilities, I believe it is useful to study the principles of the human visual system. I use these principles to develop new computer vision algorithms and validate their effectiveness in intelligent robotic systems. I am enthusiastic about this approach as it offers a dual benefit: uncovering principles inherent in the human visual system, and applying these principles to its artificial counterpart. Fig. 1 depicts an overview of my research.

Perspective-taking: In our everyday lives, we often interact with other people. Although each interaction is different and hard to predict in advance, interactions are usually fluid and efficient. This is because humans take many aspects into account when interacting with each other: the relationship between the interlocutors, their familiarity with the topic, and the time and location of the interaction, among many others. More specifically, humans are remarkably good at rapidly forming models of others and adapting their actions accordingly. To form these models, humans exploit the ability to take on someone else's point of view, i.e. to take their perspective [6]. In [7], we introduced an artificial visual system that equips an iCub humanoid robot with the ability to perform perspective-taking in unknown environments using a depth camera mounted above the robot, i.e. without relying on a motion capture system or fiducial markers. Grounded in psychological studies [19, 12], perspective-taking is separated into two processes: level 1 perspective-taking comprises the ability to identify objects that are occluded from one perspective but not the other; this was implemented using line-of-sight tracing. Level 2 perspective-taking refers to understanding how an object is perceived from the other perspective, rather than just what is visible from that perspective (see [19]); this was implemented using a mental rotation process. A sketch of both processes is given below. While [7] implements perspective-taking in a robotic system, it does not provide insights into the underlying mechanisms used by humans. In [8], we therefore investigated possible implementations of perspective-taking in the human visual system using a computational model applied to a simulated robot. The model proposes that a mental rotation of the self, also termed an embodied transformation, accounts for this ability. It reproduces the reaction times of human subjects across several experiments and explains the gender differences observed in those subjects. In future work, I would like to explore the relationship between perspective-taking and active vision.
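To make the two processes concrete, the following is a minimal sketch rather than the implementation from [7]: it assumes a sparse voxel occupancy grid built from the depth camera and 4x4 homogeneous poses, and all function and variable names (is_visible, to_partner_frame, occupied, ...) are illustrative assumptions.

```python
import numpy as np

def is_visible(eye, target, occupied, voxel_size=0.05, step=0.01):
    """Level 1: march along the line of sight from `eye` to `target` and report
    False if any intermediate voxel is occupied (the target is occluded)."""
    eye, target = np.asarray(eye, dtype=float), np.asarray(target, dtype=float)
    direction = target - eye
    dist = np.linalg.norm(direction)
    direction = direction / dist
    t = step
    while t < dist - voxel_size:                 # stop just before the target itself
        point = eye + t * direction
        voxel = tuple(np.floor(point / voxel_size).astype(int))
        if voxel in occupied:                    # an obstacle blocks the line of sight
            return False
        t += step
    return True

def to_partner_frame(object_pose_world, partner_pose_world):
    """Level 2: 'mental rotation' into the partner's viewpoint, i.e. express the
    object's pose in the partner's coordinate frame (both poses are 4x4 homogeneous)."""
    return np.linalg.inv(partner_pose_world) @ object_pose_world

if __name__ == "__main__":
    # One occupied voxel halfway between the robot and the object: level 1 reports occlusion.
    occupied = {(10, 0, 0)}                      # sparse occupancy grid, keyed by voxel index
    print(is_visible([0, 0, 0], [1.0, 0, 0], occupied))   # -> False (occluded)

    # Level 2: object 1 m in front of the robot, partner 2 m ahead facing the robot.
    object_pose = np.eye(4); object_pose[0, 3] = 1.0
    partner_pose = np.eye(4); partner_pose[0, 3] = 2.0
    partner_pose[:3, :3] = np.diag([-1.0, -1.0, 1.0])     # rotated 180 degrees about z
    print(to_partner_frame(object_pose, partner_pose))    # object lies 1 m in front of the partner
```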

[1] Yiannis Demiris et al. Perspective taking in robots: A framework and computational model. 2018.
[2] Yiannis Demiris et al. Computational Modeling of Embodied Visual Perspective Taking. IEEE Transactions on Cognitive and Developmental Systems, 2020.
[3] Yiannis Demiris et al. RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments. ECCV, 2018.
[4] Qiong Huang et al. TabletGaze: dataset and analysis for unconstrained appearance-based gaze estimation in mobile tablets. Machine Vision and Applications, 2017.
[5] Chiara Bartolozzi et al. Event-Based Vision: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[6] Michael Milford et al. Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[7] Jorge Dias et al. A Bayesian hierarchy for robust gaze estimation in human-robot interaction. Int. J. Approx. Reason., 2017.
[8] Charles Blundell et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles. NIPS, 2016.
[9] J. Flavell et al. Young children's knowledge about visual perception: Further evidence for the Level 1–Level 2 distinction. 1981.
[10] Michael Milford et al. Event-Based Visual Place Recognition With Ensembles of Temporal Windows. IEEE Robotics and Automation Letters, 2020.
[11] Michael Milford et al. Where is your place, Visual Place Recognition? IJCAI, 2021.
[12] Michael Milford et al. Fast and Robust Bio-inspired Teach and Repeat Navigation. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021.
[13] Yiannis Demiris et al. RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments. IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019.
[14] Garrick Orchard et al. Advancing Neuromorphic Computing With Loihi: A Survey of Results and Outlook. Proceedings of the IEEE, 2021.
[15] Yiannis Demiris et al. Markerless perspective taking for humanoid robots in unconstrained environments. IEEE International Conference on Robotics and Automation (ICRA), 2016.
[16] Vladlen Koltun et al. High Speed and High Dynamic Range Video with an Event Camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[17] Giulio Sandini et al. Eye gaze tracking for a humanoid robot. IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), 2015.
[18] Mario Fritz et al. MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[19] Wojciech Matusik et al. Eye Tracking for Everyone. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[20] Jeffrey M. Zacks et al. Two kinds of visual perspective taking. Perception & Psychophysics, 2006.
[21] John K. Tsotsos et al. Revisiting active perception. Autonomous Robots, 2016.
[22] Yukie Nagai et al. Yet another gaze detector: An embodied calibration free system for the iCub robot. IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), 2015.