Automated Action Evaluation for Robotic Imitation Learning via Siamese Neural Networks

Despite recent advances in video-guided robotic imitation learning, many methods still rely on human experts to provide sparse rewards that indicate whether a robot has successfully completed a task. Enabling robots to autonomously evaluate whether their actions can complete complex, multi-stage tasks remains an open challenge. In this work, we propose an efficient few-shot robotic learning algorithm that learns and evaluates from a third-person perspective to address this challenge. We develop a novel Siamese neural network-based robotic action-state evaluation system, named “Behavior-Outcome Dual Assessment” (BODA), within our robotic imitation learning system, in order to replace manual evaluation by human experts in multi-stage imitation learning and to improve learning efficiency. In our approach, a single video demonstration of a target task is divided into several stages. For each stage, BODA contains two Siamese neural network-based evaluation modules: one focuses on action changes, and the other handles changes in the working environment. The two modules work together to provide a comprehensive assessment of whether the robot has completed each stage, in terms of both its actions and the resulting changes in the working environment. BODA is then integrated into a model-based reinforcement learning framework to complete the imitation learning cycle. Extensive experiments demonstrate that BODA can automatically and accurately evaluate task completion status without human intervention. In contrast to conventional methods, BODA keeps the accumulation of errors within acceptable limits through stage-wise self-assessment.
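As a rough illustration of the dual-assessment idea, the following minimal Python sketch pairs two Siamese-style comparators for a single stage, one for action features and one for working-environment features. It is not the BODA implementation: the names `SiameseEvaluator` and `stage_completed`, the random linear embedding (a stand-in for a trained network), the Euclidean distance, the thresholds, and the rule that both modules must agree are all assumptions made for illustration. The abstract states only that the two modules work together.

```python
import numpy as np


class SiameseEvaluator:
    """Toy Siamese comparator: both inputs pass through the SAME weights.

    Hypothetical stand-in for a trained Siamese network; the weights here
    are random, not learned.
    """

    def __init__(self, in_dim, embed_dim=8, threshold=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Shared projection used for both branches of the Siamese pair.
        self.W = rng.normal(scale=1.0 / np.sqrt(in_dim), size=(in_dim, embed_dim))
        self.threshold = threshold  # assumed decision threshold

    def embed(self, x):
        # Map a feature vector into the shared embedding space.
        return np.tanh(np.asarray(x, dtype=float) @ self.W)

    def distance(self, demo, robot):
        # Euclidean distance between the two embeddings.
        return float(np.linalg.norm(self.embed(demo) - self.embed(robot)))

    def matches(self, demo, robot):
        # A small distance means the robot's features resemble the demonstration's.
        return self.distance(demo, robot) <= self.threshold


def stage_completed(action_eval, env_eval,
                    demo_action, robot_action, demo_env, robot_env):
    """Assumed combination rule: a stage passes only if BOTH modules agree."""
    return (action_eval.matches(demo_action, robot_action)
            and env_eval.matches(demo_env, robot_env))


# Hypothetical feature vectors for one stage of a demonstration.
action_eval = SiameseEvaluator(in_dim=4, seed=0)
env_eval = SiameseEvaluator(in_dim=4, seed=1)
demo_action = np.array([0.1, 0.2, 0.3, 0.4])
demo_env = np.array([1.0, 0.0, 0.0, 1.0])

print(stage_completed(action_eval, env_eval,
                      demo_action, demo_action, demo_env, demo_env))
```

In a full system, a per-stage result like this could serve as the automatic reward signal that otherwise comes from a human expert, with each stage assessed before the next one begins.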
