Viewpoint planning with transition management for active object recognition

Active object recognition (AOR) is a paradigm in which an agent gathers additional evidence by purposefully changing its viewpoint, thereby improving recognition quality. A central problem in AOR is viewpoint planning (VP): developing a policy that determines the agent's next viewpoint. A current research trend is to solve the VP problem with reinforcement learning, that is, to train the VP policy on the viewpoint transitions explored by the agent. However, most existing methods discard transitions once they have been used for training, which makes inefficient use of the explored transitions. To address this problem, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. Specifically, a learning framework for the VP policy is first established via deterministic policy gradient theory, which makes it possible to reuse the explored transitions. Then, we design a viewpoint transition management scheme that stores the explored transitions and decides which transitions are used for policy learning. Finally, within this framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging GERMS dataset show the effectiveness of our method in comparison with several competing approaches.
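To make the idea of storing and reusing explored transitions concrete, the following is a minimal sketch of a transition store with a fixed capacity and uniform random sampling. It is not the management scheme of the paper, whose storage and selection rules are not given in the abstract; the names `Transition` and `TransitionStore`, the capacity limit, and the uniform selection rule are all assumptions made for illustration.

```python
import random
from collections import deque
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One explored viewpoint transition (hypothetical field layout)."""
    state: Sequence[float]       # observation or belief before the move
    action: float                # viewpoint change chosen by the policy
    reward: float                # feedback from the recognition result
    next_state: Sequence[float]  # observation or belief after the move
    done: bool                   # whether the episode ended


class TransitionStore:
    """Assumed scheme: keep the newest `capacity` transitions, sample uniformly."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A bounded deque drops the oldest transition once capacity is reached.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Decide which stored transitions feed the next policy update.
        # Uniform sampling without replacement is an assumption here.
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)


if __name__ == "__main__":
    store = TransitionStore(capacity=3)
    for i in range(5):
        store.add(Transition((float(i),), 0.1, 0.0, (float(i + 1),), False))
    batch = store.sample(2)
    print(len(store), len(batch))
```

Because a deterministic policy gradient learner is off-policy, batches drawn from such a store can be reused across many updates instead of being discarded after one. A prioritized or otherwise selective rule could replace the uniform `sample` without changing the interface.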
