Robotic Object Sorting via Deep Reinforcement Learning: A Generalized Approach

This work proposes a general formulation of the object sorting problem, suitable for describing any non-deterministic environment characterized by friendly and adversarial interference. Coupled with a Deep Reinforcement Learning algorithm, this approach allows policies to be trained for different sorting tasks without adjusting the architecture or modifying the learning method. Briefly, the environment is subdivided into a clutter, where objects are freely located, and a set of clusters, where objects must be placed according to predefined ordering and classification rules. A 3D grid discretizes the environment: the properties of the object within each cell, including its category and order, define the cell's state. The problem is formulated as a Markov Decision Process: at each time step, the states of the cells fully define the state of the environment. Users can custom-define object classes, ordering priorities, and failure rules, the latter by assigning a non-uniform risk probability to each cell. Experiments successfully trained and validated a Deep Reinforcement Learning model on several sorting tasks while minimizing the number of moves and the probability of failure. The results demonstrate the system's ability to handle non-deterministic events, such as failures, and unpredictable external disturbances, such as human user interventions.
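To make the formulation concrete, the following is a minimal sketch (all names are hypothetical and not taken from the paper) of the grid-based state representation the abstract describes: each cell of the 3D grid holds the category and ordering priority of the object it contains, plus a user-assigned risk probability, and the joint state of all cells serves as the MDP state.

```python
from dataclasses import dataclass
from typing import Optional
import itertools

# Illustrative sketch: a cell stores the attributes of the object it
# contains (or None if empty) and a user-defined failure risk.
@dataclass
class Cell:
    category: Optional[int] = None   # user-defined object class; None = empty cell
    order: Optional[int] = None      # ordering priority within the class
    risk: float = 0.0                # probability that a move involving this cell fails

class GridState:
    """MDP state: the joint state of all cells fully defines the environment."""
    def __init__(self, shape, risk_map=None):
        self.shape = shape
        # Enumerate every (x, y, z) index of the 3D grid.
        self.cells = {
            idx: Cell(risk=(risk_map or {}).get(idx, 0.0))
            for idx in itertools.product(*(range(n) for n in shape))
        }

    def place(self, idx, category, order):
        """Put an object with the given class and priority into a cell."""
        self.cells[idx].category = category
        self.cells[idx].order = order

    def as_tuple(self):
        """Hashable encoding of the full state, usable as an MDP state id."""
        return tuple((c.category, c.order) for c in self.cells.values())

# Usage: a 2x2x1 grid where cell (1, 1, 0) carries a higher failure risk.
state = GridState((2, 2, 1), risk_map={(1, 1, 0): 0.2})
state.place((0, 0, 0), category=1, order=0)
```

This is only one possible encoding; the paper's actual state and reward definitions may differ.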
