Sandy H. Huang | Martina Zambelli | Jackie Kay | Murilo F. Martins | Yuval Tassa | Patrick M. Pilarski | Raia Hadsell
[1] Liang-Boon Wee,et al. On the dynamics of contact between space robots and configuration control for impact minimization, 1993, IEEE Trans. Robotics Autom..
[2] P. Scheidt,et al. The epidemiology of nonfatal injuries among US children and youth, 1995, American journal of public health.
[3] E. Altman. Constrained Markov Decision Processes, 1999.
[4] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[5] M. Špinka,et al. Mammalian Play: Training for the Unexpected , 2001, The Quarterly Review of Biology.
[6] Koji Ikuta,et al. Safety Evaluation Method of Design and Control for Human-Care Robots , 2003, Int. J. Robotics Res..
[7] Stuart J. Russell,et al. Q-Decomposition for Reinforcement Learning Agents , 2003, ICML.
[8] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[9] Yangsheng Xu,et al. Configuration Control of Space Robots for Impact Minimization , 2006, 2006 IEEE International Conference on Robotics and Biomimetics.
[10] E. B. H. Sandseter. Categorising risky play—how can we identify risk‐taking in children's play?, 2007.
[11] B. Morrongiello,et al. Understanding children's injury-risk behaviors: the independent contributions of cognitions and emotions, 2007, Journal of pediatric psychology.
[12] Pierre-Yves Oudeyer,et al. Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.
[13] Pierre-Yves Oudeyer,et al. What is Intrinsic Motivation? A Typology of Computational Approaches , 2007, Frontiers Neurorobotics.
[14] Christoph H. Lampert,et al. Learning Dynamic Tactile Sensing With Robust Vision-Based Training , 2011, IEEE Transactions on Robotics.
[15] J. A. Fishel,et al. Sensing tactile microvibrations with the BioTac — Comparison with human sensitivity , 2012, 2012 4th IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob).
[16] M. Brussoni,et al. Risky Play and Children’s Safety: Balancing Priorities for Optimal Child Development , 2012, International journal of environmental research and public health.
[17] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[18] Jürgen Schmidhuber,et al. Learning skills from play: Artificial curiosity on a Katana robot arm , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[19] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[20] Shimon Whiteson,et al. A Survey of Multi-Objective Sequential Decision-Making , 2013, J. Artif. Intell. Res..
[21] Pierre-Yves Oudeyer,et al. Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[22] Lihui Wang,et al. Minimizing Energy Consumption for Robot Arm Movement, 2014.
[23] Pierre-Yves Oudeyer,et al. The effects of task difficulty, novelty and the size of the search space on intrinsically motivated exploration , 2014, Front. Neurosci..
[24] Shigeki Sugano,et al. Tactile object recognition using deep learning and dropout , 2014, 2014 IEEE-RAS International Conference on Humanoid Robots.
[25] Tomás Svoboda,et al. Safe Exploration Techniques for Reinforcement Learning - An Overview , 2014, MESAS.
[26] Sergey Levine,et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models , 2015, ArXiv.
[27] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[28] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res..
[29] Dewen Hu,et al. Multiobjective Reinforcement Learning: A Comprehensive Overview , 2015, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[30] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[31] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[32] Christopher K. Hsee,et al. The Pandora Effect , 2016, Psychological science.
[33] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[34] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[35] Martina Zambelli,et al. Multimodal imitation using self-learned sensorimotor representations , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[36] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[37] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[38] Gordon Cheng,et al. A Tactile-Based Framework for Active Object Learning and Discrimination using Multimodal Robotic Skin , 2017, IEEE Robotics and Automation Letters.
[39] Jingchen Hu,et al. Pre-Impact Configuration Designing of a Robot Manipulator for Impact Minimization, 2017.
[40] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[41] Charles Blundell,et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.
[42] Justin Fu,et al. EX2: Exploration with Exemplar Models for Deep Reinforcement Learning , 2017, NIPS.
[43] S. Shankar Sastry,et al. Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning , 2017, ArXiv.
[44] Yuichiro Yoshikawa,et al. Intrinsically motivated reinforcement learning for human-robot interaction in the real-world , 2018, Neural Networks.
[45] Misha Denil,et al. Learning Awareness Models , 2018, ICLR.
[46] Yuval Tassa,et al. Safe Exploration in Continuous Action Spaces , 2018, ArXiv.
[47] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[48] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.