Methods for Scalable and Safe Robot Learning

Robots are increasingly expected to move beyond controlled laboratory and factory environments and into real-world public spaces and homes. However, robot behavior is still usually engineered ...
