Relational activity processes for modeling concurrent cooperation

In human-robot collaboration, multi-agent domains, or single-robot manipulation with multiple end-effectors, the activities of the involved parties are naturally concurrent. Such domains are also naturally relational: they involve objects and multiple agents, and models should generalize over both. We propose a novel formalization of relational concurrent activity processes that allows us to transfer methods from standard relational MDPs, such as Monte-Carlo planning and learning from demonstration, to concurrent cooperation domains. We formally compare this formulation to previous propositional models of concurrent decision making and demonstrate the planning and learning-from-demonstration methods on a real-world human-robot assembly task. As a purely illustrative aid, a toy Monte-Carlo planning sketch over concurrent, object-parameterized activities follows below.
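The sketch below is not the paper's formalization; it is a minimal, hypothetical illustration of the idea of planning over joint assignments of relational (object-parameterized) activities to several agents, using flat Monte-Carlo rollouts. All names (agents, parts, functions) are invented for the example.

```python
import itertools
import random

# Hypothetical toy domain: two agents attach parts concurrently.
# An activity is an object-parameterized action bound to an agent,
# e.g. ("attach", "robot", "legA"); "noop" keeps an agent idle.

AGENTS = ("robot", "human")
PARTS = ("legA", "legB")

def legal_activities(state, agent):
    """Activities an agent may start: attach any unattached part, or noop."""
    acts = [("noop", agent, None)]
    acts += [("attach", agent, p) for p in PARTS if p not in state]
    return acts

def joint_decisions(state):
    """All assignments of one activity per agent (the concurrent choice)."""
    return list(itertools.product(*(legal_activities(state, a) for a in AGENTS)))

def step(state, joint):
    """Apply a joint decision; each successful attach adds its part and yields reward 1."""
    state = set(state)
    reward = 0.0
    for name, agent, part in joint:
        if name == "attach" and part not in state:
            state.add(part)
            reward += 1.0
    return frozenset(state), reward

def rollout(state, horizon):
    """Random-policy rollout returning the accumulated reward."""
    total = 0.0
    for _ in range(horizon):
        state, r = step(state, random.choice(joint_decisions(state)))
        total += r
    return total

def plan(state, horizon=3, samples=200):
    """Flat Monte-Carlo: choose the joint decision with the best average return."""
    best, best_value = None, float("-inf")
    for joint in joint_decisions(state):
        value = 0.0
        for _ in range(samples):
            nxt, r = step(state, joint)
            value += r + rollout(nxt, horizon - 1)
        value /= samples
        if value > best_value:
            best, best_value = joint, value
    return best

if __name__ == "__main__":
    # With an empty assembly, the planner should assign distinct parts to the agents.
    print(plan(frozenset()))
```

A full treatment would use UCT-style tree search and handle activities with durations; the flat rollout planner above only conveys the structure of a concurrent, relational decision.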
