Integrating On-policy Reinforcement Learning with Multi-agent Techniques for Adaptive Service Composition

In service computing, online services and the Internet environment evolve over time, so service composition must be adaptive. It must also remain efficient when the number of candidate services is massive. This paper therefore presents a new model for large-scale, adaptive service composition based on multi-agent reinforcement learning. The model integrates on-policy reinforcement learning with game theory: the former provides adaptability in a highly dynamic environment with good online performance, and the latter enables multiple agents to cooperate on a common task (i.e., composition). In particular, we propose a multi-agent SARSA (State-Action-Reward-State-Action) algorithm, which is expected to outperform single-agent reinforcement learning methods in our composition framework. An experimental evaluation demonstrates the features of our approach.
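To make the on-policy idea concrete, the following is a minimal sketch of the textbook single-agent tabular SARSA update applied to a toy composition workflow. It is not the multi-agent algorithm proposed in the paper, and it omits the game-theoretic coordination entirely. The workflow, the service names, the reward values, and the helper names (`choose`, `sarsa_update`, `train`, `greedy_plan`) are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical two-task workflow: each task maps candidate services to an
# assumed fixed QoS reward (illustrative values, not taken from the paper).
WORKFLOW = [
    {"flight_fast": 1.0, "flight_slow": 0.0},
    {"hotel_fast": 1.0, "hotel_slow": 0.0},
]

def choose(q, state, epsilon, rng):
    """Epsilon-greedy selection among the candidate services of a task."""
    actions = list(WORKFLOW[state])
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action actually chosen next."""
    bootstrap = 0.0 if a_next is None else q[(s_next, a_next)]
    q[(s, a)] += alpha * (r + gamma * bootstrap - q[(s, a)])

def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        a = choose(q, s, epsilon, rng)
        while a is not None:
            r = WORKFLOW[s][a]  # reward observed after invoking the service
            s_next = s + 1
            # a_next is None once the last task of the workflow is done.
            a_next = (choose(q, s_next, epsilon, rng)
                      if s_next < len(WORKFLOW) else None)
            sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma)
            s, a = s_next, a_next
    return q

def greedy_plan(q):
    """Pick, for each task, the service with the highest learned value."""
    return [max(task, key=lambda a: q[(i, a)])
            for i, task in enumerate(WORKFLOW)]

q = train()
print(greedy_plan(q))
```

The update bootstraps from the action the agent actually selects next (`a_next`) rather than from the greedy maximum, which is what makes SARSA on-policy. If the rewards in `WORKFLOW` changed over time, the same loop would keep adjusting the value estimates during execution.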
