Applying SAQ-Learning Algorithm for Trading Agents in Bilateral Bargaining

In this research we apply a learning method called SAQ-Learning to agents in a single-issue bargaining process. The SAQ-Learning algorithm is an improved version of Q-Learning that uses the Metropolis criterion of the Simulated Annealing (SA) algorithm to balance exploration and exploitation. Q-Learning is one of the most important Reinforcement Learning (RL) methods because it does not require a transition model of the environment. Artificial Intelligence (AI) approaches have attracted interest in solving the bargaining problem because Game Theory (GT) relies on unrealistic assumptions, such as the presence of perfectly rational agents. By designing SAQ-Learning agents that bargain with each other over price, we obtained higher performance in terms of settlement rate, average payoff, and the time an agent needs to find its optimal policy. This learning method can therefore be a suitable learning algorithm for automated online bargaining agents in e-commerce.
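The core idea of SAQ-Learning can be sketched as a standard Q-Learning update paired with Metropolis-criterion action selection: a randomly proposed action is accepted over the greedy one with probability exp(ΔQ/T), and the temperature T is annealed so the agent explores early and exploits later. The toy single-step environment, hyperparameters, and function names below are illustrative assumptions, not the paper's actual bargaining setup:

```python
import math
import random

def saq_select(Q, state, actions, T):
    """Metropolis-criterion action selection: propose a random action and
    accept it over the greedy one with probability exp((Q_prop - Q_greedy)/T).
    High T behaves like exploration; low T approaches greedy exploitation."""
    greedy = max(actions, key=lambda a: Q[(state, a)])
    proposal = random.choice(actions)
    delta = Q[(state, proposal)] - Q[(state, greedy)]
    if delta >= 0 or random.random() < math.exp(delta / T):
        return proposal
    return greedy

def saq_learn(episodes=500, alpha=0.1, gamma=0.9, T0=1.0, cooling=0.99):
    # Illustrative toy task: in state 0, action 1 reaches a terminal goal
    # state (reward 1); action 0 stays put (reward 0).
    actions = [0, 1]
    Q = {(s, a): 0.0 for s in (0, 1) for a in actions}
    T = T0
    for _ in range(episodes):
        s = 0
        a = saq_select(Q, s, actions, max(T, 1e-3))
        r = 1.0 if a == 1 else 0.0
        s_next = 1 if a == 1 else 0
        # Standard Q-Learning update toward the bootstrapped target.
        target = r + gamma * max(Q[(s_next, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        T *= cooling  # anneal the temperature toward exploitation
    return Q
```

After training, the Q-value of the rewarding action dominates, and because T has cooled, the Metropolis rule almost always returns the greedy choice, mirroring how the annealing schedule resolves the exploration/exploitation trade-off that the abstract describes.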
