Agent learning in supplier selection models

We use agent-based modeling to study the performance of a supplier selection model, originally proposed by Croson and Jacobides [Small Numbers Outsourcing: Efficient Procurement Mechanisms in a Repeated Agency Model, Working Paper #99-05-04, Department of Operations and Information Management, The Wharton School of the University of Pennsylvania (1999)], which displays a complicated reward and punishment profile under incomplete information. We document the dynamics and convergence to equilibrium of the interactions between a single buyer and a heterogeneous group of sellers. These interactions both separate sellers capable of producing high-quality goods from those incapable of doing so and give high-quality-capable sellers a continuing incentive to produce at the maximum quality possible. We model two methods of determining exploration reference points: an "auction-style" model focusing on probability of success and a "newsvendor-style" model focusing on profitability. Our simulation shows that (1) the tournament structure suffices to reach convergence at high-quality levels whenever the number of suppliers exceeds three, (2) punishment length and number of suppliers are substitutes, and (3) shorter punishments improve the speed of learning and convergence. Moreover, we show that it is strictly better for the buyer to transact with relatively few suppliers, a conclusion generated endogenously inside the model as a tradeoff between exploration and exploitation rather than through assumptions that explicitly penalize supplier proliferation.
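To make the kind of simulation described above concrete, the following is a minimal toy sketch in Python. It is not the Croson and Jacobides model and not the simulation reported here. Everything in it is an illustrative assumption: the name simulate, the four discrete quality levels, the price and cost values, the rule that the lowest-quality active seller is excluded for punish_len rounds, and the use of plain epsilon-greedy action-value learning in place of the auction-style and newsvendor-style reference points.

```python
import random

def simulate(n_sellers=4, punish_len=2, rounds=3000, eps=0.1, alpha=0.1, seed=0):
    """Toy buyer/seller tournament (all rules are assumptions, see lead-in).

    Returns the winning quality per round and each seller's quality cap.
    """
    rng = random.Random(seed)
    price, unit_cost = 10.0, 1.0          # hypothetical payoff parameters
    # Assumed heterogeneity: even-indexed sellers can reach quality 3,
    # odd-indexed sellers are capped at quality 1.
    caps = [3 if i % 2 == 0 else 1 for i in range(n_sellers)]
    q = [[0.0] * (cap + 1) for cap in caps]   # action-value estimates
    last_action = [0] * n_sellers
    banned_until = [0] * n_sellers
    winning_quality = []

    for t in range(rounds):
        active = []
        for i in range(n_sellers):
            if banned_until[i] > t:
                # An excluded seller earns nothing; credit that zero payoff
                # to the action that led to the exclusion.
                a = last_action[i]
                q[i][a] += alpha * (0.0 - q[i][a])
            else:
                active.append(i)

        # Each active seller offers a quality level (epsilon-greedy).
        bids = {}
        for i in active:
            if rng.random() < eps:
                a = rng.randrange(caps[i] + 1)                    # explore
            else:
                a = max(range(caps[i] + 1), key=lambda k: q[i][k])  # exploit
            bids[i] = a
            last_action[i] = a

        # Tournament: highest quality wins, ties broken at random.
        ranked = sorted(active, key=lambda i: (bids[i], rng.random()), reverse=True)
        winner = ranked[0]
        for i in active:
            reward = price - unit_cost * bids[i] if i == winner else 0.0
            q[i][bids[i]] += alpha * (reward - q[i][bids[i]])

        # Punishment: the lowest-ranked active seller sits out punish_len rounds.
        if len(ranked) >= 2:
            banned_until[ranked[-1]] = t + 1 + punish_len

        winning_quality.append(bids[winner])

    return winning_quality, caps
```

Calling simulate with different values of n_sellers and punish_len, and comparing the tail of the returned winning-quality series, is one way to explore the interplay between supplier count and punishment length in this toy setting. Its results should not be read as reproducing the findings stated above.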
