Automated strategy searches in an electronic goods market: learning and complex price schedules

In an automated market for electronic goods, new problems arise that have not been well studied previously. For example, information goods are very flexible: marginal costs are negligible, and, in contrast to physical goods, nearly limitless bundling and unbundling of these items is possible. Consequently, producers can offer complex pricing schemes. However, the profit-maximizing design of a complex price schedule depends on the producer's knowledge of the distribution of consumer preferences for the available information goods. Preferences are private and can only be uncovered gradually through market experience. In this paper we compare dynamic performance across price schedules of varying complexity. We consider a producer that performs a naive, knowledge-free form of learning, and we provide it with two machine learning methods (function approximation and hill-climbing) that implement a strategy balancing exploitation, to maximize current profits, against exploration of the profit landscape, to improve future profits. We find that the tradeoff between exploitation and exploration differs depending on the learning algorithm employed and, in particular, on the complexity of the price schedule that is offered. In general, simpler price schedules are more robust and give up less profit during the learning period, even though in our stationary environment learning is eventually complete and the more complex schedules have high long-run profits. These results hold for both learning methods, even though the relative performance of the methods is quite sensitive to the choice of initial conditions and to differences in the smoothness of the profit landscape for different price schedules. Our results have implications for automated learning and strategic pricing in non-stationary environments, which arise when the consumer population changes, individuals change their preferences, or competing firms change their strategies.
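To make the exploitation/exploration tradeoff concrete, the following is a minimal sketch of a knowledge-free hill-climbing producer. It is not the authors' implementation. It assumes the simplest possible schedule (a single per-item price), zero marginal cost, and consumers whose private valuations are drawn uniformly from [0, 1]. The names `make_consumers`, `profit`, and `hill_climb`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def make_consumers(n, seed=0):
    """Assumed population: n private valuations drawn uniformly from [0, 1]."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) for _ in range(n)]


def profit(price, valuations):
    """Profit for one market period with zero marginal cost.

    A consumer buys exactly when their valuation is at least the price.
    """
    return price * sum(1 for v in valuations if v >= price)


def hill_climb(valuations, price=0.1, step=0.2, min_step=0.005):
    """Naive hill-climbing over a single price.

    Every call to profit() stands for one market period, so trying a
    worse price costs real profit (exploration), while staying at the
    best known price would be pure exploitation.
    """
    best = profit(price, valuations)
    history = [best]  # profit earned in each period, including failed probes
    while step >= min_step:
        moved = False
        for candidate in (price + step, price - step):
            if candidate < 0:
                continue  # prices cannot be negative
            trial = profit(candidate, valuations)
            history.append(trial)
            if trial > best:
                price, best, moved = candidate, trial, True
                break
        if not moved:
            step /= 2  # no neighbor improved: refine the search locally
    return price, best, history


if __name__ == "__main__":
    consumers = make_consumers(1000)
    final_price, final_profit, history = hill_climb(consumers)
    # Profit forgone while learning, relative to always charging the final price.
    forgone = final_profit * len(history) - sum(history)
    print(final_price, final_profit, len(history), forgone)
```

The `forgone` quantity printed at the end is one simple way to measure profit given up during learning. A schedule with more parameters (for example, a fee plus a per-item price) would need more probing periods under the same scheme, which is the kind of cost the comparison across schedules is concerned with.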
