Learning by doing and the value of optimal experimentation

Research on learning-by-doing has typically been restricted to cases where estimation and control can be treated separately. Recent work has provided convergence results for more general learning problems in which experimentation is an important aspect of optimal control. However, the associated optimal policy cannot be derived analytically, because Bayesian learning introduces a nonlinearity into the dynamic programming problem. This paper characterizes the optimal policy numerically and shows that it incorporates a substantial degree of experimentation. Dynamic simulations indicate that optimal experimentation dramatically improves the speed of learning, while separating control and estimation frequently induces a long-lasting bias in the control and target variables.
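To illustrate why the choice of control affects learning, the following is a minimal sketch, not the paper's model or its numerical method. It assumes a hypothetical scalar process y = beta * x + eps with known noise variance and a normal prior on the unknown slope beta; the function name `update_belief` and all numbers are illustrative assumptions. Under these assumptions the posterior precision rises by x^2 / noise_var, so a larger control setting is more informative about beta.

```python
def update_belief(mean, var, x, y, noise_var=1.0):
    """Conjugate normal update for the slope beta in y = beta * x + eps.

    Assumed setup (not taken from the paper): eps ~ N(0, noise_var),
    prior beta ~ N(mean, var). Returns the posterior mean and variance.
    """
    precision = 1.0 / var + x * x / noise_var      # information gain grows with x^2
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + x * y / noise_var)
    return new_mean, new_var


prior_mean, prior_var = 0.0, 1.0

# Hypothetical observations generated with beta = 2 and no noise, for clarity.
mean_small, var_small = update_belief(prior_mean, prior_var, x=0.5, y=1.0)
mean_large, var_large = update_belief(prior_mean, prior_var, x=2.0, y=4.0)

print(f"small control: mean={mean_small:.2f}, var={var_small:.2f}")
print(f"large control: mean={mean_large:.2f}, var={var_large:.2f}")
```

In this sketch the larger control leaves less posterior uncertainty after a single observation. A policy that ignores this effect treats estimation and control separately, whereas a policy that accounts for it trades some current performance for faster learning.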
