Paper information - Imputing a convex objective function

Imputing a convex objective function

We consider an optimizing process (or parametric optimization problem), i.e., an optimization problem that depends on some parameters. We present a method for imputing or estimating the objective function, based on observations of optimal or nearly optimal choices of the variable for several values of the parameter, and prior knowledge (or assumptions) about the objective. Applications include estimation of consumer utility functions from purchasing choices, estimation of value functions in control problems, given observations of an optimal (or just good) controller, and estimation of cost functions in a flow network.

Stephen P. Boyd | Yang Wang | Arezou Keshavarz

[1] R. E. Kalman, et al. When Is a Linear Control System Optimal, 1964.

[2] R. Bellman. Dynamic programming, 1957, Science.

[3] R. Weiner. Lecture Notes in Economics and Mathematical Systems, 1985.

[4] Philippe L. Toint, et al. On an instance of the inverse shortest paths problem, 1992, Math. Program.

[5] Stephen P. Boyd, et al. Linear Matrix Inequalities in Systems and Control Theory, 1994.

[6] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.

[7] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.

[8] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.

[9] David K. Smith, et al. Dynamic Programming and Optimal Control. Volume 1, 1996.

[10] P. Toint, et al. The inverse shortest paths problem with upper bounds on shortest paths costs, 1997.

[11] J. W. Nieuwenhuis, et al. Boekbespreking van D.P. Bertsekas (ed.), Dynamic programming and optimal control - volume 2, 1999.

[12] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.

[13] Jan M. Maciejowski, et al. Predictive control: with constraints, 2002.

[14] Alberto Bemporad, et al. The explicit linear quadratic regulator for constrained systems, 2003, Autom.

[15] Benjamin Van Roy, et al. The Linear Programming Approach to Approximate Dynamic Programming, 2003, Oper. Res.

[16] C. L. Benkard, et al. Estimating Dynamic Models of Imperfect Competition, 2004.

[17] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.

[18] Thomas D. Nielsen, et al. Learning a decision maker's utility function from (possibly) inconsistent behavior, 2004, Artif. Intell.

[19] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.

[20] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.

[21] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[22] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.

[23] Warren B. Powell, et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality, 2007.

[24] Csaba Szepesvári, et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods, 2007, UAI.

[25] Eyal Amir, et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.

[26] Steven T. Berry, et al. Chapter 63 Econometric Tools for Analyzing Market Outcomes, 2007.

[27] Edward E. Leamer, et al. Econometric Tools for Analyzing Market Outcomes, 2007.

[28] Panos M. Pardalos, et al. Approximate dynamic programming: solving the curses of dimensionality, 2009, Optim. Methods Softw.

[29] Pieter Abbeel, et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning, 2010, Int. J. Robotics Res.

[30] Stephen P. Boyd, et al. Fast Model Predictive Control Using Online Optimization, 2010, IEEE Transactions on Control Systems Technology.

[31] Stephen P. Boyd, et al. Fast Evaluation of Quadratic Control-Lyapunov Policy, 2011, IEEE Transactions on Control Systems Technology.