Online learning and optimization of Markov jump linear models
[1] Arnaud Doucet,et al. Particle filters for state estimation of jump Markov linear systems , 2001, IEEE Trans. Signal Process..
[2] Assaf J. Zeevi,et al. Dynamic Pricing with an Unknown Demand Model: Asymptotically Optimal Semi-Myopic Policies , 2014, Oper. Res..
[3] John N. Tsitsiklis,et al. Linearly Parameterized Bandits , 2008, Math. Oper. Res..
[4] A. Zeevi,et al. Non-Stationary Stochastic Optimization , 2014 .
[5] T. Lai. Asymptotically efficient adaptive control in stochastic regression models , 1986 .
[6] Lang Tong,et al. Retail pricing for stochastic demand with unknown parameters: An online machine learning approach , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[7] S. Boyd,et al. Pricing and learning with uncertain demand , 2003 .
[8] Vikram Krishnamurthy,et al. Expectation maximization algorithms for MAP estimation of jump Markov linear systems , 1999, IEEE Trans. Signal Process..
[9] T. W. Anderson,et al. Some Experimental Results on the Statistical Properties of Least Squares Estimates in Control Problems , 1976 .
[10] Josef Broder,et al. Dynamic Pricing Under a General Parametric Choice Model , 2012, Oper. Res..
[11] J. Spall. Implementation of the simultaneous perturbation algorithm for stochastic optimization , 1998 .
[12] R. Gill,et al. Applications of the van Trees inequality: a Bayesian Cramér-Rao bound , 1995 .
[13] D. Bertsimas,et al. Working Paper , 2022 .
[14] H. Robbins,et al. Iterated least squares in multiperiod control , 1982 .
[15] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem , 2004, NIPS.
[16] Eric W. Cope,et al. Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems , 2009, IEEE Transactions on Automatic Control.
[17] Frank Thomson Leighton,et al. The value of knowing a demand curve: bounds on regret for online posted-price auctions , 2003, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings..
[18] T. Lai,et al. Asymptotically efficient self-tuning regulators , 1987 .
[19] J. Spall. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , 1992 .
[20] Yossi Aviv,et al. A Partially Observed Markov Decision Process for Dynamic Pricing , 2005, Manag. Sci..
[21] Gang George Yin,et al. How does a stochastic optimization/approximation algorithm adapt to a randomly evolving optimum/root with jump Markov sample paths , 2009, Math. Program..
[22] R. Agrawal. The Continuum-Armed Bandit Problem , 1995 .
[23] Omar Besbes,et al. Online Companion: Non-stationary Stochastic Optimization , 2015 .
[24] R. P. Marques,et al. Discrete-Time Markov Jump Linear Systems , 2004, IEEE Transactions on Automatic Control.
[25] H. Robbins,et al. A Convergence Theorem for Non Negative Almost Supermartingales and Some Applications , 1985 .
[26] Assaf J. Zeevi,et al. Chasing Demand: Learning and Earning in a Changing Environment , 2016, Math. Oper. Res..
[27] Harry L. Van Trees,et al. Detection, Estimation, and Modulation Theory, Part I , 1968 .
[28] H. Robbins,et al. Adaptive Design and Stochastic Approximation , 1979 .
[29] Björn Wittenmark,et al. On Self Tuning Regulators , 1973 .
[30] Bert Zwart,et al. Simultaneously Learning and Optimizing Using Controlled Variance Pricing , 2014, Manag. Sci..
[31] Ronald J. Balvers,et al. Actively Learning about Demand and the Dynamics of Price Adjustment , 1990 .