Paper information - learning algorithms for changing environments

We study online learning in an oblivious changing environment. The standard measure of regret bounds the difference between the cost of the online learner and the best decision in hindsight. Hence, regret minimizing algorithms tend to converge to the static best optimum, clearly a suboptimal behavior in changing environments. On the other hand, various metrics proposed to strengthen regret and allow for more dynamic algorithms produce inefficient algorithms.

Elad Hazan | C. Seshadhri
