Towards optimization of a human-inspired heuristic for solving explore-exploit problems
Naomi Ehrich Leonard | Philip Holmes | Paul B. Reverdy | Robert C. Wilson
[1] Han-Lim Choi,et al. A multi-UAV targeting algorithm for ensemble forecast improvement , 2007 .
[2] Warren B. Powell,et al. Approximate Dynamic Programming: Solving the Curses of Dimensionality , 2007, Wiley Series in Probability and Statistics.
[4] Andrea Nedic. Models for Individual Decision-Making with Social Feedback , 2011 .
[5] Han-Lim Choi,et al. Adaptive sampling and forecasting with mobile sensor networks , 2009 .
[6] Sailes K. Sengupta. Fundamentals of Statistical Signal Processing: Estimation Theory , 1995 .
[7] Andrew M. Saxe,et al. Acquisition of decision making criteria: reward rate ultimately beats accuracy , 2011, Attention, Perception & Psychophysics.
[8] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn.
[9] Warren B. Powell,et al. Approximate Dynamic Programming I: Modeling , 2011 .
[10] Jonathan D. Cohen,et al. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance , 2005, Annual Review of Neuroscience.
[12] Angela J. Yu,et al. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.
[13] H. Vincent Poor,et al. An Introduction to Signal Detection and Estimation , 1994, Springer Texts in Electrical Engineering.
[14] H. Vincent Poor,et al. An introduction to signal detection and estimation (2nd ed.) , 1994 .
[15] Naomi Ehrich Leonard,et al. Collective Motion, Sensor Networks, and Ocean Sampling , 2007, Proceedings of the IEEE.