Infomax Strategies for an Optimal Balance Between Exploration and Exploitation
[1] Akimichi Takemura,et al. An Asymptotically Optimal Bandit Algorithm for Bounded Support Models , 2010, COLT.
[2] John L. Kelly,et al. A new interpretation of information rate , 1956, IRE Trans. Inf. Theory.
[3] Sreekanth H. Chalasani,et al. Information theory of adaptation in neurons, behavior, and mood , 2014, Current Opinion in Neurobiology.
[4] Ronald A. Howard,et al. Information Value Theory , 1966, IEEE Trans. Syst. Sci. Cybern.
[5] Sang Joon Kim,et al. A Mathematical Theory of Communication , 2006 .
[6] Jeremy Wyatt,et al. Exploration and inference in learning from reinforcement , 1998 .
[7] Terrence J. Sejnowski,et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.
[8] Kevin D. Glazebrook,et al. Multi-Armed Bandit Allocation Indices , 2011 .
[9] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[10] T. Lai. Adaptive treatment allocation and the multi-armed bandit problem , 1987 .
[11] Apostolos Burnetas,et al. Optimal Adaptive Policies for Markov Decision Processes , 1997, Math. Oper. Res..
[12] W. Bialek,et al. Information flow and optimization in transcriptional regulation , 2007, Proceedings of the National Academy of Sciences.
[13] Massimo Vergassola,et al. ‘Infotaxis’ as a strategy for searching without gradients , 2007, Nature.
[14] Aleksandra M Walczak,et al. Information transmission in genetic regulatory networks: a review , 2011, Journal of physics. Condensed matter : an Institute of Physics journal.
[15] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[16] Ralph Linsker,et al. Self-organization in a perceptual network , 1988, Computer.
[17] D. Gillespie. Exact Stochastic Simulation of Coupled Chemical Reactions , 1977 .
[18] W. Press,et al. Numerical Recipes in C++: The Art of Scientific Computing (2nd edn); Numerical Recipes Example Book (C++) (2nd edn); Numerical Recipes Multi-Language Code CD ROM with LINUX or UNIX Single-Screen License, Revised Version , 2003 .
[19] E. Siggia,et al. Predicting embryonic patterning using mutual entropy fitness and in silico evolution , 2010, Development.
[20] R. Munos,et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation , 2012, 1210.1136.
[21] L. Goddard. Information Theory , 1962, Nature.
[22] William Bialek,et al. Spikes: Exploring the Neural Code , 1996 .
[23] W. Bialek. Biophysics: Searching for Principles , 2012 .
[24] T. Toffoli. Physics and computation , 1982 .
[25] Djallel Bouneffouf,et al. Finite-time analysis of the multi-armed bandit problem with known trend , 2016, 2016 IEEE Congress on Evolutionary Computation (CEC).
[26] Bruno A. Olshausen,et al. Book Review , 2003, Journal of Cognitive Neuroscience.
[27] Andrew R. Barron,et al. A bound on the financial value of information , 1988, IEEE Trans. Inf. Theory.
[28] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[29] Carl T. Bergstrom,et al. The fitness value of information , 2005, Oikos.
[30] Daniel Polani,et al. Information Theory of Decisions and Actions , 2011 .
[31] S. Leibler,et al. Phenotypic Diversity, Population Growth, and Information in Fluctuating Environments , 2005, Science.
[32] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[33] Peter Dayan,et al. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems , 2001 .
[34] I. Nemenman,et al. Information Transduction Capacity of Noisy Biochemical Signaling Networks , 2011, Science.
[35] L. C. Thomas,et al. Optimization over Time. Dynamic Programming and Stochastic Control. Volume 1 , 1983 .
[36] R. Gallager. Information Theory and Reliable Communication , 1968 .
[37] Peter Harremoës,et al. Rényi Divergence and Kullback-Leibler Divergence , 2012, IEEE Transactions on Information Theory.
[38] M. Mézard,et al. Information, Physics, and Computation , 2009 .
[39] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[40] H. B. Barlow,et al. Possible Principles Underlying the Transformations of Sensory Messages , 2012 .
[41] Chris Wiggins,et al. ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context , 2004, BMC Bioinformatics.
[42] Stanislas Leibler,et al. The Value of Information for Populations in Varying Environments , 2010, ArXiv.
[43] T. Lai,et al. Optimal stopping and dynamic allocation , 1987, Advances in Applied Probability.
[44] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[45] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[46] Ilya Nemenman,et al. Information theory and adaptation , 2010, 1011.5466.
[47] S. Laughlin. The role of sensory adaptation in the retina , 1989, The Journal of experimental biology.
[48] Joseph J. Atick,et al. What Does the Retina Know about Natural Scenes? , 1992, Neural Computation.
[49] Rémi Munos,et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis , 2012, ALT.
[50] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[51] Carl T. Bergstrom,et al. Shannon information and biological fitness , 2004, Information Theory Workshop.