Stochastic dilemmas: foundations and applications
[1] L. Addario-Berry,et al. Ballot Theorems, Old and New , 2008 .
[2] William Feller,et al. An Introduction to Probability Theory and Its Applications , 1967 .
[3] Peter L. Bartlett,et al. Neural Network Learning - Theoretical Foundations , 1999 .
[4] Panagiotis G. Ipeirotis,et al. Quality management on Amazon Mechanical Turk , 2010, HCOMP '10.
[5] Inc. Alias-i. Multilevel Bayesian Models of Categorical Data Annotation , 2008 .
[6] Alex M. Andrew,et al. Reinforcement Learning: An Introduction , 1998 .
[7] David H. Ackley,et al. The effects of selection on noisy fitness optimization , 2011, GECCO '11.
[8] Csaba Szepesvári,et al. Empirical Bernstein stopping , 2008, ICML '08.
[9] András Lörincz,et al. Learning Tetris Using the Noisy Cross-Entropy Method , 2006, Neural Computation.
[10] H. D. Miller. Combinatorial methods in the theory of stochastic processes , 1968, Comput. J..
[11] Julian Togelius,et al. The 2009 Mario AI Competition , 2010, IEEE Congress on Evolutionary Computation.
[12] O. Kallenberg. Ballot theorems and Sojourn laws for stationary processes , 1999 .
[13] Michael D. Vose,et al. The simple genetic algorithm - foundations and theory , 1999, Complex adaptive systems.
[14] R. Rubinstein. The Cross-Entropy Method for Combinatorial and Continuous Optimization , 1999 .
[15] Rémi Munos,et al. Open Loop Optimistic Planning , 2010, COLT.
[16] Balázs Kégl,et al. Surrogating the surrogate: accelerating Gaussian-process-based global optimization with a mixture cross-entropy algorithm , 2010, ICML.
[17] Javier R. Movellan,et al. Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise , 2009, NIPS.
[18] Steven I. Marcus,et al. Simulation-based Algorithms for Markov Decision Processes , 2013 .
[19] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[20] Anne Auger,et al. Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions , 2009 .
[21] Csaba Szepesvári,et al. Online Optimization in X-Armed Bandits , 2008, NIPS.
[22] Shimon Whiteson,et al. The Reinforcement Learning Competitions , 2010 .
[23] Dirk P. Kroese,et al. Application of the Cross-Entropy Method to the Buffer Allocation Problem in a Simulation-Based Environment , 2005, Ann. Oper. Res..
[24] Eric Horvitz,et al. Combining human and machine intelligence in large-scale crowdsourcing , 2012, AAMAS.
[25] Michael L. Littman,et al. The Cross-Entropy Method Optimizes for Quantiles , 2013, ICML.
[26] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces , 2008 .
[27] Andrew W. Moore,et al. The Racing Algorithm: Model Selection for Lazy Learners , 1997, Artificial Intelligence Review.
[28] Christian Igel,et al. Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search , 2009, ICML '09.
[29] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[30] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[31] Panagiotis G. Ipeirotis,et al. Get another label? improving data quality and data mining using multiple, noisy labelers , 2008, KDD.
[32] Shie Mannor,et al. The cross entropy method for classification , 2005, ICML.
[33] J. Fitzpatrick,et al. Genetic Algorithms in Noisy Environments , 2005, Machine Learning.
[34] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[35] C. Lintott,et al. Galaxy Zoo 2: detailed morphological classifications for 304,122 galaxies from the Sloan Digital Sky Survey , 2013, 1308.3496.
[36] A. P. Dawid,et al. Maximum Likelihood Estimation of Observer Error‐Rates Using the EM Algorithm , 1979 .
[37] Umesh V. Vazirani,et al. An Introduction to Computational Learning Theory , 1994 .
[38] Jaime G. Carbonell,et al. Proactive learning: cost-sensitive active learning with multiple imperfect oracles , 2008, CIKM '08.
[39] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[40] Rémi Munos,et al. Algorithms for Infinitely Many-Armed Bandits , 2008, NIPS.
[41] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[42] Michael I. Jordan,et al. Bayesian Bias Mitigation for Crowdsourcing , 2011, NIPS.
[43] Ran Canetti,et al. Lower Bounds for Sampling Algorithms for Estimating the Average , 1995, Inf. Process. Lett..
[44] Gerardo Hermosillo,et al. Learning From Crowds , 2010, J. Mach. Learn. Res..
[45] Peng Dai,et al. Artificial Intelligence for Artificial Artificial Intelligence , 2011, AAAI.
[46] Michael L. Littman,et al. Planning in Reward-Rich Domains via PAC Bandits , 2012, EWRL.
[47] Andre Cohen,et al. An object-oriented representation for efficient reinforcement learning , 2008, ICML '08.
[48] Lih-Yuan Deng,et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning , 2006, Technometrics.
[49] H. Robbins,et al. Iterated logarithm inequalities. , 1967, Proceedings of the National Academy of Sciences of the United States of America.
[50] Peter Stone,et al. An empirical analysis of value function-based and policy search reinforcement learning , 2009, AAMAS.
[51] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[52] Shie Mannor,et al. The Cross Entropy Method for Fast Policy Search , 2003, ICML.
[53] H. Robbins,et al. Inequalities for the sequence of sample means. , 1967, Proceedings of the National Academy of Sciences of the United States of America.
[54] Dirk P. Kroese,et al. Convergence properties of the cross-entropy method for discrete optimization , 2007, Oper. Res. Lett..
[55] Shai Ben-David,et al. Learning Distributions by Their Density Levels: A Paradigm for Learning without a Teacher , 1997, J. Comput. Syst. Sci..
[56] S. Ioffe,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1996 .
[57] Hans Ulrich Simon,et al. General bounds on the number of examples needed for learning probabilistic concepts , 1993, COLT '93.
[58] David E. Goldberg,et al. Genetic Algorithms, Tournament Selection, and the Effects of Noise , 1995, Complex Syst..
[59] John N. Tsitsiklis,et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res..
[60] Brendan T. O'Connor,et al. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks , 2008, EMNLP.
[61] Pietro Perona,et al. Inferring Ground Truth from Subjective Labelling of Venus Images , 1994, NIPS.
[62] H. Robbins. Statistical Methods Related to the Law of the Iterated Logarithm , 1970 .
[63] Jaime G. Carbonell,et al. Efficiently learning the accuracy of labeling sources for selective sampling , 2009, KDD.
[64] Mark W. Schmidt,et al. Modeling annotator expertise: Learning when everybody knows a bit of something , 2010, AISTATS.
[65] Dimitri P. Bertsekas,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1997 .
[66] J. Carbonell,et al. Adaptive Proactive Learning with Cost-Reliability Tradeoff , 2009 .
[67] Marin Kobilarov,et al. Cross-Entropy Randomized Motion Planning , 2011, Robotics: Science and Systems.
[68] Jonathan Baxter,et al. A Model of Inductive Bias Learning , 2000, J. Artif. Intell. Res..
[69] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[70] Reuven Y. Rubinstein,et al. Optimization of computer simulation models with rare events , 1997 .
[71] L. Margolin,et al. On the Convergence of the Cross-Entropy Method , 2005, Ann. Oper. Res..