Re-evaluating evaluation
Thore Graepel | Karl Tuyls | David Balduzzi | Julien Pérolat
[1] Shane Legg,et al. Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents , 2018, ArXiv.
[2] José Hernández-Orallo,et al. Evaluation in artificial intelligence: from task-oriented to ability-oriented measurement , 2017, Artificial Intelligence Review.
[3] William H. Sandholm,et al. Population Games And Evolutionary Dynamics , 2010, Economic learning and social evolution.
[4] Selmer Bringsjord,et al. Psychometric artificial intelligence , 2011, J. Exp. Theor. Artif. Intell.
[5] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[6] P. Diaconis. Group representations in probability and statistics , 1988 .
[7] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[8] E. Rowland. Theory of Games and Economic Behavior , 1946, Nature.
[9] Wm. R. Wright. General Intelligence, Objectively Determined and Measured. , 1905 .
[10] Thomas Hofmann,et al. TrueSkill™: A Bayesian Skill Rating System , 2007 .
[11] Shane Legg,et al. A Universal Measure of Intelligence for Artificial Agents , 2005, IJCAI.
[12] Thore Graepel,et al. The Mechanics of n-Player Differentiable Games , 2018, ICML.
[13] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[14] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[15] Marcus Frean,et al. Rock–scissors–paper and the survival of the weakest , 2001, Proceedings of the Royal Society of London. Series B: Biological Sciences.
[16] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[17] J. Hernández-Orallo,et al. AI results for the Atari 2600 games : difficulty and discrimination using IRT , 2017 .
[18] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[19] José Hernández-Orallo,et al. An experimental comparison of performance measures for classification , 2009, Pattern Recognit. Lett.
[20] Shane Legg,et al. DeepMind Lab , 2016, ArXiv.
[21] M. Feldman,et al. Local dispersal promotes biodiversity in a real-life game of rock–paper–scissors , 2002, Nature.
[22] Robert A. Laird,et al. Competitive Intransitivity Promotes Species Coexistence , 2006, The American Naturalist.
[23] Michael P. Wellman. Methods for Empirical Game-Theoretic Analysis , 2006, AAAI.
[24] Luis E. Ortiz,et al. Maximum Entropy Correlated Equilibria , 2007, AISTATS.
[25] Tony Jebara,et al. A Kernel Between Sets of Vectors , 2003, ICML.
[26] Robert E. Schapire,et al. Instance-dependent Regret Bounds for Dueling Bandits , 2016, COLT.
[27] Randal S. Olson,et al. PMLB: a large benchmark suite for machine learning evaluation and comparison , 2017, BioData Mining.
[28] I. Kondor,et al. Group theoretical methods in machine learning , 2008 .
[29] Karl Tuyls,et al. Evolutionary Dynamics of Multi-Agent Learning: A Survey , 2015, J. Artif. Intell. Res.
[30] Pushmeet Kohli,et al. Adversarial Risk and the Dangers of Evaluating Against Weak Attacks , 2018, ICML.
[31] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[32] David Silver,et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning , 2017, NIPS.
[33] Alexander J. Smola,et al. Deep Sets , 2017, 1703.06114.
[34] David Silver,et al. Learning values across many orders of magnitude , 2016, NIPS.
[35] Julian Togelius,et al. A comparative evaluation of procedural level generators in the Mario AI framework , 2014, FDG.
[36] Simon M. Lucas,et al. Evolving mario levels in the latent space of a deep convolutional generative adversarial network , 2018, GECCO.
[37] Jan Ramon,et al. An evolutionary game-theoretic analysis of poker strategies , 2009, Entertain. Comput.
[38] Kevin Leyton-Brown,et al. Deep Models of Interactions Across Sets , 2018, ICML.
[39] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res.
[40] Asuman E. Ozdaglar,et al. Near-Potential Games: Geometry and Dynamics , 2013, TEAC.
[41] Asuman E. Ozdaglar,et al. Flows and Decompositions of Games: Harmonic and Potential Games , 2010, Math. Oper. Res.
[42] Yao Zhao,et al. Adversarial Attacks and Defences Competition , 2018, ArXiv.
[43] D. Meyer,et al. Evidence for a Collective Intelligence Factor in the Performance of Human Groups , 2022 .
[44] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[45] Michael P. Wellman,et al. Empirical game-theoretic analysis of the TAC Supply Chain game , 2007, AAMAS '07.
[46] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[47] Julian Togelius,et al. General Video Game Evaluation Using Relative Algorithm Performance Profiles , 2015, EvoApplications.
[48] Julian Togelius,et al. Towards a Generic Method of Evaluating Game Levels , 2013, AIIDE.
[49] Alex M. Andrew,et al. Boosting: Foundations and Algorithms , 2012 .
[50] Peter A. Flach,et al. A Unified View of Performance Metrics: Translating Threshold Choice into Expected Classification Loss , 2012 .
[51] J. Nash. Equilibrium Points in N-Person Games. , 1950, Proceedings of the National Academy of Sciences of the United States of America.
[52] Elizabeth Sklar,et al. Auctions, Evolution, and Multi-agent Learning , 2007, Adaptive Agents and Multi-Agents Systems.
[53] R. Vandenberg,et al. A Review and Synthesis of the Measurement Invariance Literature: Suggestions, Practices, and Recommendations for Organizational Research , 2000 .
[54] Yuan Yao,et al. Statistical ranking and combinatorial Hodge theory , 2008, Math. Program.
[55] Zhen Lin,et al. Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network , 2018, NeurIPS.
[56] A. Kolmogorov. Three approaches to the quantitative definition of information , 1968 .
[57] R. Hambleton,et al. Fundamentals of Item Response Theory , 1991 .
[58] Shane Legg,et al. An Approximation of the Universal Intelligence Measure , 2011, Algorithmic Probability and Friends.
[59] Peter McBurney,et al. An evolutionary game-theoretic comparison of two double-auction market designs , 2004, AAMAS'04.
[60] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part II , 1964, Inf. Control.
[61] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[62] Joel Z. Leibo,et al. A Generalised Method for Empirical Game Theoretic Analysis , 2018, AAMAS.
[63] D. Hunter. MM algorithms for generalized Bradley-Terry models , 2003 .
[64] Julian Togelius,et al. AI-based playtesting of contemporary board games , 2017, FDG.
[65] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[66] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res.
[67] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[68] Tom Minka,et al. TrueSkill™: A Bayesian Skill Rating System , 2006, NIPS.
[69] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[70] Marc G. Bellemare,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[71] Dan Boneh,et al. Ensemble Adversarial Training: Attacks and Defenses , 2017, ICLR.
[72] Michael P. Wellman,et al. Practical Strategic Reasoning with Applications in Market Games , 2010 .
[73] William H. Sandholm,et al. ON THE GLOBAL CONVERGENCE OF STOCHASTIC FICTITIOUS PLAY , 2002 .
[74] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[75] Attila Szolnoki,et al. Cyclic dominance in evolutionary games: a review , 2014, Journal of The Royal Society Interface.
[76] Rajarshi Das,et al. Choosing Samples to Compute Heuristic-Strategy Nash Equilibrium , 2003, AMEC.
[77] José Hernández-Orallo,et al. The Measure of All Minds: Evaluating Natural and Artificial Intelligence , 2017 .
[78] Asuman E. Ozdaglar,et al. Dynamics in near-potential games , 2011, Games Econ. Behav.
[79] D. Donoho. 50 Years of Data Science , 2017 .
[80] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[81] Katja Hofmann,et al. Contextual Dueling Bandits , 2015, COLT.