Non-Parametric Bayesian Inference of Strategies in Repeated Games
Joshua B. Tenenbaum | Max Kleiman-Weiner | Penghui Zhou
[1] Drew Fudenberg, et al. The Folk Theorem in Repeated Games with Discounting or with Incomplete Information, 1986.
[2] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.
[3] M. Nowak, et al. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game, 1993, Nature.
[4] Frank D. Wood, et al. The sequence memoizer, 2011, Commun. ACM.
[5] Yves Breitmoser. Cooperation, but no reciprocity: Individual strategies in the repeated Prisoner's Dilemma, 2015.
[6] Edith Elkind, et al. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, AAMAS.
[7] G. Spagnolo, et al. Equilibrium Selection in the Repeated Prisoner's Dilemma: Axiomatic Approach and Experimental Evidence, 2011.
[8] M. Nowak, et al. Tit for tat in heterogeneous populations, 1992, Nature.
[9] Guillaume Fréchette, et al. The Evolution of Cooperation in Infinitely Repeated Games: Experimental Evidence, 2011.
[10] D. Fudenberg, et al. The Theory of Learning in Games, 1998.
[11] Fernando Pereira, et al. Weighted finite-state transducers in speech recognition, 2002, Comput. Speech Lang.
[12] David Carmel, et al. Learning Models of Intelligent Agents, 1996, AAAI/IAAI, Vol. 1.
[13] Michael I. Jordan, et al. Hierarchical Dirichlet Processes, 2006.
[14] Joshua B. Tenenbaum, et al. Nonparametric Bayesian Policy Priors for Reinforcement Learning, 2010, NIPS.
[15] Michael I. Jordan, et al. Latent Dirichlet Allocation, 2001, J. Mach. Learn. Res.
[16] Carl E. Rasmussen, et al. Factorial Hidden Markov Models, 1997.
[17] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective, 2012, Adaptive Computation and Machine Learning series.
[18] E. Kalai, et al. Rational Learning Leads to Nash Equilibrium, 1993.
[19] Peter Stone, et al. A polynomial-time Nash equilibrium algorithm for repeated games, 2003, EC '03.
[20] J. Sethuraman. A constructive definition of Dirichlet priors, 1991.
[21] André Kempe. Finite state transducers approximating Hidden Markov Models, 1997.
[22] Piotr J. Gmytrasiewicz, et al. Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs, 2015, AAMAS.
[23] Yee Whye Teh, et al. Beam sampling for the infinite hidden Markov model, 2008, ICML '08.
[24] P. Bó. Cooperation under the Shadow of the Future: Experimental Evidence from Infinitely Repeated Games, 2005.
[25] A. Rubinstein. Finite automata play the repeated prisoner's dilemma, 1986.
[26] S. Gabriel, et al. The Structure of Haplotype Blocks in the Human Genome, 2002, Science.
[27] W. Hamilton, et al. The Evolution of Cooperation, 1984.
[28] Krishnendu Chatterjee, et al. Forgiver Triumphs in Alternating Prisoner's Dilemma, 2013, PLoS ONE.
[29] John C. Trueswell, et al. Proceedings of the 38th Annual Conference of the Cognitive Science Society (pp. 432-437), 2016.
[30] Joshua B. Tenenbaum, et al. Coordinate to cooperate or compete: Abstract goals and joint intentions in social interaction, 2016, CogSci.