[1] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[2] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res.
[3] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[4] Guy Shani,et al. An MDP-Based Recommender System , 2002, J. Mach. Learn. Res.
[5] Raymond J. Mooney,et al. Content-boosted collaborative filtering for improved recommendations , 2002, AAAI/IAAI.
[6] Qiang Yang,et al. One-Class Collaborative Filtering , 2008, 2008 Eighth IEEE International Conference on Data Mining.
[7] Shou-De Lin,et al. A Linear Ensemble of Individual and Blended Models for Music Rating Prediction , 2012, KDD Cup.
[8] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[9] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[10] Yisong Yue,et al. Hierarchical Exploration for Accelerating Contextual Bandits , 2012, ICML.
[11] Milos Hauskrecht,et al. Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res.
[12] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[13] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[14] Phillipp Bergmann. Dynamic Programming Deterministic And Stochastic Models , 2016 .
[15] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res.
[16] F. Maxwell Harper,et al. The MovieLens Datasets: History and Context , 2016, TIIS.
[17] Zoubin Ghahramani,et al. Probabilistic Matrix Factorization with Non-random Missing Data , 2014, ICML.
[18] Gediminas Adomavicius,et al. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions , 2005, IEEE Transactions on Knowledge and Data Engineering.
[19] Ruslan Salakhutdinov,et al. Probabilistic Matrix Factorization , 2007, NIPS.
[20] Yehuda Koren,et al. Collaborative filtering with temporal dynamics , 2009, KDD.
[21] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[22] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[23] Param Vir Singh,et al. A Hidden Markov Model for Collaborative Filtering , 2010, MIS Q.
[24] Peter I. Frazier,et al. Exploration vs. Exploitation in the Information Filtering Problem , 2014, ArXiv.
[25] James Bennett,et al. The Netflix Prize , 2007 .
[26] Yehuda Koren,et al. The Yahoo! Music Dataset and KDD-Cup '11 , 2012, KDD Cup.
[27] Steffen Rendle,et al. Factorization Machines , 2010, 2010 IEEE International Conference on Data Mining.
[28] Richard S. Zemel,et al. Collaborative prediction and ranking with non-random missing data , 2009, RecSys '09.
[29] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[31] Chih-Jen Lin,et al. A fast parallel SGD for matrix factorization in shared memory systems , 2013, RecSys.