Collective Noise Contrastive Estimation for Policy Transfer Learning
[1] Wei Chu,et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms , 2010, WSDM '11.
[2] Lihong Li,et al. Learning from Logged Implicit Exploration Data , 2010, NIPS.
[3] Lihong Li,et al. Offline Evaluation and Optimization for Interactive Systems , 2015, WSDM.
[4] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res.
[5] Lihong Li,et al. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study , 2015, WWW.
[6] Aapo Hyvärinen,et al. Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics , 2012, J. Mach. Learn. Res.
[7] Kurt Driessens,et al. Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling , 2007, ECML.
[8] Thorsten Joachims,et al. Playlist prediction via metric embedding , 2012, KDD.
[9] Jun Wang,et al. Interactive collaborative filtering , 2013, CIKM.
[10] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011.
[11] Kristian J. Hammond,et al. Flytrap: intelligent group music recommendation , 2002, IUI '02.
[12] Peter Stone,et al. Cross-domain transfer for reinforcement learning , 2007, ICML '07.
[13] Yong Yu,et al. SVDFeature: a toolkit for feature-based collaborative filtering , 2012, J. Mach. Learn. Res.
[14] Yee Whye Teh,et al. A fast and simple algorithm for training neural probabilistic language models , 2012, ICML.
[15] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[16] W. Bruce Croft,et al. Relevance-Based Language Models , 2001, SIGIR '01.
[17] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[18] Ruslan Salakhutdinov,et al. Probabilistic Matrix Factorization , 2007, NIPS.
[19] John Langford,et al. Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits , 2012, UAI.
[20] Yoshua Bengio,et al. Quick Training of Probabilistic Neural Nets by Importance Sampling , 2003.
[21] Peter Stone,et al. DJ-MC: A Reinforcement-Learning Agent for Music Playlist Recommendation , 2014, AAMAS.
[22] John Langford,et al. Exploration scavenging , 2008, ICML '08.
[23] Yehuda Koren,et al. Matrix Factorization Techniques for Recommender Systems , 2009, Computer.