A reinforcement learning approach to personalized learning recommendation systems
Jingchen Liu | Zhiliang Ying | Yunxiao Chen | Xiaoou Li | Xueying Tang
[1] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[2] H. Robbins. A Stochastic Approximation Method , 1951 .
[3] Longxin Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching , 2004, Machine Learning.
[4] Stochastic Approximation Methods for Latent Regression Item Response Models , 2010 .
[5] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[6] Shalabh Bhatnagar,et al. Reinforcement Learning With Function Approximation for Traffic Signal Control , 2011, IEEE Transactions on Intelligent Transportation Systems.
[7] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[8] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[9] R. Bellman,et al. Dynamic Programming and Markov Processes , 1960 .
[10] B. Junker,et al. Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric Item Response Theory , 2001 .
[11] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[12] Susan Murphy,et al. Inference for non-regular parameters in optimal dynamic treatment regimes , 2010, Statistical methods in medical research.
[13] B. Skinner,et al. The Behavior of Organisms: An Experimental Analysis , 2016 .
[14] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[15] Pieter Abbeel,et al. Accelerated Methods for Deep Reinforcement Learning , 2018, ArXiv.
[16] Hua-Hua Chang,et al. From smart testing to smart learning: how testing technology can assist the new generation of education , 2016 .
[17] K. Marti. Stochastic Optimization Methods , 2005 .
[18] M. Reckase. Multidimensional Item Response Theory , 2009 .
[19] B. Junker,et al. Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric IRT , 2001 .
[20] Li Cai,et al. High-Dimensional Exploratory Item Factor Analysis by a Metropolis–Hastings Robbins–Monro Algorithm , 2010 .
[21] Walter L. Leite,et al. Assessing Change in Latent Skills Across Time With Longitudinal Cognitive Diagnosis Modeling: An Evaluation of Model Performance , 2017, Educational and psychological measurement.
[22] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[23] A. Cohen,et al. A Latent Transition Analysis Model for Assessing Change in Cognitive Skills , 2016, Educational and psychological measurement.
[24] Junhui Wang,et al. A Group-Specific Recommender System , 2017 .
[25] K. VanLehn. The Relative Effectiveness of Human Tutoring, Intelligent Tutoring Systems, and Other Tutoring Systems , 2011 .
[26] Yan Yang,et al. Tracking Skill Acquisition With Cognitive Diagnosis Models: A Higher-Order, Hidden Markov Model With Covariates , 2018 .
[27] Kevin D. Glazebrook,et al. Multi-Armed Bandit Allocation Indices , 2011 .
[28] Ding Wang,et al. Feature selection and feature learning for high-dimensional batch reinforcement learning: A survey , 2015, International Journal of Automation and Computing.
[29] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[30] Sean P. Meyn,et al. An analysis of reinforcement learning with function approximation , 2008, ICML '08.
[31] Yuhong Yang,et al. Randomized Allocation with Nonparametric Estimation for a Multi-Armed Bandit Problem with Covariates , 2002 .
[32] Matthias von Davier,et al. A general diagnostic model applied to language testing data. , 2008, The British journal of mathematical and statistical psychology.
[33] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[34] Marc Peter Deisenroth,et al. Deep Reinforcement Learning: A Brief Survey , 2017, IEEE Signal Processing Magazine.
[35] Jingchen Liu,et al. Recommendation System for Adaptive Learning , 2018, Applied psychological measurement.
[36] Francisco S. Melo,et al. Q-Learning with Linear Function Approximation , 2007, COLT.
[37] B. Chakraborty,et al. Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine , 2013 .