[1] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[2] Hermann Ney,et al. LSTM Neural Networks for Language Modeling , 2012, INTERSPEECH.
[3] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 1982 .
[4] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[5] Nick Montfort,et al. Twisty Little Passages: An Approach to Interactive Fiction , 2003 .
[6] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[7] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[8] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[9] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[10] Rudolf Kadlec,et al. Embracing data abundance: BookTest Dataset for Reading Comprehension , 2016, ICLR.
[11] Regina Barzilay,et al. Language Understanding for Text-based Games using Deep Reinforcement Learning , 2015, EMNLP.
[12] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[13] Rabab Kreidieh Ward,et al. Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[15] Anna-Lan Huang,et al. Similarity Measures for Text Document Clustering , 2008 .
[16] Regina Barzilay,et al. Learning to Win by Reading Manuals in a Monte-Carlo Framework , 2011, ACL.
[17] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain , 1958, Psychological Review.
[18] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[19] Geoffrey Zweig,et al. Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning , 2017, ACL.
[20] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[21] James H. Martin,et al. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition , 2000 .
[22] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[23] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[24] Paul J. Werbos,et al. Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.
[25] Yann LeCun,et al. Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..
[26] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[27] Jianfeng Gao,et al. Deep Reinforcement Learning with an Unbounded Action Space , 2015, ArXiv.