Wang Ling | Chris Dyer | Phil Blunsom | Dani Yogatama
[1] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res..
[2] Tomas Mikolov, et al. Bag of Tricks for Efficient Text Classification, 2016, EACL.
[3] Hermann Ney, et al. On the Probabilistic Interpretation of Neural Network Classifiers and Discriminative Training Criteria, 1995, IEEE Trans. Pattern Anal. Mach. Intell..
[4] Yoshua Bengio, et al. On Using Very Large Target Vocabulary for Neural Machine Translation, 2014, ACL.
[5] Trevor J. Hastie, et al. Discriminative vs Informative Learning, 1997, KDD.
[6] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[7] Yann LeCun, et al. Very Deep Convolutional Networks for Text Classification, 2016, EACL.
[8] Michael I. Jordan, et al. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes, 2001, NIPS.
[9] Kyunghyun Cho, et al. Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers, 2016, ArXiv.
[10] Yoshua Bengio, et al. Hierarchical Probabilistic Neural Network Language Model, 2005, AISTATS.
[11] Yee Whye Teh, et al. A fast and simple algorithm for training neural probabilistic language models, 2012, ICML.
[12] Razvan Pascanu, et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.
[13] Xiang Zhang, et al. Character-level Convolutional Networks for Text Classification, 2015, NIPS.
[14] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[15] Chrisantha Fernando, et al. PathNet: Evolution Channels Gradient Descent in Super Neural Networks, 2017, ArXiv.
[16] Michalis K. Titsias, et al. One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities, 2016, NIPS.
[17] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[18] Dale Schuurmans, et al. Combining Naive Bayes and n-Gram Language Models for Text Classification, 2003, ECIR.