[1] Lucy J. Colwell,et al. Rethinking Attention with Performers , 2020, ICLR.
[2] Feng Xia,et al. Scientific Paper Recommendation: A Survey , 2020, IEEE Access.
[3] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[4] Daniel S. Weld,et al. TLDR: Extreme Summarization of Scientific Documents , 2020, FINDINGS.
[5] Quoc V. Le,et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators , 2020, ICLR.
[6] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[7] Jiuyong Li,et al. A Semantics Aware Random Forest for Text Classification , 2019, CIKM.
[8] Rémi Louf,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[9] Meghan Hupe. EndNote X9 , 2019, Journal of Electronic Resources in Medical Libraries.
[10] Kevin Gimpel,et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations , 2019, ICLR.
[11] Yuval Pinter,et al. Attention is not not Explanation , 2019, EMNLP.
[12] Roger Wattenhofer,et al. On the Validity of Self-Attention as Explanation in Transformer Models , 2019, ArXiv.
[13] Omer Levy,et al. What Does BERT Look at? An Analysis of BERT’s Attention , 2019, BlackboxNLP@ACL.
[14] Alexander A. Alemi,et al. On the Use of ArXiv as a Dataset , 2019, ArXiv.
[15] Byron C. Wallace,et al. Attention is not Explanation , 2019, NAACL.
[16] Lei Wang,et al. CRAN: A Hybrid CNN-RNN Attention-Based Model for Text Classification , 2018, ER.
[17] Tie-Yan Liu,et al. LightGBM: A Highly Efficient Gradient Boosting Decision Tree , 2017, NIPS.
[18] Xia Feng,et al. Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey , 2017, Multimedia Tools and Applications.
[19] Franck Dernoncourt,et al. PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts , 2017, IJCNLP.
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Matthijs Douze,et al. FastText.zip: Compressing text classification models , 2016, ArXiv.
[22] Abhishek Das,et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Tianqi Chen,et al. XGBoost: A Scalable Tree Boosting System , 2016, KDD.
[24] Wojciech Zaremba,et al. An Empirical Exploration of Recurrent Network Architectures , 2015, ICML.
[25] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[26] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[27] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[28] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[29] Yunming Ye,et al. An Improved Random Forest Classifier for Text Categorization , 2012, J. Comput.
[30] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res.
[31] Richard E. West,et al. Mendeley: Creating Communities of Scholarly Inquiry Through Research Collaboration , 2011.
[32] David D. Lewis,et al. Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval , 1998, ECML.
[33] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[34] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986.
[35] J. Hanley,et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.
[36] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[37] Meng Zhang,et al. Neural Network Methods for Natural Language Processing , 2017, Computational Linguistics.
[38] Vangelis Metsis,et al. Spam Filtering with Naive Bayes - Which Naive Bayes? , 2006, CEAS.
[39] Peter Wiemer-Hastings,et al. Latent semantic analysis , 2004, Annu. Rev. Inf. Sci. Technol.
[40] Michael I. Jordan. Serial Order: A Parallel Distributed Processing Approach , 1997.