Paperswithtopic: Topic Identification from Paper Title Only
