End-to-end Learning for Short Text Expansion

Effectively making sense of short texts is critical for many real-world applications such as search engines, social media services, and recommender systems. The task is particularly challenging because a short text carries very sparse information, often too sparse for a machine learning algorithm to pick up useful signals. A common practice is therefore to first expand the short text with external information, usually harvested from a large collection of longer texts. In the literature, short text expansion has been done with a variety of heuristics. We instead propose an end-to-end solution that automatically learns how to expand a short text so as to optimize a given learning task. A novel deep memory network automatically finds relevant information in a collection of longer documents and reformulates the short text through a gating mechanism. Using short text classification as a demonstration task, we show in comprehensive experiments on real-world data sets that the deep memory network significantly outperforms classical text expansion methods.
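The abstract describes two components: attending over a memory of long-document representations to retrieve relevant content, and a gate that decides how much retrieved content to mix into the short text's representation. The paper's exact architecture is not reproduced here; the following is a minimal, illustrative sketch of one such memory-read-and-gate step, assuming fixed vector embeddings and a hypothetical `gate_weight` parameter (in the learned model, all such parameters would be trained end-to-end on the downstream task).

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def expand_short_text(query, memory, gate_weight):
    """One memory 'hop': attend over long-document vectors, then gate
    between the original short-text vector and the retrieved content."""
    # Attention: relevance of each long document to the short text.
    weights = softmax([dot(query, m) for m in memory])
    # Read: attention-weighted sum of memory vectors.
    read = [sum(w * m[i] for w, m in zip(weights, memory))
            for i in range(len(query))]
    # Gate: scalar in (0, 1) controlling how much retrieved content to use.
    g = 1.0 / (1.0 + math.exp(-(dot(gate_weight, query) + dot(gate_weight, read))))
    expanded = [g * q_i + (1.0 - g) * r_i for q_i, r_i in zip(query, read)]
    return expanded, weights, g

# Toy example: 3-dimensional embeddings, two long documents in memory.
query = [1.0, 0.0, 0.5]
memory = [[0.9, 0.1, 0.4],   # similar to the query
          [0.0, 1.0, 0.0]]   # unrelated to the query
gate_weight = [0.5, 0.5, 0.5]
expanded, weights, g = expand_short_text(query, memory, gate_weight)
```

In this sketch the first memory slot, being more similar to the query, receives the larger attention weight, and the gate interpolates between keeping the original query and replacing it with retrieved content; stacking several such hops yields the multi-layer memory network described above.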
