Paper information - Learning deep structured semantic models for web search using clickthrough data

Latent semantic models, such as LSA, aim to map a query to its relevant documents at the semantic level, where keyword-based matching often fails. In this study we develop a series of new latent semantic models with a deep structure that project queries and documents into a common low-dimensional space, where the relevance of a document given a query is readily computed as the distance between them. The proposed deep structured semantic models are discriminatively trained by maximizing the conditional likelihood of the clicked documents given a query, using clickthrough data. To make the models applicable to large-scale Web search, we also use a technique called word hashing, which is shown to scale the models effectively to the large vocabularies common in such tasks. The new models are evaluated on a Web document ranking task using a real-world data set. Results show that our best model significantly outperforms other latent semantic models that were considered state of the art prior to this work.
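
The abstract names two mechanisms without spelling them out: word hashing, and training by maximizing the conditional likelihood of clicked documents. The following is a minimal sketch of both, not the authors' implementation. It assumes word hashing means breaking each word into letter trigrams after adding `#` boundary marks, and that the click likelihood is a softmax over scaled cosine similarities. The deep network that maps hashed vectors into the low-dimensional space is omitted, so similarity here is computed directly on trigram counts. All function names and the smoothing factor `gamma=10.0` are illustrative assumptions.

```python
import math
from collections import Counter


def letter_trigrams(word):
    """Split a word into letter trigrams after adding '#' boundary marks."""
    marked = f"#{word.lower()}#"
    return [marked[i:i + 3] for i in range(len(marked) - 2)]


def hash_text(text):
    """Represent a whitespace-tokenized text as a bag of letter trigrams."""
    counts = Counter()
    for word in text.split():
        counts.update(letter_trigrams(word))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def click_posterior(query_vec, doc_vecs, gamma=10.0):
    """Softmax over scaled similarities: an estimate of P(D | Q) per candidate.

    gamma is a hypothetical smoothing factor chosen for illustration.
    """
    scores = [gamma * cosine(query_vec, d) for d in doc_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(query_vec, doc_vecs, clicked_index, gamma=10.0):
    """Loss minimized in training: -log P(clicked document | query)."""
    return -math.log(click_posterior(query_vec, doc_vecs, gamma)[clicked_index])


print(letter_trigrams("good"))  # ['#go', 'goo', 'ood', 'od#']
```

In a full model, `query_vec` and each entry of `doc_vecs` would be the outputs of the trained network rather than raw trigram counts, and the loss would be minimized over the network weights. The appeal of trigram hashing is that the input dimension is bounded by the number of distinct letter trigrams rather than by the vocabulary size.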
