Web scale NLP: a case study on URL word breaking
[1] Xiaolong Li, et al. An Overview of Microsoft Web N-gram Corpus and Applications, 2010, NAACL.
[2] Michele Banko, et al. Mitigating the Paucity-of-Data Problem: Exploring the Effect of Training Corpus Size on Classifier Performance for Natural Language Processing, 2001, HLT.
[3] Jianfeng Gao, et al. Exploring web scale language models for search query processing, 2010, WWW '10.
[4] Ralf D. Brown. Corpus-driven splitting of compound words, 2002, TMI.
[5] Philipp Koehn, et al. Empirical Methods for Compound Splitting, 2003, EACL.
[6] Min-Yen Kan, et al. Fast webpage classification using URL features, 2005, CIKM '05.
[7] Anand Venkataraman, et al. A Statistical Model for Word Discovery in Transcribed Speech, 2001, CL.
[8] Stephen E. Robertson, et al. Relevance weighting for query independent evidence, 2005, SIGIR '05.
[9] Andrew Lim, et al. Word segmentation and recognition for web document framework, 1999, CIKM '99.
[10] Maarten de Rijke, et al. Shallow Morphological Analysis in Monolingual Information Retrieval for Dutch, German, and Italian, 2001, CLEF.
[11] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.
[12] Martha Larson, et al. Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches, 2000, INTERSPEECH.
[13] E. Jaynes. Information Theory and Statistical Mechanics, 1957.
[14] Franco Salvetti, et al. Weblog Classification for Fast Splog Filtering: A URL Language Model Segmentation Approach, 2006, NAACL.
[15] Haifeng Wang, et al. Discriminative Pruning of Language Models for Chinese Word Segmentation, 2006, ACL.
[16] Hermann Ney, et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.
[17] Sanjeet Khaitan, et al. Data-driven compound splitting method for English compounds in domain names, 2009, CIKM.
[18] Rutger van Haasteren, et al. Gibbs Sampling, 2010, Encyclopedia of Machine Learning.
[19] Michael R. Brent, et al. An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery, 1999, Machine Learning.
[20] Chris Brockett, et al. Using a Broad-Coverage Parser for Word-Breaking in Japanese, 2000, COLING.
[21] Kuansan Wang, et al. PSkip: estimating relevance ranking quality from web search clickthrough data, 2009, KDD.
[22] Jianfeng Gao, et al. Multi-style language model for web scale information retrieval, 2010, SIGIR '10.
[23] Thomas L. Griffiths, et al. Contextual Dependencies in Unsupervised Word Segmentation, 2006, ACL.
[24] Wei-Ying Ma, et al. Exploring URL Hit Priors for Web Search, 2006, ECIR.
[25] Enrique Alfonseca, et al. Decompounding query keywords from compounding languages, 2008, ACL.
[26] A. Gelfand, et al. Identifiability, Improper Priors, and Gibbs Sampling for Generalized Linear Models, 1999.