CloudSpeller: query spelling correction by using a unified hidden markov model with web-scale resources
Query spelling correction is an important component of modern search engines: it helps users express an information need more accurately and thus improves search quality. In this work we propose and implement an end-to-end spelling correction system, CloudSpeller. CloudSpeller uses a Hidden Markov Model to model the major types of spelling errors in a unified framework, which integrates a large-scale lexicon constructed from Wikipedia, an error model trained on high-confidence correction pairs, and the Microsoft Web N-gram service. The system reaches F1 scores of 0.960 on the TREC dataset and 0.937 on the MSN dataset, two search query spelling correction datasets.
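As a rough illustration of the general idea (not the CloudSpeller implementation itself), the sketch below decodes a query with a toy HMM: hidden states are lexicon words, emissions come from a simple edit-distance error model, and transitions come from bigram scores. Everything in it is an assumption made for illustration: the tiny `LEXICON`, the hand-picked `BIGRAM_LOGP` values, the `penalty` weight, and the use of plain Levenshtein distance stand in for the Wikipedia lexicon, the trained error model, and the Web N-gram service described above.

```python
import math

# Hypothetical toy resources; a real system would use a large lexicon
# and a web-scale n-gram language model.
LEXICON = {"new", "york", "times", "time", "news", "yolk"}
BIGRAM_LOGP = {
    ("<s>", "new"): math.log(0.5),
    ("new", "york"): math.log(0.6),
    ("york", "times"): math.log(0.5),
}
DEFAULT_LOGP = math.log(1e-4)  # back-off score for unseen bigrams


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def candidates(token, max_dist=2):
    """Lexicon words within max_dist edits; fall back to the token itself."""
    found = {w: edit_distance(token, w) for w in LEXICON}
    found = {w: d for w, d in found.items() if d <= max_dist}
    return found or {token: 0}


def emission_logp(dist, penalty=3.0):
    """Assumed error model: each edit costs a fixed log-probability penalty."""
    return -penalty * dist


def transition_logp(prev_word, word):
    return BIGRAM_LOGP.get((prev_word, word), DEFAULT_LOGP)


def correct(query):
    """Viterbi decoding: most likely intended word sequence for the query."""
    tokens = query.lower().split()
    if not tokens:
        return ""
    # best maps each state (word) to (log score, best path ending in that word).
    best = {"<s>": (0.0, [])}
    for tok in tokens:
        nxt = {}
        for word, dist in candidates(tok).items():
            score, path = max(
                ((s + transition_logp(p, word), pth) for p, (s, pth) in best.items()),
                key=lambda item: item[0],
            )
            nxt[word] = (score + emission_logp(dist), path + [word])
        best = nxt
    return " ".join(max(best.values(), key=lambda item: item[0])[1])


if __name__ == "__main__":
    print(correct("nwe yrok tmes"))
```

This toy version handles only in-word character errors with a one-to-one mapping between query tokens and corrected words; handling further error types, such as wrongly split or merged words, would require a richer state space than the one sketched here.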