A survey of types of text noise and techniques to handle noisy text
Shourya Roy | L. Venkata Subramaniam | Tanveer A. Faruquie | Sumit Negi
[1] Hongyan Jing, Daniel Lopresti, Chilin Shih. Summarizing Noisy Documents , 2003 .
[2] Lina Zhou,et al. Error Detection Using Linguistic Features , 2005, HLT/EMNLP.
[3] Kazem Taghva,et al. Effects of OCR Errors on Ranking and Feedback Using the Vector Space Model , 1996, Inf. Process. Manag.
[4] Yuji Matsumoto,et al. Automatic Construction of Machine Translation Knowledge Using Translation Literalness , 2003, EACL.
[5] Mari Ostendorf,et al. Improving Information Extraction by Modeling Errors in Speech Recognizer Output , 2001, HLT.
[6] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[7] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[8] L. Venkata Subramaniam,et al. SMS based Interface for FAQ Retrieval , 2009, ACL.
[9] Ulrich Kressel,et al. Categorizing Paper Documents: A Generic System for Domain and Language Independent Text Categorization , 1998, Comput. Vis. Image Underst.
[10] Eric Brill,et al. Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users , 2004, EMNLP.
[11] Peter Boros,et al. Query Segmentation for Web Search , 2003, WWW.
[12] Shourya Roy,et al. How Much Noise Is Too Much: A Study in Automatic Text Classification , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).
[13] François Yvon,et al. Normalizing SMS: are Two Metaphors Better than One ? , 2008, COLING.
[14] Eric Horvitz,et al. Patterns of search: analyzing and modeling Web query refinement , 1999 .
[15] George R. Doddington,et al. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics , 2002 .
[16] Kenneth Ward Church,et al. Query suggestion using hitting time , 2008, CIKM '08.
[17] Ming Zhou,et al. Improving Query Spelling Correction Using Web Search Results , 2007, EMNLP-CoNLL.
[18] Karen Kukich,et al. Techniques for automatically correcting words in text , 1992, CSUR.
[19] Richard M. Schwartz,et al. Named Entity Extraction from Noisy Input: Speech and OCR , 2000, ANLP.
[20] Shourya Roy,et al. Automatic Generation of Domain Models for Call-Centers from Noisy Transcriptions , 2006, ACL.
[21] Hermann Ney,et al. Automatic Filtering of Bilingual Corpora for Statistical Machine Translation , 2005, NLDB.
[22] Robert L. Mercer,et al. The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.
[23] Farooq Ahmad,et al. Learning a Spelling Error Model from Search Query Logs , 2005, HLT.
[24] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.
[25] Eiichiro Sumita,et al. Bilingual corpus cleaning focusing on translation literality , 2002, INTERSPEECH.
[26] Alessandro Vinciarelli,et al. Noisy text categorization , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Satoshi Takahashi,et al. Rejection of out-of-vocabulary words using phoneme confidence likelihood , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[28] Hang Li,et al. A unified and discriminative model for query refinement , 2008, SIGIR '08.
[29] L. Venkata Subramaniam,et al. Business Intelligence from Voice of Customer , 2009, 2009 IEEE 25th International Conference on Data Engineering.
[30] C. Uhrik,et al. Confidence metrics based on n-gram language model backoff behaviors , 1997, EUROSPEECH.
[31] Animesh Mukherjee,et al. Investigation and modeling of the structure of texting language , 2007, International Journal of Document Analysis and Recognition (IJDAR).
[32] Thomas Schaaf,et al. Estimating confidence using word lattices , 1997, EUROSPEECH.
[33] Yang Zhang,et al. Exploring Distributional Similarity Based Models for Query Spelling Correction , 2006, ACL.
[34] Daniel P. Lopresti,et al. Optical character recognition errors and their effects on natural language processing , 2008, AND '08.
[35] Gökhan Tür,et al. Optimizing SVMs for complex call classification , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[36] Ganesh Ramakrishnan,et al. Identification of class specific discourse patterns , 2008, CIKM '08.