An Improved Graph Model for Chinese Spell Checking

In this paper, we propose an improved graph model for Chinese spell checking. The model combines a graph model for generic errors with two independently trained models for specific errors. First, a graph represents a Chinese sentence, and a modified single-source shortest path algorithm is run on the graph to detect and correct generic spelling errors. Then, we utilize conditional
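To make the graph step concrete, here is a minimal sketch of how a sentence can be cast as a word lattice and corrected with a shortest path search. It is an illustration under stated assumptions, not the paper's implementation: the lexicon, its costs, the confusion set, the penalties, and the names `candidate_edges` and `correct` are all hypothetical, and the search is a plain shortest path relaxation over a DAG rather than the modified algorithm the abstract refers to.

```python
import math

# Hypothetical toy data (not from the paper): word costs stand in for
# negative log probabilities from a language model.
LEXICON = {"我": 2.0, "們": 3.0, "我們": 1.5, "在": 2.0, "再": 2.5, "見": 2.5, "再見": 1.0}
# Hypothetical confusion set: characters that are easily mistaken for each other.
CONFUSION = {"在": ["再"], "再": ["在"]}
SUB_PENALTY = 1.0   # assumed extra cost for replacing one character
OOV_COST = 10.0     # assumed cost for keeping a character as an unknown word
MAX_LEN = 2         # longest word in the toy lexicon


def candidate_edges(sentence, i):
    """Yield (end, output_word, cost) for every edge leaving position i."""
    for j in range(i + 1, min(len(sentence), i + MAX_LEN) + 1):
        span = sentence[i:j]
        if span in LEXICON:
            yield j, span, LEXICON[span]
        # Edges that rewrite one character of the span via the confusion set.
        for k, ch in enumerate(span):
            for alt in CONFUSION.get(ch, []):
                cand = span[:k] + alt + span[k + 1:]
                if cand in LEXICON:
                    yield j, cand, LEXICON[cand] + SUB_PENALTY
    # Fallback edge so that every position stays reachable.
    yield i + 1, sentence[i], OOV_COST


def correct(sentence):
    """Return the cheapest rewriting of the sentence from position 0 to n."""
    n = len(sentence)
    best = [math.inf] * (n + 1)
    back = [None] * (n + 1)
    best[0] = 0.0
    # Positions are already in topological order, so one left-to-right pass suffices.
    for i in range(n):
        if best[i] == math.inf:
            continue
        for j, word, cost in candidate_edges(sentence, i):
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                back[j] = (i, word)
    # Follow back-pointers from the end to recover the chosen words.
    words = []
    pos = n
    while pos > 0:
        prev, word = back[pos]
        words.append(word)
        pos = prev
    return "".join(reversed(words))


print(correct("我們在見"))
```

With this toy data, the path through the substituted word 再見 is cheaper than keeping 在 and 見 as separate words, so the misspelled character is replaced; a sentence that is already the cheapest path through the lattice is returned unchanged.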
