Error Diagnosis of Chinese Sentences Using Inductive Learning Algorithm and Decomposition-Based Testing Mechanism

This study presents a novel approach to error diagnosis of Chinese sentences for learners of Chinese as a second language (CSL). The proposed penalized probabilistic First-Order Inductive Learning (pFOIL) algorithm integrates inductive logic programming (ILP), First-Order Inductive Learning (FOIL), and a penalized log-likelihood function. It accounts for the uncertain, imperfect, and conflicting characteristics of Chinese sentences in order to infer error types and to produce human-interpretable rules for subsequent error correction. In the pFOIL algorithm, two kinds of background knowledge are proposed to characterize a sentence and are then used for likelihood estimation: relation pattern background knowledge and quantized t-score background knowledge. The relation pattern background knowledge captures the morphological, syntactic, and semantic relations among the words in a sentence; one or two kinds of the extracted relations are then integrated into a pattern that characterizes the sentence. The quantized t-score background knowledge represents the various relations in a sentence by their quantized t-score values. A decomposition-based testing mechanism, which decomposes a sentence into the background knowledge sets needed for each error type, is then proposed to infer all potential error types and causes of the sentence. With the pFOIL method, CSL learners can be given not only the error types but also the error causes and positions. Experimental results show that the pFOIL method outperforms the C4.5, maximum entropy, and Naive Bayes classifiers in error classification.
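The abstract does not state how the t-score is computed or how it is quantized, so the following is only a minimal sketch of the general idea. It assumes the common collocation approximation t = (O - E) / sqrt(O), where O is the observed co-occurrence count of a word pair and E is the count expected if the two words were independent. The function names, the bin thresholds, and the toy counts are all hypothetical and are not taken from the paper.

```python
import math


def t_score(pair_count, left_count, right_count, total_pairs):
    """Approximate collocation t-score: (observed - expected) / sqrt(observed).

    Assumption: this is the standard corpus-linguistics approximation,
    not necessarily the exact statistic used by pFOIL.
    """
    if pair_count == 0:
        return 0.0  # no evidence of association for an unseen pair
    expected = left_count * right_count / total_pairs
    return (pair_count - expected) / math.sqrt(pair_count)


def quantize(score, thresholds=(0.0, 2.0, 4.0)):
    """Map a score to a discrete level: the number of thresholds it reaches.

    The thresholds are hypothetical; the paper's bins are not given here.
    """
    return sum(score >= t for t in thresholds)


# Toy counts, invented for illustration only.
score = t_score(pair_count=25, left_count=100, right_count=200, total_pairs=10_000)
level = quantize(score)
print(score, level)
```

A discrete level of this kind could then serve as a symbolic fact about a word pair, which is the sort of input a rule learner such as FOIL works with; how pFOIL actually encodes these facts is described in the paper itself, not here.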
