Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

We investigate active learning methods for Japanese dependency parsing. We propose active learning methods that use partial dependency relations within a given sentence, rather than whole sentences, and evaluate their effectiveness empirically. Furthermore, we exploit syntactic constraints of Japanese to derive additional labeled examples from the precious ones that annotators provide. Experimental results show that our proposed methods considerably improve the learning curve of Japanese dependency parsing. To achieve an accuracy of over 88.3%, one of our methods requires only 34.4% of the labeled examples that passive learning needs.
