A High-Performance Semi-Supervised Learning Method for Text Chunking

In machine learning, whether one can build a more accurate classifier by using unlabeled data (semi-supervised learning) is an important issue. Although a number of semi-supervised methods have been proposed, their effectiveness on NLP tasks is not always clear. This paper presents a novel semi-supervised method that employs a learning paradigm which we call structural learning. The idea is to find "what good classifiers are like" by learning from thousands of automatically generated auxiliary classification problems on unlabeled data. By doing so, the common predictive structure shared by the multiple classification problems can be discovered, which can then be used to improve performance on the target problem. The method produces performance higher than the previous best results on CoNLL'00 syntactic chunking and CoNLL'03 named entity chunking (English and German).

[1]  Bernard Mérialdo,et al.  Tagging English Text with a Probabilistic Model , 1994, CL.

[2]  David Yarowsky,et al.  Unsupervised Word Sense Disambiguation Rivaling Supervised Methods , 1995, ACL.

[3]  Gene H. Golub,et al.  Matrix Computations, Third Edition , 1996 .

[4]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[5]  Yoram Singer,et al.  Unsupervised Models for Named Entity Classification , 1999, EMNLP.

[6]  Ellen Riloff,et al.  Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping , 1999, AAAI/IAAI.

[7]  Yuji Matsumoto,et al.  Chunking with Support Vector Machines , 2001, NAACL.

[8]  Claire Cardie,et al.  Limitations of Co-Training for Natural Language Learning from Large Datasets , 2001, EMNLP.

[9]  Tong Zhang,et al.  Text Chunking based on a Generalization of Winnow , 2002, J. Mach. Learn. Res..

[10]  Hwee Tou Ng,et al.  Named Entity Recognition with a Maximum Entropy Approach , 2003, CoNLL.

[11]  Tong Zhang,et al.  A Robust Risk Minimization based Named Entity Recognition System , 2003, CoNLL.

[12]  Dan Klein,et al.  Named Entity Recognition with Character-Level Models , 2003, CoNLL.

[13]  Fernando Pereira,et al.  Shallow Parsing with Conditional Random Fields , 2003, NAACL.

[14]  Xavier Carreras,et al.  Phrase recognition by filtering and ranking with perceptrons , 2003, RANLP.

[15]  Claire Cardie,et al.  Weakly Supervised Natural Language Learning Without Redundant Views , 2003, NAACL.

[16]  Tong Zhang,et al.  Named Entity Recognition through Classifier Combination , 2003, CoNLL.

[17]  Tong Zhang,et al.  Solving large scale linear prediction problems using stochastic gradient descent algorithms , 2004, ICML.

[18]  Rie Kubota Ando,et al.  Semantic Lexicon Construction: Learning from Unlabeled Data via Spectral Analysis , 2004, CoNLL.

[19]  Scott Miller,et al.  Name Tagging with Word Clusters and Discriminative Training , 2004, NAACL.

[20]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..