
Analyzing the Errors of Unsupervised Learning

We identify four types of errors that unsupervised induction systems make and study each one in turn. Our contributions include (1) using a meta-model to analyze the incorrect biases of a model in a systematic way, (2) providing an efficient and robust method of measuring distance between two parameter settings of a model, and (3) showing that local optima issues which typically plague EM can be somewhat alleviated by increasing the number of training examples. We conduct our analyses on three models: the HMM, the PCFG, and a simple dependency model.
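Contribution (2) can be illustrated with a small sketch. Hidden-state labels in an unsupervised model such as an HMM are arbitrary, so two parameter settings that differ only by a relabeling of states describe the same model, and a naive entry-by-entry comparison would report them as far apart. The code below is an illustration under stated assumptions, not the paper's actual procedure: it assumes row-stochastic transition and emission matrices, uses an L1 distance, and searches all relabelings by brute force, which is only feasible for a handful of states. The function names `l1` and `permuted_distance` are hypothetical.

```python
from itertools import permutations


def l1(p, q):
    """L1 distance between two equal-length probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def permuted_distance(trans_a, emit_a, trans_b, emit_b):
    """Smallest L1 distance between two HMM parameter settings,
    minimized over all relabelings of model B's hidden states.

    trans_x[i][j]: probability of moving from state i to state j.
    emit_x[i][w]:  probability of state i emitting symbol w.
    Brute force over k! permutations, so only suitable for small k.
    """
    k = len(trans_a)
    best = float("inf")
    for perm in permutations(range(k)):
        # perm[i] is the state of B that is matched to state i of A.
        d = 0.0
        for i in range(k):
            d += l1(emit_a[i], emit_b[perm[i]])
            # Relabel both the source and the target state of B's transitions.
            relabeled_row = [trans_b[perm[i]][perm[j]] for j in range(k)]
            d += l1(trans_a[i], relabeled_row)
        best = min(best, d)
    return best
```

With more than a few hidden states the factorial search becomes impractical. An assignment solver such as the Hungarian method could be substituted when the cost decomposes over matched state pairs, which is an approximation here because the transition terms couple the choices for different states.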

Percy Liang | Dan Klein
