Experiments with Spectral Learning of Latent-Variable PCFGs

Latent-variable PCFGs (L-PCFGs) are a highly successful model for natural language parsing. Recent work (Cohen et al., 2012) has introduced a spectral algorithm for parameter estimation of L-PCFGs, which—unlike the EM algorithm—is guaranteed to give consistent parameter estimates (it has PAC-style guarantees of sample complexity). This paper describes experiments using the spectral algorithm. We show that the algorithm provides models with the same accuracy as EM, but is an order of magnitude more efficient. We describe a number of key steps used to obtain this level of performance; these should be relevant to other work on the application of spectral learning algorithms. We view our results as strong empirical evidence for the viability of spectral methods as an alternative to EM.

[1]  Dan Klein,et al.  Improved Inference for Unlexicalized Parsing , 2007, NAACL.

[2]  Byron Boots,et al.  Reduced-Rank Hidden Markov Models , 2009, AISTATS.

[3]  Dean P. Foster,et al.  Multi-View Learning of Word Embeddings via CCA , 2011, NIPS.

[4]  Sanjeev Arora,et al.  Learning Topic Models -- Going beyond SVD , 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.

[5]  H. Hotelling Relations Between Two Sets of Variates , 1936 .

[6]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[7]  Ralph Grishman,et al.  A Procedure for Quantitatively Comparing the Syntactic Coverage of English Grammars , 1991, HLT.

[8]  Eugene Charniak,et al.  Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking , 2005, ACL.

[9]  Anima Anandkumar,et al.  Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..

[10]  Le Song,et al.  A Spectral Algorithm for Latent Tree Graphical Models , 2011, ICML.

[11]  Ariadna Quattoni,et al.  A Spectral Learning Algorithm for Finite State Transducers , 2011, ECML/PKDD.

[12]  Eric P. Xing,et al.  Turbo Parsers: Dependency Parsing by Approximate Variational Inference , 2010, EMNLP.

[13]  Jun'ichi Tsujii,et al.  Probabilistic CFG with Latent Annotations , 2005, ACL.

[14]  Sham M. Kakade,et al.  Multi-view Regression Via Canonical Correlation Analysis , 2007, COLT.

[15]  Dan Klein,et al.  Learning Accurate, Compact, and Interpretable Tree Annotation , 2006, ACL.

[16]  Ariadna Quattoni,et al.  Spectral Learning for Non-Deterministic Dependency Parsing , 2012, EACL.

[17]  John Shawe-Taylor,et al.  Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.

[18]  Hiroyuki Shindo,et al.  Bayesian Symbol-Refined Tree Substitution Grammars for Syntactic Parsing , 2012, ACL.

[19]  Karl Stratos,et al.  Spectral Learning of Latent-Variable PCFGs , 2012, ACL.

[20]  Dan Klein,et al.  Accurate Unlexicalized Parsing , 2003, ACL.

[21]  Dean Alderucci A SPECTRAL ALGORITHM FOR LEARNING HIDDEN MARKOV MODELS THAT HAVE SILENT STATES , 2015 .

[22]  Santosh S. Vempala,et al.  A spectral algorithm for learning mixtures of distributions , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[23]  Michael Collins,et al.  Spectral Dependency Parsing with Latent Variables , 2012, EMNLP-CoNLL.

[24]  Amaury Habrard,et al.  A Spectral Approach for Probabilistic Grammatical Inference on Trees , 2010, ALT.

[25]  Joshua Goodman,et al.  Parsing Algorithms and Metrics , 1996, ACL.

[26]  Xavier Carreras,et al.  TAG, Dynamic Programming, and the Perceptron for Efficient, Feature-Rich Parsing , 2008, CoNLL.