Weighted Tree Automata Approximation by Singular Value Truncation

We describe a technique to minimize weighted tree automata (WTA), a powerful formalism that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs. Our method relies on a singular value decomposition (SVD) of the underlying Hankel matrix defined by the WTA. Our main theoretical result is an efficient algorithm for computing the SVD of an infinite Hankel matrix implicitly represented as a WTA. We provide an analysis of the approximation error induced by the minimization, and we evaluate our method on real-world data originating from a newswire treebank. We show that the model achieves lower perplexity than previous methods for PCFG minimization, and is also much more stable due to the absence of local optima.
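The idea of singular value truncation can be illustrated on a small finite matrix. The sketch below is not the algorithm of the paper, which works on the infinite Hankel matrix implicitly through the WTA; it only assumes a finite matrix `H` standing in for a sub-block of a Hankel matrix, and shows that keeping the top `n` singular values yields a rank-`n` approximation whose spectral-norm error is the first discarded singular value. The function name `truncate_svd` and the toy data are hypothetical.

```python
import numpy as np


def truncate_svd(H, n):
    """Return a rank-n approximation of H and all singular values of H.

    Hypothetical helper: keeps the n largest singular values and discards
    the rest (the Eckart-Young construction).
    """
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    # Scale the first n left singular vectors by their singular values,
    # then project back with the first n right singular vectors.
    H_n = (U[:, :n] * s[:n]) @ Vt[:n, :]
    return H_n, s


# Toy stand-in for a finite Hankel sub-block (assumption, not real data):
# a 6 x 8 matrix built as a product of 6 x 3 and 3 x 8 factors, so rank <= 3.
rng = np.random.default_rng(0)
P = rng.standard_normal((6, 3))
S = rng.standard_normal((3, 8))
H = P @ S

H2, s = truncate_svd(H, 2)

# Spectral-norm error of the rank-2 approximation.
err = np.linalg.norm(H - H2, 2)
print("singular values:", s[:3])
print("rank-2 spectral error:", err)
```

In this toy setting the truncation error matches the third singular value of `H`, which is the finite-dimensional analogue of bounding the approximation error by the discarded singular values.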
