A Graph-based Lattice Dependency Parser for Joint Morphological Segmentation and Syntactic Analysis

Space-delimited words in Turkish and Hebrew text can be further segmented into meaningful units, but syntactic and semantic context is necessary to predict the correct segmentation. At the same time, predicting correct syntactic structures relies on correct segmentation. We present a graph-based lattice dependency parser that operates on morphological lattices, which represent the different segmentations and morphological analyses of a given input sentence. The lattice parser predicts a dependency tree over a path in the lattice and thus solves the joint task of segmentation, morphological analysis, and syntactic parsing. We conduct experiments on the Turkish and Hebrew treebanks and show that the joint model outperforms three state-of-the-art pipeline systems on both data sets. Our work corroborates findings from constituency lattice parsing for Hebrew and presents the first results for full lattice parsing on Turkish.
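To make the notion of a morphological lattice concrete, the following is a minimal sketch, not the parser described in the paper. It assumes a lattice can be stored as a directed acyclic graph whose arcs carry one candidate morphological unit each, so that every start-to-end path is one candidate segmentation. The function name `lattice_paths`, the arc format, and the placeholder word "xyz" with its tags are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def lattice_paths(arcs, start, end):
    """Enumerate all label sequences on paths from start to end.

    arcs is a list of (source_state, target_state, label) triples that
    form a directed acyclic graph (an assumed, simplified lattice format).
    """
    outgoing = defaultdict(list)
    for src, dst, label in arcs:
        outgoing[src].append((dst, label))

    paths = []

    def walk(state, prefix):
        # Reaching the end state completes one candidate segmentation.
        if state == end:
            paths.append(prefix)
            return
        for dst, label in outgoing[state]:
            walk(dst, prefix + [label])

    walk(start, [])
    return paths


# Toy lattice for a hypothetical space-delimited word "xyz":
# either one unit "xyz", or a prefix "x" followed by "yz".
arcs = [
    (0, 2, "xyz/NOUN"),
    (0, 1, "x/PREP"),
    (1, 2, "yz/NOUN"),
]
paths = lattice_paths(arcs, 0, 2)
for path in paths:
    print(" + ".join(path))
```

A pipeline system would commit to one of these paths before parsing. A joint model, as the abstract describes, instead selects the path and the dependency tree over its units together; the scoring and inference needed for that are not shown in this sketch.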
