How to Train Dependency Parsers with Inexact Search for Joint Sentence Boundary Detection and Parsing of Entire Documents

We cast sentence boundary detection and syntactic parsing as a joint problem, so that an entire text document forms a single training instance for transition-based dependency parsing. When models are trained with an early-update or max-violation strategy for inexact search, we observe that only a tiny part of each of these very long training instances is ever exploited. We demonstrate this effect by extending the ArcStandard transition system with swap for the joint prediction task. When we use an alternative update strategy, our models are considerably better on both tasks and train in substantially less time than models trained with early update or max-violation. A comparison between a standard pipeline and our joint model furthermore shows empirically that syntactic information is useful for sentence boundary detection.
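To make the observation about long training instances concrete, the following is a minimal sketch of how early update and max-violation each pick the point in a transition sequence at which the perceptron update is made. It is an illustration only, not the implementation used in this work: the function names, the toy scores, and the 1000-step sequence length are all assumptions chosen for the example. Both strategies update on a prefix of the sequence, so whatever follows the chosen step contributes nothing to that update.

```python
from typing import List, Optional


def early_update_step(gold_in_beam: List[bool]) -> Optional[int]:
    """Return the first step (0-based) at which the gold prefix is no longer
    in the beam, or None if it never falls out."""
    for step, in_beam in enumerate(gold_in_beam):
        if not in_beam:
            return step
    return None


def max_violation_step(gold_scores: List[float],
                       best_scores: List[float]) -> Optional[int]:
    """Return the step at which the best beam item outscores the gold prefix
    by the largest margin, or None if there is no violation at any step."""
    best_step: Optional[int] = None
    best_margin = 0.0
    for step, (gold, best) in enumerate(zip(gold_scores, best_scores)):
        margin = best - gold
        if margin > best_margin:
            best_margin = margin
            best_step = step
    return best_step


def fraction_used(update_step: Optional[int], n_steps: int) -> float:
    """Fraction of the transition sequence covered by an update on the
    prefix that ends at update_step (the whole sequence if None)."""
    if update_step is None:
        return 1.0
    return (update_step + 1) / n_steps


# Hypothetical toy instance: a document whose gold derivation has 1000
# transitions. All numbers below are made up for illustration.
N_STEPS = 1000
gold_in_beam = [True] * 40 + [False] * (N_STEPS - 40)  # gold drops out at step 40
gold_scores = [0.0] * N_STEPS
best_scores = [0.0] * N_STEPS
best_scores[40] = 1.0  # a small violation where the gold prefix leaves the beam
best_scores[55] = 3.0  # the largest violation, a little later

early = early_update_step(gold_in_beam)
max_viol = max_violation_step(gold_scores, best_scores)

print("early update at step", early, "uses", fraction_used(early, N_STEPS))
print("max-violation at step", max_viol, "uses", fraction_used(max_viol, N_STEPS))
```

In this toy setting both strategies stop within the first few dozen transitions of a 1000-step derivation, which is the kind of effect the abstract describes for document-length instances.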
