Grammatical Relations in Chinese: GB-Ground Extraction and Data-Driven Parsing

This paper is concerned with building linguistic resources and statistical parsers for deep grammatical relation (GR) analysis of Chinese texts. A set of linguistic rules is defined to explore implicit phrase-structure information and thus build high-quality GR annotations, which are represented as general directed dependency graphs. The reliability of this linguistically motivated GR extraction procedure is highlighted by manual evaluation. Based on the converted corpus, we study transition-based, data-driven models for GR parsing. We present a novel transition system that suits GR graphs better than existing systems. The key idea is to introduce a new type of transition that reorders the top k elements in the memory module. The evaluation gauges how successful GR parsing for Chinese can be when data-driven models are applied.
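To make the reordering idea concrete, the following is a minimal sketch and not the paper's actual transition system. It assumes the memory module is a stack, that arcs are drawn between the stack top and the front of the input buffer, and that the reordering transition moves the k-th element from the top of the stack to the top. All names (`State`, `shift`, `pop`, `left_arc`, `right_arc`, `rotate`, `demo`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    stack: list = field(default_factory=list)   # assumed memory module; top is the list end
    buffer: list = field(default_factory=list)  # remaining input words
    arcs: set = field(default_factory=set)      # (head, label, dependent) triples


def shift(s):
    """Move the next input word onto the stack."""
    s.stack.append(s.buffer.pop(0))


def pop(s):
    """Discard the stack top once it needs no further arcs."""
    s.stack.pop()


def left_arc(s, label):
    """Add an arc from the buffer front (head) to the stack top (dependent).

    Nothing is removed, so a word can receive several heads, as in a graph.
    """
    s.arcs.add((s.buffer[0], label, s.stack[-1]))


def right_arc(s, label):
    """Add an arc from the stack top (head) to the buffer front (dependent)."""
    s.arcs.add((s.stack[-1], label, s.buffer[0]))


def rotate(s, k):
    """Hypothetical reordering step: bring the k-th element from the top to the top."""
    if not 2 <= k <= len(s.stack):
        raise ValueError("k must be between 2 and the stack size")
    s.stack.append(s.stack.pop(-k))


def demo():
    """Build two crossing arcs, w1 -> w3 and w2 -> w4, using rotate."""
    s = State(buffer=["w1", "w2", "w3", "w4"])
    shift(s)            # stack: [w1]
    shift(s)            # stack: [w1, w2]
    rotate(s, 2)        # stack: [w2, w1], so w1 can reach w3
    right_arc(s, "x")   # adds (w1, x, w3)
    pop(s)              # stack: [w2]
    shift(s)            # stack: [w2, w3]
    rotate(s, 2)        # stack: [w3, w2], so w2 can reach w4
    right_arc(s, "x")   # adds (w2, x, w4)
    return s
```

In this sketch, the two arcs cross each other, which a plain stack-based system without reordering could not produce from these positions; the `rotate` step is what brings the buried word back to the top of the stack.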
