Developing Universal Dependencies for Mandarin Chinese

This article proposes a Universal Dependencies annotation scheme for Mandarin Chinese, covering both POS tags and dependency analysis. We identify idiosyncratic features of Mandarin Chinese that are difficult to fit into the current schema, which has mainly been based on descriptions of various Indo-European languages. We also discuss the differences between our scheme and those of the Stanford Chinese Dependencies and the Chinese Dependency Treebank.
