Initial Explorations of CCG Supertagging for Universal Dependency Parsing

In this paper we describe the system submitted by the METU team for universal dependency parsing of multilingual text. We use a neural network-based dependency parser that takes a greedy transition-based approach. CCG supertags contain rich structural information that has proved useful in certain NLP tasks, so we experiment with them as additional features. The neural network parser is trained on the dependencies together with simplified CCG tags and the other features provided.
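To make the idea of a greedy transition parser with supertag features concrete, here is a minimal sketch in Python. It rests on several assumptions that the text above does not confirm: the transition system is taken to be arc-standard, the feature template (top two stack items and first buffer item) is illustrative only, and the neural classifier is replaced by a hand-written rule, `toy_policy`, so that the example runs without any trained model. All function names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Arc = Tuple[int, int]        # (head index, dependent index); index 0 is ROOT
Feature = Tuple[str, str]    # (word form, CCG supertag)
NULL: Feature = ("<null>", "<null>")


def extract_features(stack: List[int], buffer: List[int],
                     words: List[str], supertags: List[str]) -> List[Feature]:
    """Word and supertag of the top two stack items and the first buffer item."""
    def feat(i: Optional[int]) -> Feature:
        return NULL if i is None else (words[i], supertags[i])

    s1 = stack[-1] if len(stack) >= 1 else None
    s2 = stack[-2] if len(stack) >= 2 else None
    b1 = buffer[0] if buffer else None
    return [feat(s1), feat(s2), feat(b1)]


def greedy_parse(words: List[str], supertags: List[str],
                 choose_action: Callable[[List[Feature]], str]) -> List[Arc]:
    """Greedy arc-standard parsing: take the single predicted action at each step."""
    stack, buffer, arcs = [0], list(range(1, len(words))), []
    while buffer or len(stack) > 1:
        action = choose_action(extract_features(stack, buffer, words, supertags))
        # Replace illegal predictions with a legal action.
        if len(stack) < 2:
            action = "SHIFT"                 # nothing to attach yet
        elif action == "SHIFT" and not buffer:
            action = "RIGHT-ARC"             # buffer exhausted, must reduce
        elif action == "LEFT-ARC" and stack[-2] == 0:
            action = "SHIFT" if buffer else "RIGHT-ARC"  # ROOT has no head

        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":           # second item depends on the top item
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        else:                                # RIGHT-ARC: top item depends on second
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
    return arcs


def toy_policy(features: List[Feature]) -> str:
    """Hand-written stand-in for the classifier (an assumption, not a trained model)."""
    (_, s1_tag), (_, s2_tag), b1 = features
    if s1_tag == "S\\NP" and s2_tag == "NP":
        return "LEFT-ARC"                    # verb on top takes the NP as a dependent
    return "SHIFT" if b1 != NULL else "RIGHT-ARC"


words = ["<ROOT>", "She", "sleeps"]
supertags = ["<ROOT>", "NP", "S\\NP"]
arcs = greedy_parse(words, supertags, toy_policy)
print(arcs)
```

In a real system the `choose_action` callback would be a trained classifier that scores the transitions from embeddings of these features; the supertag simply becomes one more input alongside word forms and the other provided annotations.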
