The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy

We present the ParisNLP entry at the UD CoNLL 2017 parsing shared task. In addition to the UDPipe models provided, we built our own data-driven tokenization models, sentence segmenter and lexicon-based morphological analyzers. All of these were used with a range of different parsing models (neural or not, feature-rich or not, transition or graph-based, etc.) and the best combination for each language was selected. Unfortunately, a glitch in the shared task's Matrix led our model selector to run generic, weakly lexicalized models, tailored for surprise languages, instead of our dataset-specific models. Because of this #ParsingTragedy, we officially ranked 27th, whereas our real models unofficially ranked 6th.

Benoît Sagot | Éric Villemonte de la Clergerie | Djamé Seddah
