Language Independent Probabilistic Context-Free Parsing Bolstered by Machine Learning

Unlexicalized probabilistic context-free parsing is a general and flexible approach that sometimes reaches competitive results in multilingual dependency parsing, even when only a minimum of language-specific information is supplied. Furthermore, integrating parser results (good at long-range dependencies) with tagger results (good at short-range dependencies and more easily adaptable to treebank peculiarities) gives competitive results in all languages.
