A Robust Shallow Parser for Swedish

In this paper, a robust parser for Swedish is presented. The parser identifies the internal structure of phrases, but does not build full trees. In addition to phrase identification, clause boundaries are detected. The parser is designed for robustness against noisy and ill-formed data. An evaluation on 15 000 words shows that the parser’s accuracy on phrase bracketing is 88.7 per cent and the F-score for clause boundary identification is 88.3 per cent.
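
The F-score reported for clause boundary identification can be illustrated with a minimal sketch. It assumes that gold and predicted boundaries are given as sets of token positions and that the balanced F-score (harmonic mean of precision and recall) is meant; the abstract does not specify the exact evaluation procedure. The function name and the toy data below are hypothetical.

```python
def boundary_f_score(gold, predicted):
    """Return (precision, recall, F1) for two collections of boundary positions.

    Assumption: a predicted boundary counts as correct only if it falls on
    exactly the same token position as a gold boundary.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: token indices at which a clause boundary occurs.
gold_boundaries = {0, 7, 12, 20}
predicted_boundaries = {0, 7, 13, 20, 25}

precision, recall, f1 = boundary_f_score(gold_boundaries, predicted_boundaries)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case three of the five predicted boundaries match the four gold boundaries, so precision is 3/5 and recall is 3/4.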
