Sentence diagrams: their evaluation and combination

The purpose of our work is to explore whether sentence diagrams produced by schoolchildren can serve as training data for automatic syntactic analysis. We have implemented a sentence diagram editor that schoolchildren can use to practice morphology and syntax. We collect their diagrams, combine them into a single diagram for each sentence, and transform them into a form suitable for training a particular syntactic parser. In this study, the object language is Czech, where sentence diagrams are part of the elementary school curriculum, and the target format is the annotation scheme of the Prague Dependency Treebank. We focus mainly on the evaluation of individual diagrams and on their combination into a single, better merged diagram.
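To make the two focus points concrete, the following is a minimal sketch of one way they might be realized. It is not the method used in this work. It assumes that each diagram has already been converted to a list of head indices (one per token, with 0 standing for the root), scores a diagram by the fraction of tokens whose head matches a reference, and merges several diagrams by a per-token majority vote. The function names `attachment_score` and `combine_by_majority` and the toy data are hypothetical.

```python
from collections import Counter
from typing import List

# Assumed representation: heads[i] is the 1-based index of the head of
# token i+1, and 0 marks the root of the sentence.
Diagram = List[int]


def attachment_score(diagram: Diagram, reference: Diagram) -> float:
    """Fraction of tokens whose head agrees with the reference diagram."""
    if len(diagram) != len(reference):
        raise ValueError("diagrams must cover the same tokens")
    if not reference:
        return 0.0
    matches = sum(1 for h, r in zip(diagram, reference) if h == r)
    return matches / len(reference)


def combine_by_majority(diagrams: List[Diagram]) -> Diagram:
    """Merge diagrams by choosing, for each token, the most frequent head.

    Ties go to the head proposed first. Per-token voting does not
    guarantee that the result is a well-formed tree; a real system
    would need an additional step to enforce that.
    """
    if not diagrams:
        raise ValueError("need at least one diagram")
    length = len(diagrams[0])
    if any(len(d) != length for d in diagrams):
        raise ValueError("diagrams must cover the same tokens")
    merged = []
    for token_votes in zip(*diagrams):
        counts = Counter(token_votes)
        # max() returns the first maximal key in insertion order.
        merged.append(max(counts, key=counts.get))
    return merged


# Toy data (invented for illustration): three diagrams of a four-token sentence.
diagrams = [
    [2, 0, 2, 3],
    [2, 0, 2, 2],
    [3, 0, 2, 3],
]
reference = [2, 0, 2, 3]

merged = combine_by_majority(diagrams)
scores = [attachment_score(d, reference) for d in diagrams]
merged_score = attachment_score(merged, reference)
```

In this toy case the two imperfect diagrams disagree with the reference on different tokens, so the vote recovers the reference even though no single error is shared by a majority.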
