The Hindi/Urdu Treebank Project

The goal of the Hindi/Urdu treebanking project is to build multi-layered treebanks that provide both syntactic and semantic annotations. In the past two decades, dozens of treebanks have been created for languages such as Arabic, Chinese, Czech, English, French, and German, among many others. Our treebanks differ from previous treebanks in two important respects: they are multi-representational, i.e., they include several layers of representation from the initial design; and they cover two standardized registers that are often considered separate languages: Hindi and Urdu.
