Towards an Annotated Corpus of Discourse Relations in Hindi

We describe our initial efforts towards developing a large-scale corpus of Hindi texts annotated with discourse relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we present a preliminary analysis of discourse connectives in a small corpus. We describe how discourse connectives are represented in the sentence-level dependency annotation in Hindi, and discuss how the discourse annotation can enrich this level for research and applications. The ultimate goal of our work is to build a Hindi Discourse Relation Bank along the lines of the PDTB. Our work will also contribute to the cross-linguistic understanding of discourse connectives.