Paninian Grammar Framework Applied to English
[Published in South Asian Language Review, Creative Books, New Delhi, 1998.]

The computational Paninian Grammar framework (PG) has previously been applied successfully to modern Indian languages, and the anusaaraka machine translation system has been built using it (Narayana, 1994). In this paper, we show that PG can also be applied to English, resulting in an elegant computational grammar. First, we generalize the notion of vibhakti to include the position of the word in a sentence along with its case and associated preposition, if any. This allows us to use the familiar PG notions of karaka chart, karaka chart transformation, and sharing rules (Bharati et al., 1995) to account for English actives and passives, lexical control, infinitives, etc. A transformation of the karaka chart and the vibhakti therein very naturally accounts for what is called movement. Second, we introduce a new vibhakti called TOPIC position (which corresponds to the first position in a clause) and a new operation called join for connecting a relative clause to its head. These two together handle long distance dependency in relative clauses and wh-questions, raising, tough-movement, pied-piping, etc. The karakas with TOPIC vibhakti appear at the beginning of the clause. This paper establishes that PG is more general than hitherto considered, and can be used to explain not just free word order languages but also positional languages. Further research is continuing on this and related aspects.

1 PG for Indian Languages: A Review

The Paninian framework considers information as central to the study of language. When a writer (or a speaker) uses language to convey some information to the reader (or the hearer), he codes the information in the language string. Similarly, when a reader (or a hearer) receives a language string, he extracts the information coded in it. The computational Paninian Grammar framework (PG) is primarily concerned with how the information is coded and how it can be extracted.
Two levels of representation can be readily seen in language use: one, the actual language string (or sentence), and two, what the speaker has in mind. (By string we mean any of: word, phrase, sentence, paragraph, etc.) The latter can also be called the meaning. The Paninian framework has two other important levels: the karaka level and the vibhakti level (Figure 1). The surface level is the uttered or written sentence. The vibhakti level is the level at which there are local word groups based on case endings and preposition or postposition markers. For positional languages such as English, the vibhakti level would also include position or word order information.
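
As an illustration of the vibhakti level described above, the following is a minimal Python sketch of how local word groups might be formed for an English sentence, with each group carrying a generalized vibhakti (position, associated preposition, and case). It is not the authors' implementation. The names `Vibhakti`, `WordGroup`, and `local_word_groups`, the position labels "pre-verbal" and "post-verbal", and the example sentence are all assumptions made for this example; the verb index and the preposition list are supplied by hand rather than computed.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Vibhakti:
    """Generalized vibhakti: position plus preposition plus case (hypothetical encoding)."""
    position: str               # assumed labels: "pre-verbal" or "post-verbal"
    preposition: Optional[str]  # associated preposition, if any
    case: Optional[str] = None  # left unset in this toy example


@dataclass
class WordGroup:
    """A local word group together with its vibhakti."""
    words: List[str]
    vibhakti: Vibhakti


def local_word_groups(tokens: List[str], verb_index: int,
                      prepositions: Set[str]) -> List[WordGroup]:
    """Split the tokens around the verb into local word groups.

    Everything before the verb becomes one pre-verbal group. After the
    verb, each preposition starts a new group and is recorded in that
    group's vibhakti. A preposition with no following words is dropped,
    which is a simplification of this sketch.
    """
    groups: List[WordGroup] = []
    before = tokens[:verb_index]
    if before:
        groups.append(WordGroup(before, Vibhakti("pre-verbal", None)))

    current: List[str] = []
    prep: Optional[str] = None
    for tok in tokens[verb_index + 1:]:
        if tok.lower() in prepositions:
            if current:
                groups.append(WordGroup(current, Vibhakti("post-verbal", prep)))
            current, prep = [], tok.lower()
        else:
            current.append(tok)
    if current:
        groups.append(WordGroup(current, Vibhakti("post-verbal", prep)))
    return groups


# Illustrative sentence; the verb "gave" is at index 1.
groups = local_word_groups(["Ram", "gave", "a", "book", "to", "Sita"],
                           verb_index=1,
                           prepositions={"to", "by", "with"})
for g in groups:
    print(g.words, g.vibhakti)
```

In this sketch the two post-verbal groups are distinguished only by the preposition "to", while the first group is distinguished by its position before the verb. This is the kind of position information the vibhakti level would need for a positional language.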