GH-MAP: translation system for sibling language pair Gujarati--Hindi

India is a linguistically rich country with eighteen constitutional languages written in ten different scripts. Indian languages are highly inflectional, with rich morphology, relatively free word order, and a default subject-object-verb sentence structure. Many of them are structurally similar and are called sibling languages; Hindi and Gujarati are one such pair. This paper briefly describes GH-MAP, a rule-based token mapping system that we developed for translation between the sibling languages Hindi and Gujarati. GH-MAP performs effective word-for-word translation using simple, computationally inexpensive methods and minimal lexical resources. Syntactic, semantic and structural divergences that arise in translation are resolved with special empirical rules. The aim of GH-MAP is not to produce translation of high linguistic quality; rather, it is designed to produce a correct working translation sufficient to cross the language barrier. The system was evaluated on a test bed drawn from FIRE 2010, literature on Gandhiji, and ELRA-W0037. To establish the relevance of the model, ‘into-Gujarati’ BLEU, PER and METEOR scores were calculated.
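
As a rough illustration of the two ideas named in the abstract, word-for-word token mapping and the PER metric, the following minimal Python sketch may help. It is not GH-MAP's implementation: the romanized lexicon entries, the function names `map_tokens` and `per`, and the pass-through handling of unknown tokens are all assumptions made for this example, and the empirical divergence rules are not modeled. PER is computed in its common bag-of-words form, (max(|ref|, |hyp|) - matches) / |ref|, which may differ in detail from the variant used in the evaluation.

```python
from collections import Counter

# Hypothetical romanized Gujarati -> Hindi lexicon, for illustration only.
# These entries are not taken from GH-MAP's actual lexical resources.
LEXICON = {
    "hun": "main",
    "ghar": "ghar",
    "chhe": "hai",
    "pani": "paani",
}


def map_tokens(tokens, lexicon):
    """Translate word for word; tokens missing from the lexicon pass through unchanged."""
    return [lexicon.get(token, token) for token in tokens]


def per(reference, hypothesis):
    """Position-independent word error rate between two token lists.

    Word order is ignored: matches are counted as the size of the
    multiset intersection of reference and hypothesis tokens.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    matches = sum((Counter(reference) & Counter(hypothesis)).values())
    return (max(len(reference), len(hypothesis)) - matches) / len(reference)
```

For example, `map_tokens(["pani", "chhe"], LEXICON)` maps each token independently through the lexicon, and `per` returns 0.0 for any reordering of the reference, which is why PER suits language pairs with relatively free word order.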