Some Improvements over the BLEU Metric for Measuring Translation Quality for Hindi

The BLEU translation quality evaluation metric is a modified n-gram precision measure that uses a number of reference translations to evaluate the quality of a given candidate translation. In this paper, we propose some modifications to this metric so that it suits Indian languages, and more specifically Hindi. Using BLEU for Hindi presents two difficulties: the non-availability of multiple reference translations and the prevalence of free word order in sentence construction. It is established that the validity of BLEU scores generally increases with the number of reference translations used. Further, because Hindi is a free word order language, naive n-gram matching as adopted by BLEU does not accurately predict the quality of a translated text. In our approach, we have modified BLEU to address both of these shortcomings. Our proposed metric obtains a closer correlation with human judgment while using just one reference translation.
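
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard modified (clipped) n-gram precision that BLEU is built on. It is not the modified metric proposed in this paper, and it omits BLEU's brevity penalty and the geometric mean over n-gram orders. The function names `ngrams` and `modified_precision` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references.

    Each candidate n-gram count is clipped to the maximum number of times
    that n-gram occurs in any single reference.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0  # candidate is shorter than n tokens

    # Maximum count of each n-gram over all references.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)

    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

As an illustration of the word order problem, take the romanized reference `ram ne phal khaya` and the reordered candidate `phal ram ne khaya`, both acceptable ways of saying "Ram ate the fruit". With this sketch, the unigram precision is 1.0, but the bigram precision drops to 1/3, even though the candidate is a valid translation.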