Semi-automatic evaluation of the grammatical coverage of machine translation systems

In this paper we present a methodology for semi-automatically evaluating the grammatical coverage of machine translation (MT) systems. The methodology is built on unfolded grammatical structures, each of which represents the most basic syntactic pattern of a sentence in a given language. A database of these structures is built and used to evaluate the parser of any NLP or MT system. The evaluation yields an overall measure called the grammatical coverage. We present the results of applying this approach to three commercial English-to-Arabic MT systems.
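
As a rough illustration only, the overall measure could be computed along the following lines. This is a minimal sketch, not the paper's definition: it assumes that grammatical coverage is the fraction of database structures the system under test handles correctly. The names `Structure`, `grammatical_coverage`, and the `is_handled` callback are hypothetical, and the judgment of whether a structure is handled (possibly made by a human evaluator) is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Structure:
    """One database entry (hypothetical layout, not taken from the paper)."""
    pattern: str   # label for the syntactic pattern, e.g. "NP V NP"
    example: str   # a test sentence that instantiates the pattern


def grammatical_coverage(
    structures: Iterable[Structure],
    is_handled: Callable[[Structure], bool],
) -> float:
    """Return the share of structures judged to be handled correctly.

    Assumption: coverage is a simple unweighted ratio. The actual measure
    may weight structures differently.
    """
    items = list(structures)
    if not items:
        raise ValueError("the structure database is empty")
    handled = sum(1 for s in items if is_handled(s))
    return handled / len(items)


# Toy database; the patterns and sentences are illustrative only.
database = [
    Structure("NP V", "The boy sleeps."),
    Structure("NP V NP", "The boy reads a book."),
    Structure("NP V NP PP", "The boy put the book on the table."),
    Structure("NP be ADJ", "The book is old."),
]

# Stand-in for the judgments on one system's output: here three of the
# four patterns are marked as handled.
judged_ok = {"NP V", "NP V NP", "NP be ADJ"}
coverage = grammatical_coverage(database, lambda s: s.pattern in judged_ok)
```

Keeping the judgment behind a callback lets the same routine serve either an automatic check against the parser output or manually recorded decisions, which is one way to read the "semi-automatic" framing.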