Quality-Sensitive Test Set Selection for a Speech Translation System

We propose a test set selection method to sensitively evaluate the performance of a speech translation system. The proposed method chooses the most sensitive test sentences by removing insensitive sentences iteratively. Experiments are conducted on the ATR-MATRIX speech translation system, developed at ATR Interpreting Telecommunications Research Laboratories. The results show the effectiveness of the proposed method. According to the results, the proposed method can reduce the test set size to less than 40% of the original size while improving evaluation reliability.