Impact of word error rate on driving performance while dictating short texts

This paper describes the impact of speech recognition word error rate (WER) on driver distraction in the context of short message dictation. A multi-modal dictation and error correction system was used in a simulated driving environment (Lane Change Test, LCT) to dictate text messages with prescribed semantic content. Driving accuracy was measured using several objective statistics produced by the LCT simulator. We report results for three datasets: 28 LCT trips by native US-English speakers at 40 km/h, 23 further trips at 60 km/h with noise added to artificially increase WER levels, and 22 LCT trips at 60 km/h performed by non-native accented speakers. For the two datasets recorded at 60 km/h, we observed a moderate correlation between the driver's WER and driving performance statistics such as the mean deviation from the ideal track (MDev) and the standard deviation of lateral position (SDLP). This correlation reached statistical significance for all of these statistics in the native dataset, and for the overall SDLP in the non-native dataset. Additionally, we observed that higher WER levels lead to significantly lower message throughput and to significantly lower quality of sent messages, especially for non-native speakers.
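For readers unfamiliar with the quantities named above, the following is a minimal sketch of how WER, MDev, SDLP, and their correlation could be computed. It is not the LCT simulator's implementation or the exact analysis used in this paper. The function names, the use of word-level edit distance for WER, the mean absolute deviation for MDev, the sample (n-1) standard deviation for SDLP, and the Pearson coefficient for correlation are all assumptions made for illustration.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER as word-level edit distance divided by reference length (assumed definition)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, substitution)
        prev = curr
    return prev[-1] / len(ref)


def mean_deviation(lateral, ideal):
    """MDev: mean absolute distance between driven and ideal lateral position (assumed)."""
    if len(lateral) != len(ideal) or not lateral:
        raise ValueError("tracks must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(lateral, ideal)) / len(lateral)


def sdlp(lateral):
    """SDLP: sample standard deviation of lateral position (n-1 denominator assumed)."""
    if len(lateral) < 2:
        raise ValueError("need at least two samples")
    mean = sum(lateral) / len(lateral)
    return math.sqrt(sum((x - mean) ** 2 for x in lateral) / (len(lateral) - 1))


def pearson_r(xs, ys):
    """Pearson correlation, e.g. between per-trip WER and per-trip SDLP."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

In such a setup, each LCT trip would contribute one WER value and one value per driving statistic, and `pearson_r` would be applied across the trips of a dataset. A significance test on the resulting coefficient is not shown here.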