Environmental sniffing: robust digit recognition for an in-vehicle environment

In this paper, we propose integrating the Environmental Sniffing framework [1] into an in-vehicle hands-free digit recognition task. Environmental Sniffing focuses on detecting, classifying, and tracking changing acoustic environments. Here, we extend the framework to detect and track acoustic environmental conditions in a noisy-speech audio stream. The knowledge extracted about these conditions is used to select the environment-dependent acoustic model for recognition. The Critical Performance Rate (CPR), previously considered in [1], is formulated and calculated for this task. The sniffing framework is compared, in terms of Word Error Rate (WER) and CPU usage, with a ROVER solution for automatic speech recognition (ASR) that combines different noise-conditioned recognizers. Results show that the model matching scheme driven by Environmental Sniffing outperforms the ROVER solution in both accuracy and computation: a relative 11.1% WER improvement is achieved with a relative 75% reduction in CPU resources.
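The model matching idea can be illustrated with a minimal sketch, assuming a sniffer that returns one environment label per audio segment and a table of environment-dependent recognizers. The names `select_and_decode`, `relative_reduction`, `sniff`, `models`, and `fallback` are hypothetical and do not come from the paper; the sketch shows only the selection logic and how a relative reduction is computed, not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a segment is a list of samples, a recognizer maps a
# segment to a digit string, and a sniffer maps a segment to an environment label.
Segment = List[float]
Recognizer = Callable[[Segment], str]
Sniffer = Callable[[Segment], str]


def select_and_decode(
    segment: Segment,
    sniff: Sniffer,
    models: Dict[str, Recognizer],
    fallback: str,
) -> Tuple[str, str]:
    """Decode with the single recognizer matched to the sniffed environment.

    Only one recognizer runs per segment, whereas a ROVER-style combination
    would run every recognizer and then merge their hypotheses.
    """
    env = sniff(segment)
    # Assumption: an unknown label falls back to a designated default model.
    recognizer = models.get(env, models[fallback])
    return env, recognizer(segment)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Relative reduction of a cost (WER, CPU time) with respect to a baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline
```

In this sketch the per-segment cost is one sniffing pass plus one decoding pass, which is the kind of saving the comparison against ROVER measures. Both WER and CPU figures would be plugged into `relative_reduction` with the ROVER system as the baseline.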