Issues with uncertainty decoding for noise robust speech recognition

Recently there has been interest in uncertainty decoding for robust speech recognition. Here the uncertainty associated with the observation in noise is propagated to the recogniser. By using appropriate approximations for this uncertainty, it is possible to obtain efficient implementations during decoding. The aim of these schemes is to obtain performance close to that of a model-based compensated system, without the computational cost. Unfortunately, at low SNR there is a fundamental issue with front-end uncertainty decoding, where the model means and variances are updated according to the features. This issue is described in detail using the Joint and SPLICE with uncertainty forms, but is not limited to these two techniques. A solution for the Joint scheme is presented, along with the implicit approach used in SPLICE with uncertainty. In addition, a model-based Joint uncertainty scheme is described, which is more efficient and powerful than the front-end schemes and, being model-based, is not affected by this problem. The issue is illustrated on the AURORA 2.0 database using these various systems.

Index Terms: model-based noise compensation, robust speech recognition, uncertainty decoding.