Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems

This paper deals with the construction and optimization of a hybrid speech recognition system that combines a neural vector quantizer (VQ) with discrete HMMs. We integrate VQ-based classification into the continuous classifier framework and derive constraints that the PDFs must satisfy in the discrete pattern classifier context. Furthermore, it is shown that for maximum likelihood (ML) training of the whole system the VQ parameters must be estimated according to the maximum mutual information (MMI) criterion. A novel training method based on gradient search is derived for neural networks that serve as the optimal VQ. This allows faster training of arbitrary network topologies compared to traditional MMI-NN training. Integrating multilayer MMI-NNs as the VQ in the hybrid discrete-HMM-based speech recognizer leads to a large improvement over other supervised and unsupervised single-layer VQ systems. For the speaker-independent Resource Management database, the constructed hybrid MMI-connectionist/HMM system achieves recognition rates comparable to those of traditional, sophisticated continuous-PDF HMM systems.
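
As a rough illustration of the quantity behind the MMI criterion, the sketch below estimates the mutual information between class labels and VQ labels from paired hard assignments. It is a toy example under stated assumptions, not the gradient-based training method of the paper: the function name mutual_information and the label sequences are hypothetical, and each frame is assumed to carry one class label and one VQ codebook index.

```python
import math
from collections import Counter


def mutual_information(class_labels, vq_labels):
    """Empirical mutual information I(W; Y) in bits.

    class_labels: one class label per frame (hypothetical input).
    vq_labels:    one VQ codebook index per frame (hypothetical input).
    """
    if len(class_labels) != len(vq_labels):
        raise ValueError("label sequences must have equal length")

    n = len(class_labels)
    joint_counts = Counter(zip(class_labels, vq_labels))
    class_counts = Counter(class_labels)
    vq_counts = Counter(vq_labels)

    mi = 0.0
    for (w, y), count in joint_counts.items():
        # p(w, y) * log2( p(w, y) / (p(w) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (class_counts[w] * vq_counts[y]))
    return mi


# A VQ whose labels separate two balanced classes perfectly carries 1 bit.
informative = mutual_information(["a", "a", "b", "b"], [0, 0, 1, 1])

# A VQ whose labels are independent of the classes carries 0 bits.
uninformative = mutual_information(["a", "a", "b", "b"], [0, 1, 0, 1])
```

A VQ trained toward the MMI criterion would, in this simplified picture, favour label assignments like the first case over the second.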