Free on-line speech recogniser based on Kaldi ASR toolkit producing word posterior lattices

This paper presents an extension of the Kaldi automatic speech recognition toolkit to support on-line recognition. The resulting recogniser supports acoustic models trained using state-of-the-art acoustic modelling techniques. Because the recogniser produces word posterior lattices, it is particularly useful in statistical dialogue systems, which try to exploit uncertainty in the recogniser’s output. Our experiments show that the on-line recogniser has significantly lower latency than a cloud-based recogniser.
