Books on tape as training data for continuous speech recognition

Abstract: Training algorithms for natural speech recognition require very large amounts of transcribed speech data. Commercially distributed books on tape are an abundant source of such data, but current training algorithms make it difficult to exploit, because they require the data to be hand-segmented into chunks small enough to be processed comfortably in memory. To address this problem, we have developed a training algorithm that handles unsegmented data files of arbitrary length; its computational requirements are linear in the amount of data to be processed, and its memory requirements are constant.
