Spoken book alignment using WFSTs

This paper is set within the framework of IPSOM, a national project whose main goal is to improve access to digitally stored spoken books, used primarily by the visually impaired community, by providing tools for easily detecting and indexing units (words, sentences, topics). The project also aims to broaden the use of multimedia spoken books, for instance in didactic applications, by providing multimedia interfaces for access and retrieval. Spoken book alignment is therefore a major task within the project.
