Paper information - Podcastle: a web 2.0 approach to speech recognition research

Podcastle: a web 2.0 approach to speech recognition research

In this paper, we describe a public web service, “PodCastle”, that provides full-text searching of Japanese podcasts on the basis of automatic speech recognition. This is an instance of our research approach, “Speech Recognition Research 2.0”, which is aimed at providing users with a web service based on Web 2.0 so that they can experience state-of-the-art speech recognition performance, and at promoting speech recognition technologies in cooperation with anonymous users. PodCastle enables users to find podcasts that include a search term, read full texts of their recognition results, and easily correct recognition errors. The results of the error correction can then be used to improve the performance of both full-text search and speech recognition. Although we know of no state-of-the-art speech recognizer that can successfully transcribe all of the various kinds of podcasts, the mechanism we propose will gradually increase the usefulness and applicability of PodCastle.

Masataka Goto | Jun Ogata | Kouichirou Eto
