Retrieval from spoken documents using content and speaker information

Speech and speaker recognition technologies are reaching maturity and have recently seen a surge in deployment. We describe the components required to build a system for indexing and retrieving spoken documents using both content-based and speaker-based information, obtained through speech recognition and speaker recognition respectively. The real power of spoken document analysis lies in using content and speaker information together, by combining the results of the two kinds of retrieval. The experiments described here are in the broadcast news domain, but the underlying techniques can readily be extended to other speech-centric applications and transactions.
