Use of transforms for indexing in audio databases

The rapid growth in the amount of audio data generated, processed, and used by computer applications has necessitated audio database systems with newer features, such as content-based queries and similarity searches, to manage and use such data. For such systems to be useful, retrieval for content-based queries must be fast and accurate. Efficient content-based indexing and similarity-searching schemes are key to providing fast and relevant data retrieval. This paper studies and evaluates the parameters of an indexing scheme that uses transforms (as used in signal processing) to generate the audio data indexes used in searching and retrieval. The performance of the indexing scheme under different parameter settings is compared.
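As a rough illustration of what a transform-based index can look like, the sketch below splits a signal into fixed-size blocks, applies a discrete Fourier transform to each block, and keeps the magnitudes of the first few coefficients as the index; similarity is then a Euclidean distance between indexes. This is a minimal sketch under stated assumptions, not the scheme evaluated in the paper: the choice of the DFT, the block size, the number of retained coefficients `k`, the distance measure, and all function names are hypothetical and chosen only for illustration.

```python
import cmath
import math


def dft_magnitudes(block, k):
    """Magnitudes of the first k DFT coefficients of one block (normalized by block length)."""
    n = len(block)
    mags = []
    for f in range(k):
        acc = sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(block))
        mags.append(abs(acc) / n)
    return mags


def build_index(samples, block_size=64, k=8):
    """Assumed index layout: one k-element vector per non-overlapping block."""
    index = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        index.append(dft_magnitudes(samples[start:start + block_size], k))
    return index


def index_distance(a, b):
    """Euclidean distance between two indexes (assumes they have the same shape)."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def nearest(query_index, database):
    """Name of the database entry whose index is closest to the query index."""
    return min(database, key=lambda name: index_distance(query_index, database[name]))


def tone(cycles_per_block, n_samples=128, block_size=64):
    """Synthetic test signal: a sine with a whole number of cycles per block."""
    return [math.sin(2 * math.pi * cycles_per_block * t / block_size) for t in range(n_samples)]


# Hypothetical two-entry database and a query that should match the first entry.
database = {"tone_2": build_index(tone(2)), "tone_5": build_index(tone(5))}
best_match = nearest(build_index(tone(2)), database)
```

In this sketch the block size and `k` stand in for the kind of parameters whose effect on retrieval performance would have to be evaluated; larger `k` keeps more spectral detail at the cost of a larger index.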
