Audio segment retrieval using a short duration example query

We propose a general approach to audio segment retrieval using a synthesized hidden Markov model (HMM). The approach lets a user query audio data with a short example audio segment and find similar segments. The basic idea is to first train a theme HMM on the given example and a general background HMM on all the audio data, and then combine these individual HMMs into a synthesized "background-theme-background" HMM. This synthesized HMM can then be applied to any audio stream as a parser to detect the most likely theme segment. Because a short example provides too little data to train a theme HMM reliably, we estimate the theme model with the maximum a posteriori (MAP) rule, using the background model as the prior. Evaluation of the proposed retrieval scheme, using short example audio clips of narration as queries, gives promising results.
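The two ideas in the abstract, MAP estimation with a background prior and parsing with a "background-theme-background" model, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one-dimensional features, a single Gaussian per state, and the standard conjugate MAP update for a Gaussian mean. The function names, the prior weight `tau`, and the self-transition probability `stay` are hypothetical choices made for illustration.

```python
import numpy as np

def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean, using the background mean as prior.

    tau is an assumed prior weight: with few samples the estimate stays
    near the background mean; with many it approaches the sample mean.
    """
    samples = np.asarray(samples, dtype=float)
    return (tau * prior_mean + samples.sum()) / (tau + samples.size)

def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def find_theme_segment(stream, bg_mean, bg_var, th_mean, th_var, stay=0.95):
    """Viterbi parse with a 3-state left-to-right chain:
    background (0) -> theme (1) -> background (2).

    Returns (start, end) frame indices of the theme segment, end exclusive.
    Requires at least 3 frames, since every state is visited at least once.
    """
    stream = np.asarray(stream, dtype=float)
    T = stream.size
    means = np.array([bg_mean, th_mean, bg_mean])
    variances = np.array([bg_var, th_var, bg_var])
    emit = log_gauss(stream[:, None], means[None, :], variances[None, :])
    log_stay, log_move = np.log(stay), np.log(1.0 - stay)

    score = np.full((T, 3), -np.inf)
    back = np.zeros((T, 3), dtype=int)
    score[0, 0] = emit[0, 0]  # the path must start in the first background state
    for t in range(1, T):
        for s in range(3):
            best_prev, best = s, score[t - 1, s] + log_stay
            if s > 0 and score[t - 1, s - 1] + log_move > best:
                best_prev, best = s - 1, score[t - 1, s - 1] + log_move
            score[t, s] = best + emit[t, s]
            back[t, s] = best_prev

    # Backtrack from the final background state.
    path = np.empty(T, dtype=int)
    path[-1] = 2
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    theme = np.flatnonzero(path == 1)
    return int(theme[0]), int(theme[-1]) + 1

if __name__ == "__main__":
    # Toy stream: background near 0, a theme burst near 5, background again.
    stream = np.concatenate([np.zeros(20), np.full(10, 5.0), np.zeros(20)])
    # A short query (5 frames) adapts the theme mean away from the background mean.
    theme_mean = map_adapt_mean(prior_mean=0.0, samples=[5.0] * 5, tau=1.0)
    print(find_theme_segment(stream, 0.0, 1.0, theme_mean, 1.0))
```

In this sketch the background model acts as the starting point for the theme model, so a query of only a few frames yields a usable, if conservative, estimate. A real system would use multidimensional acoustic features and richer state models than the single Gaussians assumed here.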
