Recent improvements of an auditory-model-based front-end for the transcription of vocal queries

In this paper, recent improvements to an existing acoustic front-end for the transcription of vocal (hummed, sung) musical queries are presented. Thanks to the addition of a second pitch extractor and the introduction of a novel multi-stage segmentation algorithm, the application domain of the front-end could be extended to whistled queries, and the performance on the other two query types could be improved as well. Experiments have shown that the new system can transcribe vocal queries with an accuracy ranging from 76 % (whistling) to 85 % (humming), and that it clearly outperforms other state-of-the-art systems on all three query types.