Paper information - Discrimination and retrieval of animal sounds

Discrimination and retrieval of animal sounds

Until recently, little research has been performed in the area of animal sound retrieval. The authors identify state-of-the-art techniques in general-purpose sound recognition through a broad survey of the literature. Based on these findings, this paper thoroughly investigates audio features and classifiers and their applicability to the domain of animal sounds. We introduce a set of novel audio descriptors and compare their quality to that of other popular features. The results are encouraging and motivate further research in this domain.
