Adaptive energy detection for bird sound detection in complex environments

This paper proposes a bird sound classification approach based on adaptive energy detection to improve recognition accuracy in noisy environments. Bird sounds with background noise are divided into three linear frequency bands according to their frequency distribution in the spectrogram. The noise spectrum of each band is estimated, and the presence probability of the foreground bird sound in each band is computed to set the adaptive threshold for energy detection. The foreground bird sound signals are then detected and selected from the noisy recordings via adaptive energy detection. Next, two feature sets, the Mel-scaled Wavelet packet decomposition Sub-band Cepstral Coefficient (MWSCC) and the Mel-Frequency Cepstral Coefficient (MFCC), are extracted from the selected signals, and each is classified with a Support Vector Machine (SVM). Recognition performance is compared on 30 kinds of bird sounds at different Signal-to-Noise Ratios (SNRs) under different noisy environments, before and after adaptive energy detection. The results show that MWSCC has better noise immunity than MFCC, and that recognition performance improves markedly after adaptive energy detection, indicating that the approach is well suited to bird sound recognition in complex environments.
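The detection stage described above can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the function names `band_energies` and `adaptive_energy_detect`, the frame sizes, the exponential presence-probability mapping, the smoothing constants `alpha` and `beta`, and the threshold rule are hypothetical and are not the authors' method. The sketch only shows the general idea of splitting the spectrum into three linear bands, tracking a noise estimate per band, and letting a presence probability adjust the energy threshold.

```python
import numpy as np


def band_energies(x, frame_len=256, hop=128, n_bands=3):
    """Per-frame energy in n_bands equal-width (linear) frequency bands."""
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bins = np.array_split(np.arange(power.shape[1]), n_bands)
    return np.stack([power[:, b].sum(axis=1) for b in bins], axis=1)


def adaptive_energy_detect(x, noise_frames=10, alpha=0.95, beta=0.8,
                           base_factor=3.0):
    """Return a boolean (n_frames, n_bands) mask of detected foreground sound.

    Assumptions (not from the paper): the first `noise_frames` frames are
    noise only, and presence probability is a simple function of the
    per-band energy-to-noise ratio.
    """
    energy = band_energies(x)
    # Initial per-band noise estimate from the assumed noise-only frames.
    noise = np.maximum(energy[:noise_frames].mean(axis=0), 1e-12)
    presence = np.zeros(energy.shape[1])  # smoothed presence probability
    detected = np.zeros(energy.shape, dtype=bool)
    for t in range(energy.shape[0]):
        # Threshold drops (down to half) where sound was recently likely.
        threshold = base_factor * noise * (1.0 - 0.5 * presence)
        detected[t] = energy[t] > threshold
        # Hypothetical instantaneous presence probability in [0, 1).
        ratio = energy[t] / noise
        p_inst = 1.0 - np.exp(-np.maximum(ratio - 1.0, 0.0))
        presence = beta * presence + (1.0 - beta) * p_inst
        # Noise tracking slows down when foreground sound is likely present.
        alpha_eff = alpha + (1.0 - alpha) * p_inst
        noise = alpha_eff * noise + (1.0 - alpha_eff) * energy[t]
    return detected


# Synthetic check: white noise with a 3 kHz tone burst from 0.4 s to 0.6 s.
rng = np.random.default_rng(0)
fs = 16000
x = 0.01 * rng.standard_normal(fs)
n = np.arange(6400, 9600)
x[6400:9600] += 0.5 * np.sin(2 * np.pi * 3000 * n / fs)
mask = adaptive_energy_detect(x)
print(mask[:, 1].sum(), "frames flagged in the middle band")
```

With a 16 kHz sampling rate and three equal-width bands, the 3 kHz tone falls in the middle band, so that is the column of `mask` where detections are expected. Frames selected this way would then be passed on to feature extraction; the MWSCC and MFCC stages and the SVM classifier are outside the scope of this sketch.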
