Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection

Bird sound detection from real-field recordings is essential for identifying bird species in bioacoustic monitoring. Variations in recording devices and environmental conditions, together with vocalizations from other animals, make bird sound detection very challenging. To overcome these challenges, we propose an unsupervised algorithm comprising two main stages. In the first stage, we enhance the spectrogram using a multiple window Savitzky–Golay (MWSG) filter. We show that the spectrogram estimate obtained with the MWSG filter is unbiased and has lower variance than its single window counterpart. Bird sounds are known to be highly structured in the time–frequency (T–F) plane. In the second stage, we exploit this structure, namely the prominence of T–F activity in specific directions in the enhanced spectrogram, for bird sound detection. To this end, we use a set of four moving average filters that, when applied to the enhanced spectrogram, yield directional spectrograms capturing direction-specific information. We propose a thresholding scheme on the time-varying energy profile computed from each of these directional spectrograms to obtain frame-level binary decisions of bird sound activity. These individual decisions are then combined to obtain the final decision. Experiments are performed on three different datasets with varying recording and noise conditions. The frame-level F-score is used as the evaluation metric for bird sound detection. On average, the proposed method achieves a higher F-score ($10.24\%$ relative improvement) than the best of the six baseline schemes considered in this work.
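
The two-stage pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the multiple windows are combined, how long the directional filters are, what the threshold is, or how the four decisions are fused. The sketch therefore assumes (hypothetically) that Savitzky–Golay outputs for several window lengths are averaged along time, that the four directions are horizontal, vertical, and the two diagonals, that the threshold is the median plus a multiple of the median absolute deviation, and that decisions are combined with a logical OR. All function and parameter names are illustrative.

```python
import numpy as np

def sg_kernel(window_length, polyorder):
    """Savitzky-Golay smoothing kernel from a least-squares polynomial fit."""
    half = window_length // 2
    t = np.arange(-half, half + 1, dtype=float)
    A = np.vander(t, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps window samples to the fitted value at t = 0.
    return np.linalg.pinv(A)[0]

def mwsg_smooth(spec, window_lengths=(5, 9, 13), polyorder=2):
    """Assumed MWSG rule: average SG-smoothed copies of each frequency row."""
    out = np.zeros_like(spec, dtype=float)
    for w in window_lengths:
        k = sg_kernel(w, polyorder)
        pad = w // 2
        padded = np.pad(spec, ((0, 0), (pad, pad)), mode="edge")
        out += np.apply_along_axis(
            lambda row: np.convolve(row, k, mode="valid"), 1, padded)
    return out / len(window_lengths)

# Assumed directions as (frequency step, time step) pairs.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "anti_diagonal": (1, -1),
}

def directional_average(spec, step, half=2):
    """Moving average over 2 * half + 1 samples along one direction."""
    n_freq, n_time = spec.shape
    padded = np.pad(spec, half, mode="edge")
    acc = np.zeros((n_freq, n_time))
    for i in range(-half, half + 1):
        r0 = half + i * step[0]
        c0 = half + i * step[1]
        acc += padded[r0:r0 + n_freq, c0:c0 + n_time]
    return acc / (2 * half + 1)

def detect_frames(spec, k=3.0):
    """Frame-level decisions: threshold each directional energy profile, then OR."""
    enhanced = mwsg_smooth(spec)
    decisions = []
    for step in DIRECTIONS.values():
        directional = directional_average(enhanced, step)
        energy = (directional ** 2).sum(axis=0)   # energy per time frame
        med = np.median(energy)
        mad = np.median(np.abs(energy - med))
        decisions.append(energy > med + k * mad)  # assumed threshold rule
    return np.any(decisions, axis=0)              # assumed fusion rule
```

Here `spec` is expected to be a magnitude spectrogram with shape (frequency bins, time frames), and `detect_frames(spec)` returns one Boolean per frame. Swapping the OR for a majority vote, or the median-based threshold for a learned one, only changes the last few lines.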
