Feature learning for bird-call segmentation using phase based features

In this paper, we extend an existing algorithm for segmenting bird calls from the background. By using features that carry information about both the magnitude and the phase of the Fourier transform, we demonstrate an improvement in segmentation performance over magnitude-only features. The proposed method uses a dictionary learnt from the time-frequency representation. The coefficients obtained by projecting a recording onto this dictionary are used to estimate the Rényi entropy between bird vocalization and background. The proposed method obtains an improvement of 25 percent over similar features derived from the conventional magnitude-based spectrogram.
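To make the entropy step concrete, the following is a minimal sketch and not the paper's implementation. It assumes the projection coefficients of each frame are made non-negative and normalised to a probability distribution, and it computes the standard Rényi entropy of order alpha, H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha). The function names, the choice alpha = 2, and the `dictionary` array are hypothetical; the abstract does not specify how the dictionary is learnt or how the entropy is estimated.

```python
import numpy as np

def renyi_entropy(p, alpha=2.0):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in nats."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()  # normalise to a probability distribution
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def frame_entropies(frames, dictionary, alpha=2.0):
    """Project each frame onto the dictionary atoms; return per-frame entropy.

    frames:     (n_frames, n_bins) time-frequency features
    dictionary: (n_atoms, n_bins) atoms (hypothetical; learnt elsewhere)
    """
    # Absolute projection coefficients; the small offset avoids all-zero rows.
    coeffs = np.abs(frames @ dictionary.T) + 1e-12
    return np.array([renyi_entropy(c, alpha) for c in coeffs])
```

Under these assumptions, a frame whose energy concentrates on a few atoms gives a low entropy, while a frame spread evenly across atoms gives a value close to the log of the number of atoms. Thresholding or comparing such per-frame values is one possible way to separate vocalization from background.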
