Rapid bird activity detection using probabilistic sequence kernels

Bird activity detection is the task of determining whether a bird sound is present in a given audio recording. This paper describes a bird activity detector that uses a support vector machine (SVM) with a dynamic kernel. Dynamic kernels allow an SVM to process sets of feature vectors of differing cardinality. The probabilistic sequence kernel (PSK) is one such dynamic kernel: it converts the set of feature vectors from a recording into a fixed-length vector. We propose a variant of the PSK in this work. Before the fixed-length vector is computed, cepstral mean and variance normalisation and short-time Gaussianization are applied to the feature vectors, which reduces environment mismatch between recordings. We also demonstrate a simple procedure that speeds up the proposed method by reducing the size of the fixed-length vector. A speedup of almost 70% is observed, with a very small drop in accuracy. The proposed method is also compared with a random forest classifier and is shown to outperform it.
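
The two ideas named in the abstract, per-recording normalisation and the mapping from a variable-length set of frames to a fixed-length vector, can be illustrated with a minimal Python sketch. It is not the variant proposed in the paper. It assumes a diagonal-covariance Gaussian mixture whose parameters (`weights`, `means`, `variances`) are hypothetical and supplied by hand, it averages per-frame component posteriors as one plausible way to build the fixed-length vector, and it omits short-time Gaussianization entirely. The function names are likewise illustrative.

```python
import math


def cmvn(frames):
    """Cepstral mean and variance normalisation over one recording.

    frames: list of equal-length feature vectors (lists of floats).
    Each dimension is shifted to zero mean and scaled to unit variance.
    """
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        stds.append(math.sqrt(var) or 1.0)  # constant dimension: avoid dividing by zero
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in frames]


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def mean_posterior_vector(frames, weights, means, variances):
    """Map any number of frames to a vector with one entry per mixture component.

    Entry j is the posterior probability of component j averaged over all
    frames, so the output length depends only on the number of components.
    """
    k = len(weights)
    acc = [0.0] * k
    for x in frames:
        logs = [
            math.log(weights[j]) + log_gauss(x, means[j], variances[j])
            for j in range(k)
        ]
        top = max(logs)  # subtract the maximum for numerical stability
        probs = [math.exp(value - top) for value in logs]
        total = sum(probs)
        for j in range(k):
            acc[j] += probs[j] / total
    return [a / len(frames) for a in acc]


# Hypothetical two-component mixture over two-dimensional features.
weights = [0.5, 0.5]
means = [[-1.0, -1.0], [1.0, 1.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Two toy "recordings" with different numbers of frames.
short_rec = cmvn([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
long_rec = cmvn([[0.1, 0.2], [0.4, 0.1], [0.9, 1.3], [1.5, 1.1], [2.0, 2.2]])

vec_short = mean_posterior_vector(short_rec, weights, means, variances)
vec_long = mean_posterior_vector(long_rec, weights, means, variances)
print(len(vec_short), len(vec_long))
```

Both recordings yield vectors of the same length, one entry per mixture component, which is what lets a standard SVM operate on recordings of different durations. Under this sketch, reducing the number of components shortens the vector, which is one way to read the speed-up described in the abstract. The paper's actual reduction procedure may differ.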
