Bird Sound Recognition Using a Convolutional Neural Network

Convolutional neural networks (CNNs) are powerful machine learning tools that have proven effective in image processing and sound recognition. In this paper, a CNN system for classifying bird sounds is presented and tested with different configurations and hyperparameters. The pre-trained MobileNet CNN model is fine-tuned using a dataset acquired from the Xeno-canto bird song sharing portal, which provides a large collection of labeled and categorized recordings. Spectrograms generated from the downloaded recordings serve as the input of the neural network. The experiments compare various configurations, including the number of classes (bird species) and the color scheme of the spectrograms. Results suggest that choosing a color map in line with the images the network was pre-trained on provides a measurable advantage. The presented system is viable only for a low number of classes.
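As an illustration of the input pipeline described above, the following minimal sketch turns an audio signal into a spectrogram image and renders it either in grayscale or with a color map. It is not the authors' implementation: the frame length, hop size, log scaling, and the helper names `log_spectrogram` and `to_rgb` are assumptions chosen for the example, and the color anchors are hypothetical.

```python
import numpy as np


def log_spectrogram(signal, frame_len=512, hop=256):
    """Log-magnitude spectrogram scaled to [0, 1], shape (freq_bins, frames).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Short-time Fourier transform magnitude, transposed to (freq, time).
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    log_mag = np.log1p(magnitude)
    span = log_mag.max() - log_mag.min()
    return (log_mag - log_mag.min()) / (span if span > 0 else 1.0)


def to_rgb(spec, anchors=None):
    """Map values in [0, 1] to a uint8 RGB image.

    Colors are linearly interpolated between evenly spaced anchor colors.
    With anchors=None the result is grayscale (black to white).
    """
    if anchors is None:
        anchors = [(0, 0, 0), (255, 255, 255)]
    anchors = np.asarray(anchors, dtype=float)
    positions = np.linspace(0.0, 1.0, len(anchors))
    flat = spec.ravel()
    channels = [np.interp(flat, positions, anchors[:, c]) for c in range(3)]
    rgb = np.stack(channels, axis=-1).reshape(spec.shape + (3,))
    return rgb.round().astype(np.uint8)
```

A recording would be passed through `log_spectrogram` and then through `to_rgb`, once without anchors and once with a list of anchor colors, to produce the two image variants whose effect on the fine-tuned network can then be compared.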
