Comparison of machine learning methods applied to birdsong element classification

Songbirds provide neuroscience with a model system for understanding how the brain learns and produces a motor skill similar to speech. Much like humans, songbirds learn their vocalizations from social interactions during a critical period in development. Each bird’s song consists of repeated elements referred to as “syllables”. To analyze song, scientists label syllables by hand, but a bird can produce hundreds of songs a day, far more than can be labeled manually. Several groups have applied machine learning algorithms to automate labeling of syllables, but little work has been done comparing these algorithms. For example, there are articles that propose using support vector machines (SVM), k-nearest neighbors (k-NN), and even deep learning to automate labeling of song of the Bengalese Finch (a species whose behavior has made it the subject of an increasing number of neuroscience studies). This paper compares algorithms for classifying Bengalese Finch syllables (building on previous work [https://youtu.be/ghgniK4X_Js]). Using a standard cross-validation approach, classifiers were trained on syllables from a given bird, and classifier accuracy was then measured on large hand-labeled testing datasets for that bird. The results suggest that both k-NN and SVM with a non-linear kernel achieve higher accuracy than a previously published linear SVM method. Experiments also demonstrate that the accuracy of linear SVM is impaired by "intro syllables", low-amplitude, high-noise syllables found in all Bengalese Finch songs. Testing of machine learning algorithms was carried out using scikit-learn and NumPy/SciPy via Anaconda. Figures from this paper in Jupyter notebook form, as well as code and links to data, are available at https://github.com/NickleDave/ML-comparison-birdsong
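
The comparison described above can be illustrated with a minimal scikit-learn sketch. It is not the code from the paper's repository: the feature matrix is synthetic (generated with `make_classification` as a stand-in for acoustic features extracted from syllables), and the function name `compare_classifiers`, the training-set size, the number of replicates, and all hyperparameters are assumptions chosen only for illustration. The sketch trains a linear SVM, an SVM with a radial basis function kernel, and k-NN on random training subsets and scores each on the held-out samples.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC


def compare_classifiers(X, y, n_train=200, n_replicates=5, seed=0):
    """Return {name: (mean accuracy, std)} over random train/test splits.

    Hypothetical helper: hyperparameters are illustrative defaults,
    not the values used in the paper.
    """
    factories = {
        "linear SVM": lambda: LinearSVC(C=1.0, max_iter=10000),
        "SVM (RBF kernel)": lambda: SVC(kernel="rbf", C=1.0, gamma="scale"),
        "k-NN": lambda: KNeighborsClassifier(n_neighbors=5),
    }
    scores = {name: [] for name in factories}
    for rep in range(n_replicates):
        # Draw a stratified training subset; everything else is the test set.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, train_size=n_train, stratify=y, random_state=seed + rep
        )
        for name, make_clf in factories.items():
            # Standardize features using statistics from the training set only.
            model = make_pipeline(StandardScaler(), make_clf())
            model.fit(X_train, y_train)
            scores[name].append(model.score(X_test, y_test))
    return {name: (float(np.mean(s)), float(np.std(s))) for name, s in scores.items()}


# Synthetic stand-in for syllable features: 8 "syllable classes", 20 features.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10,
    n_classes=8, n_clusters_per_class=1, random_state=0,
)
results = compare_classifiers(X, y)
for name, (mean_acc, std_acc) in results.items():
    print(f"{name}: {mean_acc:.3f} +/- {std_acc:.3f}")
```

With real data, `X` would hold one row of features per hand-labeled syllable and `y` the corresponding labels for a single bird. Repeating the split several times gives an estimate of how much accuracy varies with the choice of training samples.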
