Bird species identification via transfer learning from music genres

Abstract: Humans can apply previously acquired knowledge to novel problems quite efficiently. Transfer learning is inspired by exactly that ability and has been proposed for cases where the available data come from different feature spaces and/or distributions. This paper proposes transferring knowledge from music genre classification to bird species identification, motivated by the acoustic similarities between the two domains. We propose a transfer learning framework that exploits the probability density distributions of ten music genres to obtain a degree of affinity between each bird species and each music genre. To this end, we employ a feature space transformation based on Echo State Networks. The results reveal a consistent average improvement of 11.2% in identification accuracy on ten European bird species.
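To make the idea of an Echo State Network feature space transformation more concrete, the following is a minimal sketch and not the configuration used in the paper. It assumes a fixed random reservoir with a tanh nonlinearity and a linear readout trained by ridge regression; the function names, reservoir size, spectral radius, and the synthetic input data are all hypothetical choices made for illustration.

```python
import numpy as np


def make_reservoir(n_inputs, n_reservoir=50, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def reservoir_states(inputs, w_in, w):
    """Drive the reservoir with a (frames x features) sequence, return all states."""
    states = np.zeros((inputs.shape[0], w.shape[0]))
    x = np.zeros(w.shape[0])
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states


def fit_readout(states, targets, ridge=1e-6):
    """Ridge-regression readout mapping reservoir states to target features."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(a, states.T @ targets)


# Synthetic placeholder data: 200 frames of 13-dimensional source features
# and 13-dimensional target features (assumed dimensions, not from the paper).
rng = np.random.default_rng(42)
source_frames = rng.normal(size=(200, 13))
target_frames = rng.normal(size=(200, 13))

w_in, w = make_reservoir(n_inputs=13)
states = reservoir_states(source_frames, w_in, w)
w_out = fit_readout(states, target_frames)
transformed = states @ w_out  # source features mapped into the target space
```

Only the readout weights are trained here, while the reservoir stays fixed, which is the usual appeal of reservoir computing. How the paper pairs bird and music frames for training is not specified in the abstract, so the random targets above are purely a stand-in.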
