Automatic acoustic detection of birds through deep learning: The first Bird Audio Detection challenge

Assessing the presence and abundance of birds is important for monitoring specific species as well as overall ecosystem health. Many birds are most readily detected by their sounds, and thus passive acoustic monitoring is highly appropriate. Yet acoustic monitoring is often held back by practical limitations such as the need for manual configuration, reliance on example sound libraries, low accuracy, low robustness, and limited ability to generalise to novel acoustic conditions. Here we report outcomes from a collaborative data challenge. We present new acoustic monitoring datasets, summarise the machine learning techniques proposed by challenge teams, conduct detailed performance evaluation, and discuss how such approaches to detection can be integrated into remote monitoring projects. Multiple methods were able to attain performance of around 88% AUC (area under the ROC curve), much higher performance than previous general‐purpose methods. With modern machine learning including deep learning, general‐purpose acoustic bird detection can achieve very high retrieval rates in remote monitoring data with no manual recalibration, and no pre‐training of the detector for the target species or the acoustic conditions in the target environment.
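For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how AUC can be computed for a binary bird/no-bird detector. It is not the challenge's evaluation code. The function name `roc_auc` and the example labels and scores are hypothetical, chosen only for illustration. The sketch uses the rank interpretation of AUC: the probability that a randomly chosen clip containing a bird receives a higher score than a randomly chosen clip without one.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth values (1 = bird present).
    scores: iterable of detector outputs, higher meaning "more likely bird".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: six clips, three with birds and three without.
example_labels = [1, 0, 1, 1, 0, 0]
example_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
print(f"AUC = {roc_auc(example_labels, example_scores):.3f}")
```

In this toy example, eight of the nine positive/negative pairs are ranked correctly. Because AUC depends only on the ranking of scores, it does not require choosing a detection threshold, which suits comparing detectors across recordings with different acoustic conditions. The quadratic pairwise loop is written for clarity and would be replaced by a sort-based computation on large datasets.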
