HARK-Bird-Box: A Portable Real-time Bird Song Scene Analysis System

This paper addresses real-time bird song scene analysis. Observation of animal behavior, such as communication among wild birds, would be aided by a portable device that localizes sound sources, measures their timing, classifies them, and visualizes the results in real time. The difficulty of building such a system lies in integrating these functions under the real-time requirement. To realize such a system, we propose a cascaded approach that chains sound source detection, localization, separation, feature extraction, classification, and visualization for bird song analysis. Our system combines HARK, open-source software for robot audition, with a deep learning library used to implement a bird song classifier based on a convolutional neural network (CNN). For portability, we implemented the system on a single-board computer, the Jetson TX2, with a microphone array, and developed a prototype device for bird song scene analysis. A preliminary experiment confirmed that the computational time of the whole system is short enough for real-time operation. An additional experiment with a bird song dataset revealed a trade-off between classification accuracy and processing time, and demonstrated the effectiveness of our classifier.
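
As an illustration of the cascaded structure and the real-time criterion described above, the following minimal sketch chains six placeholder stages and computes a real-time factor (processing time divided by the duration of the audio block). It is not the HARK or CNN implementation: the stage bodies, the names `run_cascade`, `real_time_factor`, and `STAGES`, and the 0.5 s block length are all hypothetical and chosen only for illustration.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; each function maps a scene dict to a scene dict.
Stage = Tuple[str, Callable[[Dict[str, Any]], Dict[str, Any]]]


def _tag(key: str, value: Any) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Build a placeholder stage that only records that it ran (hypothetical)."""
    def stage(scene: Dict[str, Any]) -> Dict[str, Any]:
        return {**scene, key: value}
    return stage


# Stage order follows the cascade in the text; the bodies are stand-ins,
# not the actual detection, localization, or classification algorithms.
STAGES: List[Stage] = [
    ("detection", _tag("detected", True)),
    ("localization", _tag("azimuth_deg", 0.0)),
    ("separation", _tag("separated", True)),
    ("feature_extraction", _tag("features", [])),
    ("classification", _tag("label", "unknown")),
    ("visualization", _tag("drawn", True)),
]


def run_cascade(stages, block, clock=time.perf_counter):
    """Feed each stage the previous stage's output and time every stage."""
    timings: Dict[str, float] = {}
    data = block
    for name, fn in stages:
        start = clock()
        data = fn(data)
        timings[name] = clock() - start
    return data, timings


def real_time_factor(timings: Dict[str, float], block_seconds: float) -> float:
    """Total processing time over audio duration; below 1.0 keeps up with input."""
    return sum(timings.values()) / block_seconds


if __name__ == "__main__":
    scene, timings = run_cascade(STAGES, {"samples": [0.0] * 8})
    # Assumed block length of 0.5 s of audio (hypothetical value).
    print(timings, real_time_factor(timings, block_seconds=0.5))
```

Timing each stage separately, rather than only the whole cascade, makes it possible to see which stage dominates when accuracy is traded against processing time.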
