Multi-Instance Multi-Label Class Discovery: A Computational Approach for Assessing Bird Biodiversity

We study the problem of analyzing a large volume of bio-acoustic data collected in situ, with the goal of assessing the biodiversity of bird species at the data collection site. We are interested in the class discovery problem for this setting. Specifically, given a large collection of audio recordings containing bird and other sounds, we aim to automatically select a fixed-size subset of the recordings for human expert labeling such that the number of species/classes discovered is maximized. We employ a multi-instance multi-label representation to handle multiple simultaneously vocalizing birds whose sounds overlap in time, and we propose new algorithms for species/class discovery using this representation. In a comparative study, we show that the proposed methods discover more species/classes than current state-of-the-art methods on a real-world dataset of 92,095 ten-second recordings collected in field conditions.
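
To make the problem setting concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It assumes each recording is a "bag" of instance feature vectors with a set of species labels, and it uses a generic farthest-first traversal with a symmetric Hausdorff distance between bags as a stand-in selection strategy. The function names (`bag_distance`, `farthest_first`, `classes_discovered`), the choice of distance, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def bag_distance(a, b):
    """Symmetric Hausdorff distance between two bags of shape (n_i, d).
    Assumption: this distance is illustrative, not taken from the paper."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def farthest_first(bags, budget, start=0):
    """Greedily pick up to `budget` bag indices, each one farthest
    from the bags already selected (a generic baseline strategy)."""
    if budget <= 0 or len(bags) == 0:
        return []
    selected = [start]
    # Distance from every bag to its nearest selected bag.
    min_dist = np.array([bag_distance(bags[start], b) for b in bags])
    while len(selected) < min(budget, len(bags)):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        new_dist = np.array([bag_distance(bags[nxt], b) for b in bags])
        min_dist = np.minimum(min_dist, new_dist)
    return selected

def classes_discovered(selected, label_sets):
    """Union of the label sets an expert would reveal for the selected bags."""
    return set().union(*(label_sets[i] for i in selected))

# Hypothetical toy data: four recordings, each a bag of 2-D instance features.
bags = [
    np.array([[0.0, 0.0], [0.1, 0.0]]),    # species A only
    np.array([[0.0, 0.1]]),                # species A only
    np.array([[0.0, 0.0], [10.0, 10.0]]),  # species A and B overlapping
    np.array([[20.0, 0.0]]),               # species C only
]
label_sets = [{"A"}, {"A"}, {"A", "B"}, {"C"}]

chosen = farthest_first(bags, budget=3)
found = classes_discovered(chosen, label_sets)
```

With a labeling budget of three recordings, the quantity to maximize is the size of `found`; a selection that spent the budget on the two near-duplicate recordings of species A would reveal fewer classes than one that spreads across the feature space.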
