Detection of ground parrot vocalisation: A multiple instance learning approach.

Ground parrot vocalisation can be treated as an audio event. Test-based diverse density multiple instance learning (TB-DD-MIL) is proposed for detecting this event in audio files recorded in the field. The method is motivated by the advantages that multiple instance learning offers when training data are incomplete. Spectral features suitable for encoding the vocal source information of ground parrot vocalisation are also investigated. The method was evaluated on a dataset collected under various environmental conditions, and an audio detection evaluation scheme is also proposed. The evaluation covers the performance of the various vocal source features and a comparison with other classification techniques. Experimental results indicated that spectral bandwidth is the most appropriate feature for encoding ground parrot calls and that the proposed TB-DD-MIL method outperformed the other classification methods compared.
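
As an illustration of the diverse density idea that the method builds on, the following is a minimal sketch of the standard noisy-or diverse density score, not the TB-DD-MIL variant proposed in the paper. It assumes each recording is a "bag" of feature vectors labelled only at the bag level, and that the probability of an instance matching a candidate target point decays with squared Euclidean distance. The function names, the `scale` parameter, and the toy two-dimensional data are hypothetical and chosen only for illustration.

```python
import numpy as np

def instance_prob(target, instances, scale=1.0):
    """Probability that each instance matches the target concept,
    modelled (as an assumption) as exp(-scale * squared distance)."""
    diffs = np.asarray(instances, dtype=float) - np.asarray(target, dtype=float)
    return np.exp(-scale * np.sum(diffs ** 2, axis=1))

def bag_prob(target, bag, scale=1.0):
    """Noisy-or: a bag is positive if at least one instance matches."""
    return 1.0 - np.prod(1.0 - instance_prob(target, bag, scale))

def diverse_density(target, positive_bags, negative_bags, scale=1.0):
    """Score is high when the target is close to some instance in every
    positive bag and far from all instances in every negative bag."""
    dd = 1.0
    for bag in positive_bags:
        dd *= bag_prob(target, bag, scale)
    for bag in negative_bags:
        dd *= 1.0 - bag_prob(target, bag, scale)
    return dd

# Toy data (hypothetical): both positive bags contain an instance near (1, 1);
# the negative bag contains only background instances.
positive_bags = [
    np.array([[1.0, 1.0], [5.0, 5.0]]),
    np.array([[1.1, 0.9], [-3.0, 2.0]]),
]
negative_bags = [np.array([[5.0, 5.0], [-3.0, 2.0]])]

dd_call = diverse_density([1.0, 1.0], positive_bags, negative_bags)
dd_background = diverse_density([5.0, 5.0], positive_bags, negative_bags)
print(dd_call > dd_background)
```

In this sketch the point near (1, 1) scores highly because it is shared by all positive bags and absent from the negative bag, even though no individual instance was ever labelled. This is the sense in which multiple instance learning tolerates incomplete training labels.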
