A call-independent and automatic acoustic system for the individual recognition of animals: A novel model using four passerines

Research on acoustic recognition systems for animals has focused largely on call-dependent recognition and species identification rather than on call-independent recognition of individuals. Here we present a system for automatic, call-independent individual recognition that uses mel-frequency cepstral coefficients (MFCCs) as acoustic features and Gaussian mixture models (GMMs) as classifiers, applied across four passerine species. To our knowledge, this is the first application of these techniques to the individual recognition of birds. The system achieved accuracies of 89.1–92.5%, which suggests that this combination of acoustic feature and classifier has strong potential for individual animal recognition and could readily be applied to other species.
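
To make the general approach concrete, the following is a minimal sketch of an MFCC-plus-GMM identification pipeline. It is not the authors' implementation. Everything specific in it is an assumption chosen for illustration: the frame length, the 26 mel filters, the 12 cepstral coefficients, the two-component diagonal-covariance mixtures, the helper names, and the synthetic two-tone "calls" that stand in for real recordings. One GMM is fitted per individual on that individual's MFCC frames, and an unknown recording is assigned to the individual whose model gives the highest average log-likelihood, regardless of which call type it contains.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr, n_fft=512, hop=256, n_mels=26, n_ceps=12):
    """Return an (n_frames, n_ceps) array of MFCCs for a 1-D signal."""
    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    power = np.abs(np.fft.rfft(signal[idx] * np.hamming(n_fft), axis=1)) ** 2
    # Triangular filters spaced evenly on the mel scale.
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    lo, mid, hi = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    fbank = np.clip(np.minimum((freqs - lo) / (mid - lo),
                               (hi - freqs) / (hi - mid)), 0.0, None)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log filterbank energies; coefficient 0 is dropped.
    k = np.arange(1, n_ceps + 1)[:, None]
    n = np.arange(n_mels)[None, :]
    return log_energy @ np.cos(np.pi * k * (2 * n + 1) / (2.0 * n_mels)).T

def logsumexp(a, axis):
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def log_gauss(X, means, variances):
    """Log density of each row of X under each diagonal Gaussian: shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def fit_gmm(X, k=2, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with plain EM; returns (weights, means, variances)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        log_p = np.log(weights) + log_gauss(X, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means and (floored) variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    weights, means, variances = gmm
    return float(np.mean(logsumexp(np.log(weights) + log_gauss(X, means, variances), axis=1)))

def identify(X, models):
    """Return the name of the model with the highest average log-likelihood."""
    return max(models, key=lambda name: avg_log_likelihood(X, models[name]))

def toy_call(freqs, sr, seconds, seed):
    """Hypothetical stand-in for a recording: a few tones plus white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sr * seconds)) / sr
    return sum(np.sin(2 * np.pi * f * t) for f in freqs) + 0.1 * rng.standard_normal(t.size)

sr = 22050
models = {
    "bird_A": fit_gmm(mfcc(toy_call([2000, 4000], sr, 1.0, seed=0), sr)),
    "bird_B": fit_gmm(mfcc(toy_call([3000, 6500], sr, 1.0, seed=1), sr)),
}
unknown = mfcc(toy_call([3000, 6500], sr, 1.0, seed=7), sr)
print(identify(unknown, models))
```

With real recordings, the synthetic `toy_call` signals would be replaced by segmented vocalizations, and the number of mixture components and cepstral coefficients would need to be tuned per species; the values above are placeholders rather than recommendations.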
