Automatic recognition of frog calls using a multi-stage average spectrum

Automatic recognition of animal sounds is a powerful alternative to traditional ecological surveys, which depend mainly on manual labor and are therefore both costly and time consuming. This study develops an automatic frog call recognition system that combines pre-classification by syllable length with a multi-stage average spectrum (MSAS) method. In this system, each input frog syllable is first assigned to one of four groups according to its length. The proposed MSAS method is then used to extract a standard feature template that captures the time-varying features of each frog species, and the input syllable is recognized by template matching. In all, 960 syllables recorded from 18 frog species are used to evaluate the accuracy of the proposed system. The experimental results show that the proposed one-level method (MSAS only) and two-level method (syllable length pre-classification combined with MSAS) achieve recognition accuracies of 91.9% and 94.3%, respectively, higher than those of the compared methods based on dynamic time warping (DTW), spectral ensemble average voice prints (SEAV), k-nearest neighbor (kNN) and support vector machines (SVMs).
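The two-level pipeline described above can be illustrated with a minimal sketch. The abstract does not give the details of the MSAS computation, so everything below is an assumption made for illustration: a syllable is taken to be a list of magnitude-spectrum frames, a "stage" is taken to be one of several equal-length time segments whose frames are averaged, the four length groups use hypothetical frame-count boundaries, and matching uses Euclidean distance to per-species mean templates. The function names are likewise hypothetical and are not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[float]  # one magnitude spectrum (a single time frame)


def multi_stage_average(frames: Sequence[Frame], n_stages: int = 3) -> List[float]:
    """Average the spectrum within each of n_stages consecutive time segments.

    Assumption: stages are equal-length time segments; the per-stage average
    spectra are concatenated into one feature vector.
    """
    if len(frames) < n_stages:
        raise ValueError("need at least one frame per stage")
    n_bins = len(frames[0])
    feature: List[float] = []
    for s in range(n_stages):
        start = s * len(frames) // n_stages
        stop = (s + 1) * len(frames) // n_stages
        segment = frames[start:stop]
        for b in range(n_bins):
            feature.append(sum(f[b] for f in segment) / len(segment))
    return feature


def length_group(n_frames: int, boundaries: Sequence[int] = (10, 20, 40)) -> int:
    """Map a syllable length to one of four groups (0..3).

    The boundary values are hypothetical placeholders, not taken from the study.
    """
    return sum(n_frames >= b for b in boundaries)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_templates(
    training: Dict[str, List[Sequence[Frame]]], n_stages: int = 3
) -> Dict[Tuple[int, str], List[float]]:
    """Build one mean feature template per (length group, species) pair."""
    sums: Dict[Tuple[int, str], List[float]] = {}
    counts: Dict[Tuple[int, str], int] = {}
    for species, syllables in training.items():
        for frames in syllables:
            key = (length_group(len(frames)), species)
            feat = multi_stage_average(frames, n_stages)
            if key in sums:
                sums[key] = [a + b for a, b in zip(sums[key], feat)]
                counts[key] += 1
            else:
                sums[key] = feat
                counts[key] = 1
    return {k: [v / counts[k] for v in vec] for k, vec in sums.items()}


def recognize(
    frames: Sequence[Frame],
    templates: Dict[Tuple[int, str], List[float]],
    n_stages: int = 3,
) -> str:
    """Two-level recognition: restrict by length group, then match templates."""
    group = length_group(len(frames))
    feat = multi_stage_average(frames, n_stages)
    candidates = [(sp, t) for (g, sp), t in templates.items() if g == group]
    if not candidates:
        # No template in this length group: fall back to one-level matching.
        candidates = [(sp, t) for (_, sp), t in templates.items()]
    return min(candidates, key=lambda c: euclidean(feat, c[1]))[0]
```

In this sketch, a syllable whose energy moves from a low to a high frequency bin and one that moves the opposite way would have the same single averaged spectrum, yet their multi-stage features differ, which is one plausible reading of why a staged average can capture time-varying structure. Calling `build_templates` on labelled syllables and then `recognize` on a new syllable returns the species label of the closest template within the same length group.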
