Automatic syllables segmentation for frog identification system

Automatic recognition of frog calls by species is a valuable tool for biological research and environmental monitoring, and it offers many advantages over manual methods that depend on physical observation. This study evaluates the accuracy of frog sound identification for 12 species recorded in the Malaysian forest. The frog sound samples are automatically segmented into syllables using short-time energy and the short-time average zero-crossing rate. Mel-frequency cepstral coefficients (MFCC) are then extracted from each segmented syllable as features. Finally, a nonparametric k-nearest neighbor classifier with Euclidean distance is used to recognize the frog species. A comparison of automatic and manual segmentation shows that automatic segmentation performs better, identifying the frog species with an accuracy of 97%, compared with 82.33% for manual segmentation.
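
As an illustration of the segmentation step, the following is a minimal sketch of syllable detection based on short-time energy and zero-crossing rate. It is not the authors' implementation: the function names, the frame length, the hop size, and both thresholds (`energy_ratio`, `zcr_max`) are assumptions chosen for this example, and the abstract does not state how the two measures are combined. Here a frame is treated as part of a syllable when its energy exceeds a fraction of the peak frame energy and its zero-crossing rate stays below a ceiling.

```python
def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def segment_syllables(signal, frame_len=256, hop=128,
                      energy_ratio=0.1, zcr_max=0.5):
    """Return (start, end) sample indices of detected syllables.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    frames = frame_signal(signal, frame_len, hop)
    if not frames:
        return []

    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    threshold = energy_ratio * max(energies)

    # A frame is active when it is loud enough and not noise-like.
    active = [e > threshold and z < zcr_max
              for e, z in zip(energies, zcrs)]

    # Merge runs of consecutive active frames into sample ranges.
    segments = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            segments.append((start * hop, (i - 1) * hop + frame_len))
            start = None
    if start is not None:
        segments.append((start * hop, (len(frames) - 1) * hop + frame_len))
    return segments
```

Each returned range could then be passed to a feature extractor such as MFCC and on to the classifier. In practice the thresholds would need tuning to the recording conditions, since boundaries are only resolved to within one frame.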
