Exploiting the Symmetry of Integral Transforms for Featuring Anuran Calls

The application of machine learning techniques to sound signals requires a prior characterization of those signals. In many cases, they are described using cepstral coefficients that represent the sound spectrum. In this paper, the performance of two integral transforms, the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT), in obtaining cepstral coefficients is compared in the context of processing anuran calls. Owing to the symmetry of sound spectra, it is shown that the DCT clearly outperforms the DFT, reducing the error in representing the spectrum by more than 30%. It is also demonstrated that DCT-based cepstral coefficients are less correlated than their DFT-based counterparts, a significant advantage when these features are later fed to classification algorithms. Since the superiority of the DCT rests on the symmetry of sound spectra rather than on any intrinsic advantage of the algorithm, the conclusions of this research can be extrapolated to any sound signal.
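
The sketch below (Python with NumPy/SciPy, not the authors' code) illustrates the kind of comparison the abstract describes: it truncates a DCT-based and a DFT-based cepstral representation of a log-magnitude spectrum to roughly the same number of real parameters and measures the reconstruction error. The synthetic test signal, frame length, window, and number of retained coefficients are illustrative assumptions.

# Minimal sketch: DCT-based vs. DFT-based cepstral truncation of a log spectrum.
# All numeric choices below are assumptions for illustration only.
import numpy as np
from scipy.fft import fft, ifft, dct, idct

fs = 44100                                  # sampling rate (Hz), assumed
n = 2048                                    # analysis frame length, assumed
t = np.arange(n) / fs
# synthetic two-harmonic tone plus noise, standing in for an anuran call
x = np.sin(2 * np.pi * 1200 * t) + 0.5 * np.sin(2 * np.pi * 2400 * t)
x += 0.05 * np.random.default_rng(0).standard_normal(n)

spectrum = np.abs(fft(x * np.hamming(n)))
log_spec = np.log(spectrum[: n // 2] + 1e-12)   # one-sided log-magnitude spectrum

K = 20                                          # real parameters kept per representation

# DCT-based cepstrum: keep the first K real coefficients
c_dct = dct(log_spec, norm='ortho')
c_dct[K:] = 0.0
rec_dct = idct(c_dct, norm='ortho')

# DFT-based cepstrum: keep roughly K real parameters, i.e. bin 0 plus the
# first K//2 complex bins and their conjugate partners (so the inverse is real)
Kc = K // 2
C = fft(log_spec)
keep = np.zeros_like(C)
keep[: Kc + 1] = C[: Kc + 1]
keep[-Kc:] = C[-Kc:]
rec_dft = np.real(ifft(keep))

err_dct = np.mean((log_spec - rec_dct) ** 2)
err_dft = np.mean((log_spec - rec_dft) ** 2)
print(f"log-spectrum MSE, {K}-coefficient DCT cepstrum : {err_dct:.4f}")
print(f"log-spectrum MSE, ~{K}-parameter DFT cepstrum  : {err_dft:.4f}")

In this setup the DCT typically needs fewer coefficients because its implicit even extension matches the symmetry of the spectrum of a real signal, avoiding the wrap-around discontinuity introduced by the DFT's periodic extension; this mirrors the symmetry argument made in the paper.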
