Three-Parametric Cubic Interpolation for Estimating the Fundamental Frequency of the Speech Signal

In this paper, we propose a three-parameter convolution kernel based on the one-parameter Keys kernel. The first part of the paper describes the structure of the three-parameter kernel and gives an analytical expression for the position of the maximum of the reconstructed function. The second part presents an algorithm for estimating the fundamental frequency of the speech signal in the frequency domain using the Peak Picking method and parametric cubic convolution. Experiments on speech and sinusoidal signals are then used to select the optimal values of the parameters of the proposed kernel. The estimation results are reported as mean square error (MSE) in tables and graphs, and they serve as the basis for a comparative analysis. The analysis yields the kernel parameters and the window function that produce the lowest MSE. The results show higher efficiency in comparison to the two- or three-parameter convolution kernel.
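The pipeline outlined in the abstract (windowing, spectrum computation, peak picking, cubic-convolution interpolation around the peak) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses the standard one-parameter Keys kernel rather than the proposed three-parameter kernel, whose form is not given here; it locates the maximum by a dense grid search rather than by the analytical expression; and it applies a Hann window with `alpha = -0.5`. All function names are hypothetical.

```python
import cmath
import math


def keys_kernel(s, alpha=-0.5):
    """One-parameter Keys cubic convolution kernel (not the 3-parameter one)."""
    s = abs(s)
    if s <= 1.0:
        return (alpha + 2.0) * s**3 - (alpha + 3.0) * s**2 + 1.0
    if s < 2.0:
        return alpha * (s**3 - 5.0 * s**2 + 8.0 * s - 4.0)
    return 0.0


def hann(n_samples):
    """Hann window; the window choice is an assumption of this sketch."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
            for n in range(n_samples)]


def magnitude_spectrum(x):
    """Magnitudes of DFT bins 0..N/2 via a direct O(N^2) DFT (short frames only)."""
    n_samples = len(x)
    mags = []
    for k in range(n_samples // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n in range(n_samples))
        mags.append(abs(acc))
    return mags


def estimate_f0(x, fs, alpha=-0.5, grid=1000):
    """Peak picking plus Keys interpolation of the magnitude spectrum."""
    n_samples = len(x)
    window = hann(n_samples)
    mag = magnitude_spectrum([xi * wi for xi, wi in zip(x, window)])
    # Peak picking: strongest bin, keeping clear of the edges so that
    # the four samples k-1 .. k+2 used below always exist.
    k = max(range(2, len(mag) - 2), key=lambda i: mag[i])
    # Shift left if the larger neighbour is on the left, so that the
    # true peak is assumed to lie in the interval [k, k+1].
    if mag[k - 1] > mag[k + 1]:
        k -= 1
    # Grid search for the maximum of the interpolated function
    # (stand-in for the analytical expression mentioned in the abstract).
    best_d, best_val = 0.0, -1.0
    for j in range(grid + 1):
        d = j / grid
        val = sum(mag[k + m] * keys_kernel(d - m, alpha)
                  for m in (-1, 0, 1, 2))
        if val > best_val:
            best_d, best_val = d, val
    return (k + best_d) * fs / n_samples


# Usage: a 140 Hz sinusoid sampled at 8 kHz, 256-sample frame
# (bin spacing 31.25 Hz, so the true frequency falls between two bins).
fs = 8000.0
frame = [math.sin(2.0 * math.pi * 140.0 * n / fs) for n in range(256)]
f0_estimate = estimate_f0(frame, fs)
```

Varying `alpha` and the window function in such a sketch and recording the squared error against the known test frequency mirrors, in simplified form, the kind of parameter selection by MSE that the abstract describes.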
