RPSOVF Prediction Model for Speech Signal Series Based on UPSO

In this paper, we propose a nonlinear prediction model with an explicit structure for speech signal series. When least mean square (LMS) is used to update the kernel coefficients of the Volterra model, its low performance typically yields improper parameters, which leads to intrinsic shortcomings such as trapping in local minima, improper parameter selection, and slow convergence. To overcome these shortcomings, we propose a uniform searching particle swarm optimization (UPSO) algorithm to optimize the kernel coefficients of the Volterra model. A second-order Volterra filter (SOVF) speech prediction model based on UPSO is established using English phonemes, words, and phrases. To reduce the complexity of the model, given a user-defined error tolerance, we extract a reduced-parameter SOVF (RPSOVF) for acceleration. The experimental results show that, on both single-frame and multiframe speech signals, UPSO-SOVF and UPSO-RPSOVF outperform LMS-SOVF and PSO-SOVF in terms of root mean square error (RMSE) and mean absolute deviation (MAD). UPSO-SOVF and UPSO-RPSOVF better reflect the trends and regularity of speech signals and can fully meet the requirements of speech signal prediction. The proposed model provides a nonlinear analysis and a valuable model structure for speech signal series, and can be further employed in speech signal reconstruction or compression coding.
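
For readers unfamiliar with the terms, the following is a minimal sketch of a generic second-order Volterra one-step predictor together with the two error measures named above. It is not the authors' implementation: the function names, the argument layout, the upper-triangular form of the quadratic kernel, and the reading of MAD as the mean absolute difference between actual and predicted samples are all assumptions made for illustration. The kernel coefficients are simply passed in, so the UPSO search itself and the RPSOVF reduction step are not shown.

```python
import math


def sovf_predict(x, h0, h1, h2):
    """One-step prediction with a second-order Volterra filter (illustrative).

    x  : list of the m most recent samples (hypothetical ordering, x[0] newest)
    h0 : constant term
    h1 : list of m linear kernel coefficients
    h2 : m x m quadratic kernel; only entries with i <= j are used (assumption)
    """
    m = len(x)
    y = h0
    for i in range(m):
        y += h1[i] * x[i]                    # linear part
        for j in range(i, m):
            y += h2[i][j] * x[i] * x[j]      # quadratic part
    return y


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mad(actual, predicted):
    """Mean absolute deviation, taken here as the mean absolute prediction error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n
```

In such a setup, an optimizer like PSO or UPSO would treat the entries of `h0`, `h1`, and `h2` as one parameter vector and search for the values that minimize the prediction error over the training frames.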
