Substitution of G.728 vocoder’s codebook search module with SOM array trained by PSO-optimized supervised algorithm

Low-delay code-excited linear prediction (LD-CELP) is an attractive algorithm for implementing vocoders in voice over Internet protocol (VoIP) networks. It was proposed for coding speech at 16 kbps with toll quality. However, operation at transmission rates below 16 kbps is desirable so that traffic can be accommodated during system overload conditions. In this paper, an array of self-organizing maps (SOMs) replaces the traditional codebook search module recommended in ITU-T G.728 to determine the optimum index of the shape codebook. The SOMs are trained with a modified supervised algorithm in which some of the training parameters are optimized using the particle swarm optimization (PSO) algorithm. Based on the occurrence frequency characteristics of the codevectors, six bits for the shape codebook and two bits for the gain codebook are used in this work, yielding a vocoder with a lower bit rate than the traditional ITU-T G.728 vocoder. Compared with a conventional implementation of the LD-CELP coder, using the proposed SOM array trained by the PSO-optimized supervised algorithm as the codebook search module reduces the execution time of the algorithm by up to 44%. The accompanying degradation of voice quality, measured by mean opinion score (MOS), perceptual evaluation of speech quality (PESQ), and segmental signal-to-noise ratio (SNRseg), is acceptable.
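The bit-rate reduction implied by the six-bit shape and two-bit gain allocation can be checked with a short calculation. The sketch below is illustrative only. It assumes the standard G.728 framing, which the abstract does not state: 8 kHz sampling, five-sample excitation vectors, and a 7-bit shape plus 3-bit gain index per vector. It also assumes that no side information is transmitted, since the coder is backward-adaptive. The function name `bit_rate_bps` is hypothetical.

```python
# Assumed G.728 framing (taken from the ITU-T standard, not from the abstract).
SAMPLE_RATE_HZ = 8000   # narrowband speech sampling rate
VECTOR_LEN = 5          # samples per excitation vector


def bit_rate_bps(shape_bits, gain_bits,
                 sample_rate_hz=SAMPLE_RATE_HZ, vector_len=VECTOR_LEN):
    """Bit rate when one codebook index is sent per excitation vector."""
    bits_per_vector = shape_bits + gain_bits
    # Each vector covers vector_len samples, so the bit rate is
    # bits_per_vector * (sample_rate_hz / vector_len).
    return bits_per_vector * sample_rate_hz // vector_len


standard = bit_rate_bps(shape_bits=7, gain_bits=3)   # conventional G.728 split
proposed = bit_rate_bps(shape_bits=6, gain_bits=2)   # split used in this work

print(f"standard G.728: {standard / 1000:.1f} kbps")
print(f"proposed split: {proposed / 1000:.1f} kbps")
```

Under these assumptions, the 6 + 2 bit allocation corresponds to 12.8 kbps, compared with 16 kbps for the standard coder. That figure is derived here and is not quoted from the abstract.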
