Robust sequential learning of feedforward neural networks in the presence of heavy-tailed noise

Feedforward neural networks (FFNNs) are among the most widely used neural networks for modeling nonlinear problems in engineering. In sequential, and especially real-time, processing, all neural network models fail when faced with outliers, and outliers occur across a wide range of engineering problems. Recent research has shown that a new approach is needed to avoid overfitting or divergence of the model, especially if the FFNN is to run sequentially or in real time. To address the limitations of FFNNs when the training data contain outliers, this paper presents a new learning algorithm based on an improvement of the conventional extended Kalman filter (EKF). The extended Kalman filter robust to outliers (EKF-OR) is a probabilistic generative model in which the measurement noise covariance is not constant: the sequence of measurement noise covariances is modeled as a stochastic process over the set of symmetric positive-definite matrices, with an inverse Wishart prior. In each iteration, EKF-OR simultaneously estimates the measurement noise covariance and the current best estimate of the FFNN parameters. The Bayesian framework allows the expressions to be derived mathematically, while the analytical intractability of the Bayes update step is resolved with a structured variational approximation. All mathematical expressions in the paper are derived from first principles. An extensive experimental study shows that an FFNN trained with the developed learning algorithm achieves low prediction error and good generalization regardless of the presence of outliers in the training data.
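The idea of estimating the measurement noise and the network parameters together can be illustrated with a minimal sketch. It is not the paper's EKF-OR derivation: it assumes a scalar output (so the inverse Wishart prior reduces to a scalar variance prior), static parameters with no process noise, and a fixed-point noise update of the kind used in variational outlier-robust Kalman filters. The function names, the parameter layout, and the values of `R`, `s`, and `n_iter` are all hypothetical.

```python
import math


def ffnn(theta, x, n_hidden):
    """Tiny network y = c + sum_j v_j * tanh(w_j * x + b_j).

    theta is laid out as [w..., b..., v..., c] (an assumed layout).
    Returns the prediction and its gradient with respect to theta.
    """
    w = theta[:n_hidden]
    b = theta[n_hidden:2 * n_hidden]
    v = theta[2 * n_hidden:3 * n_hidden]
    c = theta[3 * n_hidden]
    t = [math.tanh(w[j] * x + b[j]) for j in range(n_hidden)]
    y = c + sum(v[j] * t[j] for j in range(n_hidden))
    dw = [v[j] * (1.0 - t[j] ** 2) * x for j in range(n_hidden)]
    db = [v[j] * (1.0 - t[j] ** 2) for j in range(n_hidden)]
    return y, dw + db + t + [1.0]


def robust_ekf_step(theta, P, x, y, n_hidden, R=0.01, s=3.0, n_iter=10):
    """One sequential update; n_iter=0 gives the conventional EKF step.

    R is the nominal noise variance and s the prior strength (assumptions).
    Returns the updated parameters, covariance, and effective noise variance.
    """
    n = len(theta)
    y_hat, H = ffnn(theta, x, n_hidden)
    innov = y - y_hat
    PH = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
    a = sum(H[i] * PH[i] for i in range(n))  # predicted variance H P H^T
    lam = R
    for _ in range(n_iter):
        # Residual and predicted variance after a linearized update with lam
        resid = innov * lam / (a + lam)
        post_var = a * lam / (a + lam)
        # Blend the nominal variance with the expected squared residual
        lam = (s * R + resid ** 2 + post_var) / (s + 1.0)
    K = [PH[i] / (a + lam) for i in range(n)]  # Kalman gain with inflated noise
    new_theta = [theta[i] + K[i] * innov for i in range(n)]
    new_P = [[P[i][j] - K[i] * PH[j] for j in range(n)] for i in range(n)]
    return new_theta, new_P, lam


n_hidden = 3
theta0 = [0.1 * (i + 1) for i in range(3 * n_hidden + 1)]
P0 = [[1.0 if i == j else 0.0 for j in range(len(theta0))]
      for i in range(len(theta0))]

# A target far from the current prediction plays the role of an outlier.
_, _, lam_outlier = robust_ekf_step(theta0, P0, 0.5, 50.0, n_hidden)
print("effective noise variance for the outlier:", lam_outlier)
```

In this sketch a large residual inflates the effective noise variance, which shrinks the Kalman gain, so a single outlier moves the parameters far less than it would under a conventional EKF step with a fixed `R`.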
