Highly efficient hierarchical online nonlinear regression using second order methods

Highly efficient sequential nonlinear regression algorithms are proposed. Piecewise linear models are used for the nonlinear modeling. Region boundaries are continuously updated according to the data statistics. Second-order Newton-Raphson methods are used for the adaptation of boundaries.

We introduce highly efficient online nonlinear regression algorithms that are suitable for real-life applications. We process the data in a truly online manner, so that no storage is needed: each sample is discarded after it is used. For nonlinear modeling we use a hierarchical piecewise linear approach based on the notion of decision trees, where the space of regressor vectors is adaptively partitioned according to performance. For the first time in the literature, we learn both the piecewise linear partitioning of the regressor space and the linear models in each region using highly effective second-order methods, i.e., Newton-Raphson methods. Using piecewise linear models avoids the well-known overfitting issues; moreover, because both the region boundaries and the linear models in each region are trained with second-order methods, we achieve substantial performance gains over the state of the art. We demonstrate these gains on well-known benchmark data sets and provide performance results in an individual-sequence manner, guaranteed to hold without any statistical assumptions. Hence, the introduced algorithms address the computational complexity issues widely encountered in real-life applications while providing superior guaranteed performance in a strong deterministic sense.
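To make the idea concrete, the following is a minimal sketch of the simplest possible case: a depth-one tree with two linear models blended by a single soft boundary, where the boundary and both models are adapted jointly from each sample and the sample is then discarded. The abstract does not give the exact recursions, so everything here is an assumption made for illustration: the sigmoid soft split, the Newton-style update (a running outer-product curvature estimate inverted with the Sherman-Morrison identity), the names `SoftSplitRegressor` and `run_stream`, and the values of `step`, `eps`, and the initial boundary direction.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class SoftSplitRegressor:
    """Two linear models blended by one soft, learnable boundary (illustrative)."""

    def __init__(self, dim, step=0.5, eps=1.0):
        self.dim = dim
        self.step = step
        # Parameters stacked as [w_left, w_right, n], n being the boundary normal.
        self.theta = np.zeros(3 * dim)
        # Assumed initial boundary: split along the first feature. A nonzero
        # start is needed so the two regional models can become different.
        self.theta[2 * dim] = 1.0
        # Inverse of the curvature estimate A = eps * I + sum of g g^T.
        self.A_inv = np.eye(3 * dim) / eps

    def _parts(self):
        d = self.dim
        return self.theta[:d], self.theta[d:2 * d], self.theta[2 * d:]

    def predict(self, x):
        w_l, w_r, n = self._parts()
        p = sigmoid(n @ x)  # soft membership of x in the "left" region
        return p * (w_l @ x) + (1.0 - p) * (w_r @ x)

    def update(self, x, y):
        """Predict for x, then adapt all parameters using (x, y) once."""
        w_l, w_r, n = self._parts()
        p = sigmoid(n @ x)
        y_l, y_r = w_l @ x, w_r @ x
        y_hat = p * y_l + (1.0 - p) * y_r
        err = y - y_hat
        # Gradient of 0.5 * err**2 with respect to [w_left, w_right, n].
        grad_pred = np.concatenate(
            [p * x, (1.0 - p) * x, p * (1.0 - p) * (y_l - y_r) * x]
        )
        g = -err * grad_pred
        # Sherman-Morrison: update (A + g g^T)^{-1} without a matrix inversion.
        Ag = self.A_inv @ g
        self.A_inv -= np.outer(Ag, Ag) / (1.0 + g @ Ag)
        # Newton-style step using the updated curvature estimate.
        self.theta -= self.step * (self.A_inv @ g)
        return y_hat


def run_stream(model, X, y):
    """Feed samples one at a time; return the squared error of each prediction."""
    sq_err = np.empty(len(y))
    for t in range(len(y)):
        y_hat = model.update(X[t], y[t])
        sq_err[t] = (y[t] - y_hat) ** 2
    return sq_err
```

A typical use would append a constant 1 to each regressor vector so that both the linear models and the boundary have an offset term, then call `run_stream` on the data in arrival order. Deeper trees would repeat the same soft split at every internal node; that extension is not shown here.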