AGGREGATION OF PREDICTORS FOR NON STATIONARY SUB-LINEAR PROCESSES AND ONLINE ADAPTIVE FORECASTING OF TIME VARYING AUTOREGRESSIVE PROCESSES

In this work, we study the problem of aggregating a finite number of predictors for non-stationary sub-linear processes. We provide oracle inequalities that rely essentially on three ingredients: 1) a uniform bound on the $\ell^1$ norm of the time-varying sub-linear coefficients, 2) a Lipschitz assumption on the predictors, and 3) moment conditions on the noise appearing in the linear representation. Two kinds of aggregation are considered, which give rise to different moment conditions on the noise and to more or less sharp oracle inequalities. We apply this approach to derive an adaptive predictor for locally stationary time-varying autoregressive (TVAR) processes. It is obtained by aggregating a finite number of well-chosen predictors, each of which enjoys an optimal minimax convergence rate under specific smoothness conditions on the TVAR coefficients. We show that the resulting aggregated predictor achieves a minimax rate while adapting to the unknown smoothness. To prove this result, we establish a lower bound on the minimax rate of the prediction risk for the TVAR process. Numerical experiments complete this study. An important feature of this approach is that the aggregated predictor can be computed recursively and is thus applicable in an online prediction context.
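To make the recursive, online nature of such an aggregated predictor concrete, here is a minimal sketch. It is not the construction analyzed in this work. It assumes a TVAR(1) process with a hypothetical cosine-shaped coefficient, uses normalized LMS-type recursive predictors that differ only in their step size, and combines them with standard exponentially weighted averaging. The names `simulate_tvar1` and `aggregate_online`, the learning rate `eta`, and all numerical values are illustrative assumptions.

```python
import math
import random


def simulate_tvar1(n, sigma=0.5, seed=0):
    """Simulate X_t = theta(t/n) * X_{t-1} + sigma * noise (illustrative TVAR(1))."""
    rng = random.Random(seed)
    xs, prev = [], 0.0
    for t in range(n):
        # Hypothetical smooth coefficient, kept inside (-1, 1).
        theta = 0.8 * math.cos(math.pi * t / n)
        prev = theta * prev + sigma * rng.gauss(0.0, 1.0)
        xs.append(prev)
    return xs


def aggregate_online(xs, step_sizes, eta=0.5):
    """Run one recursive predictor per step size and aggregate them online.

    Returns (cumulative squared loss of the aggregate,
             list of cumulative squared losses of the individual predictors).
    """
    k = len(step_sizes)
    thetas = [0.0] * k   # current AR(1) coefficient estimate of each predictor
    log_w = [0.0] * k    # log-weights, uniform prior
    losses = [0.0] * k
    agg_loss = 0.0
    prev = 0.0
    for x in xs:
        preds = [th * prev for th in thetas]
        # Exponential weights, shifted by the max for numerical stability.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        total = sum(w)
        agg_pred = sum(wi * p for wi, p in zip(w, preds)) / total
        agg_loss += (x - agg_pred) ** 2
        for i in range(k):
            err = x - preds[i]
            losses[i] += err ** 2
            # Recursive weight update: penalize each predictor by its squared error.
            log_w[i] -= eta * err ** 2
            # Normalized LMS-type update of the coefficient estimate.
            mu = step_sizes[i]
            thetas[i] += mu * err * prev / (1.0 + mu * prev ** 2)
        prev = x
    return agg_loss, losses


if __name__ == "__main__":
    xs = simulate_tvar1(2000)
    agg, losses = aggregate_online(xs, [0.01, 0.05, 0.2, 0.8])
    print("aggregate:", agg)
    print("individual:", losses)
```

Each new observation costs one pass over the predictors, and only the current weights and coefficient estimates are stored, which is what makes this kind of scheme usable online. Each step size can be read as a guess at the smoothness of the coefficient; the weights shift toward whichever guess has predicted best so far.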
