A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction

One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). In time series forecasting, however, the inherent serial correlation and potential non-stationarity of the data make its application less straightforward, and practitioners often omit it in favor of an out-of-sample (OOS) evaluation. In this paper, we show that the particular setup in which time series forecasting is usually performed with Machine Learning methods, namely on a matrix of lagged (embedded) observations, renders the use of standard K-fold CV possible. We present theoretical insights supporting our arguments. Furthermore, we present a simulation study in which we show empirically that K-fold CV performs favourably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.
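The setup the abstract refers to can be illustrated with a short sketch. The code below is an assumption-laden toy example, not the paper's actual experiment: it simulates an AR(1) series, embeds it into a matrix of lagged observations, and compares the mean squared error estimated by standard (shuffled) K-fold CV with a single OOS split, using an ordinary-least-squares autoregressive model. The series length, lag order, and AR coefficient are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: y_t = 0.7 * y_{t-1} + e_t  (illustrative choice)
n = 500
e = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + e[t]

# Embed the series: each row holds (y_{t-1}, ..., y_{t-p}) with target y_t.
# This lagged-matrix representation is the "ML setup" for forecasting.
p = 2
X = np.column_stack([y[p - 1 - j : n - 1 - j] for j in range(p)])
target = y[p:]

def mse_linear(X_tr, y_tr, X_te, y_te):
    """Fit ordinary least squares on the training rows, return test MSE."""
    beta, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return float(np.mean((X_te @ beta - y_te) ** 2))

# --- Standard K-fold CV: rows shuffled, ignoring temporal order ---
k = 5
idx = rng.permutation(len(target))
folds = np.array_split(idx, k)
cv_mse = float(np.mean([
    mse_linear(np.delete(X, f, axis=0), np.delete(target, f), X[f], target[f])
    for f in folds
]))

# --- OOS evaluation: final 20% of the rows held out in time order ---
cut = int(0.8 * len(target))
oos_mse = mse_linear(X[:cut], target[:cut], X[cut:], target[cut:])

print(f"K-fold CV MSE: {cv_mse:.3f}")
print(f"OOS MSE:       {oos_mse:.3f}")
```

With unit-variance innovations, both estimates should land near the irreducible error variance of 1 and close to each other, which is the kind of agreement the paper's simulation study examines on a much larger scale.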
