Nonstationary time series transformation methods: An experimental review

Abstract: Data preprocessing is a crucial step in mining and learning from data, and one of its primary activities is data transformation. Transformation is especially important in time series prediction because most time series models assume stationarity, i.e., that statistical properties do not change over time. In practice, stationarity is the exception rather than the rule in real datasets. Several transformation methods have been designed to treat nonstationarity in time series. However, choosing a transformation that suits both the adopted data model and the problem at hand is not a simple task. This paper provides a review and experimental analysis of methods for transforming nonstationary time series. It gives background on the subject and discusses the advantages and limitations of these methods for time series prediction. A subset of the reviewed transformation methods is compared in an experimental evaluation using benchmark datasets from time series prediction competitions and other real macroeconomic datasets. Suitable nonstationary time series transformation methods improved prediction accuracy by more than 30% for half of the evaluated time series and by more than 95% for 10% of the time series. Furthermore, adopting a validation phase during model training enables the selection of suitable transformation methods.
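As a rough illustration of how a validation phase can guide the choice of transformation, the sketch below compares two candidates (no transformation and first differencing) by their one-step-ahead validation error. It is not the paper's experimental procedure. The AR(1) predictor, the function names (`fit_ar1`, `one_step_forecasts`, `select_transform`), the candidate set, and the toy quadratic series are all assumptions made for this example.

```python
import math

def fit_ar1(values):
    """Least-squares fit of y[t] = a + b * y[t-1] (hypothetical toy predictor)."""
    x, y = values[:-1], values[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return a, b

def one_step_forecasts(series, n_train, transform):
    """One-step-ahead forecasts for series[n_train:], mapped back to the original scale."""
    horizon = range(n_train, len(series))
    if transform == "none":
        a, b = fit_ar1(series[:n_train])
        return [a + b * series[t - 1] for t in horizon]
    if transform == "diff":
        # diffs[i] = series[i + 1] - series[i]; only training differences are used for fitting.
        diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
        a, b = fit_ar1(diffs[:n_train - 1])
        # Predict the next difference from the previous one, then undo the differencing.
        return [series[t - 1] + a + b * diffs[t - 2] for t in horizon]
    raise ValueError(f"unknown transform: {transform}")

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(actual, predicted)) / len(actual))

def select_transform(series, n_train, candidates=("none", "diff")):
    """Return the candidate with the lowest validation RMSE, plus all scores."""
    actual = series[n_train:]
    scores = {name: rmse(actual, one_step_forecasts(series, n_train, name))
              for name in candidates}
    return min(scores, key=scores.get), scores

# Toy nonstationary series with a quadratic trend (assumed data, not from the paper).
series = [float(t * t) for t in range(30)]
best, scores = select_transform(series, n_train=20)
print(best, scores)
```

On this toy series the differences follow a linear pattern that the AR(1) predictor can track, so differencing is expected to score better on the validation segment. With real data the ranking depends on the series and the model, which is why the selection is made on held-out observations.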
