Methodologies for Time Series Prediction and Missing Value Imputation

The amount of data collected worldwide grows continuously. More sophisticated measuring instruments and increasing computer processing power produce ever more data, which demands more capacity for collection, transmission and storage. Even though computers are faster, large databases also need good and accurate methodologies to be useful in practice. Some techniques cannot feasibly be applied to very large databases, and others cannot provide the necessary accuracy.

As the title states, this thesis focuses on two problems encountered with databases: time series prediction and missing value imputation. The first is a function approximation and regression problem but can, in some cases, also be formulated as a classification task. Accurate prediction of future values depends heavily not only on a good model that is well trained and validated, but also on preprocessing, input variable selection or projection, and the choice of output approximation strategy. The importance of all these choices increases as the prediction horizon is extended further into the future.

The second focus area deals with missing values in a database. Missing values can be a mere nuisance, but they can also prohibit the use of certain methodologies and degrade the performance of others. Hence, missing value imputation is a necessary part of database preprocessing. The imputation has to be done carefully in order to retain the integrity of the database and to avoid inserting unwanted artifacts that would complicate the final data analysis. Furthermore, even though accuracy is always the main requirement for a good methodology, computational time has to be considered alongside it.

In this thesis, a large variety of strategies for output approximation and variable processing in time series prediction are presented. There is also a detailed presentation of new methodologies and tools for solving the problem of missing values. The strategies and methodologies are compared against state-of-the-art ones and shown to be accurate and useful in practice.
