Hidden dynamic learning for long-interval consecutive missing values reconstruction in EEG time series

Accurate and reliable estimation of multiple coevolving time series over random long-interval consecutive missing values has become a new frontier in signal reconstruction. If missing values are not handled, a significant number of important data sets will be improperly analyzed or discarded, because the gaps distort results and rule out several methodologies. Conventional interpolation approaches are commonly used to estimate missing values in incomplete time series. However, these methods ignore the correlations among multiple dimensions and become invalid over long runs of missing values. This research therefore proposes a new approach that automatically recovers missing values using a Linear Dynamical System (LDS). The approach captures correlations between multiple coevolving time sequences by identifying a few hidden variables and mining their dynamics to impute missing values. It recovers random runs of consecutive missing observations with low reconstruction error. Moreover, it is robust and scalable, with computation time linear in the size of the sequences. Its applicability is demonstrated on real-world electroencephalogram (EEG) signals, where incomplete data frequently occur due to corrupted transmission from the equipment electrodes.
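To make the idea of imputing through hidden dynamics concrete, the following is a minimal sketch, not the paper's implementation. It assumes the LDS parameters are already known, whereas the proposed method learns them from the incomplete data. The function name `kalman_impute`, the symbols `A`, `C`, `Q`, `R`, `x0`, `P0`, and the synthetic three-channel signal are illustrative assumptions. The sketch runs a standard Kalman filter that skips unobserved entries, followed by a Rauch-Tung-Striebel backward pass, and fills each gap with the projection of the smoothed hidden state.

```python
import numpy as np

def kalman_impute(Y, A, C, Q, R, x0, P0):
    """Fill NaNs in Y (T x m) using a fixed LDS: x[t+1] = A x[t] + w, y[t] = C x[t] + v.

    Assumption: A, C, Q, R, x0, P0 are given. Learning them is out of scope here.
    """
    T, _ = Y.shape
    k = A.shape[0]
    xp = np.zeros((T, k)); Pp = np.zeros((T, k, k))  # predicted mean / covariance
    xf = np.zeros((T, k)); Pf = np.zeros((T, k, k))  # filtered mean / covariance
    for t in range(T):
        if t == 0:
            xp[t], Pp[t] = x0, P0
        else:
            xp[t] = A @ xf[t - 1]
            Pp[t] = A @ Pf[t - 1] @ A.T + Q
        obs = ~np.isnan(Y[t])  # channels actually observed at time t
        if obs.any():
            Co, Ro = C[obs], R[np.ix_(obs, obs)]
            S = Co @ Pp[t] @ Co.T + Ro
            K = Pp[t] @ Co.T @ np.linalg.inv(S)
            xf[t] = xp[t] + K @ (Y[t, obs] - Co @ xp[t])
            Pf[t] = (np.eye(k) - K @ Co) @ Pp[t]
        else:
            # Nothing observed: carry the prediction forward unchanged.
            xf[t], Pf[t] = xp[t], Pp[t]
    # Backward (RTS) pass for the smoothed state means.
    xs = xf.copy()
    for t in range(T - 2, -1, -1):
        J = Pf[t] @ A.T @ np.linalg.inv(Pp[t + 1])
        xs[t] = xf[t] + J @ (xs[t + 1] - xp[t + 1])
    Y_hat = xs @ C.T
    # Keep observed values; replace only the missing entries.
    return np.where(np.isnan(Y), Y_hat, Y)

# Synthetic demo (hypothetical data): a 2-D hidden oscillator seen through 3 channels.
rng = np.random.default_rng(0)
theta = 0.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Q = 1e-4 * np.eye(2)
R = 1e-2 * np.eye(3)

T = 200
Y_true = np.zeros((T, 3))
x = np.array([1.0, 0.0])
for t in range(T):
    Y_true[t] = C @ x + rng.normal(0.0, 0.1, 3)
    x = A @ x + rng.normal(0.0, 0.01, 2)

Y_missing = Y_true.copy()
Y_missing[50:100, 0] = np.nan  # long consecutive gap in channel 0

Y_filled = kalman_impute(Y_missing, A, C, Q, R, np.zeros(2), np.eye(2))
gap_rmse = np.sqrt(np.mean((Y_filled[50:100, 0] - Y_true[50:100, 0]) ** 2))
print(f"RMSE inside the gap: {gap_rmse:.3f}")
```

In this sketch the gap in channel 0 is reconstructed from the other two channels through the shared hidden state, which is the kind of cross-channel correlation that per-channel interpolation cannot use. The filter and smoother each make a single pass over the sequence, so the cost grows linearly with its length for a fixed hidden dimension.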