Introduction to Time Series Analysis and Forecasting
1. Introduction to Forecasting. 1.1 The Nature and Uses of Forecasts. 1.2 Some Examples of Time Series. 1.3 The Forecasting Process. 1.4 Resources for Forecasting.
2. Statistics Background for Forecasting. 2.1 Introduction. 2.2 Graphical Displays. 2.3 Numerical Description of Time Series Data. 2.4 Use of Data Transformations and Adjustments. 2.5 General Approach to Time Series Analysis and Forecasting. 2.6 Evaluating and Monitoring Forecasting Model Performance.
3. Regression Analysis and Forecasting. 3.1 Introduction. 3.2 Least Squares Estimation in Linear Regression Models. 3.3 Statistical Inference in Linear Regression. 3.4 Prediction of New Observations. 3.5 Model Adequacy Checking. 3.6 Variable Selection Methods in Regression. 3.7 Generalized and Weighted Least Squares. 3.8 Regression Models for General Time Series Data.
4. Exponential Smoothing Methods. 4.1 Introduction. 4.2 First-Order Exponential Smoothing. 4.3 Modeling Time Series Data. 4.4 Second-Order Exponential Smoothing. 4.5 Higher-Order Exponential Smoothing. 4.6 Forecasting. 4.7 Exponential Smoothing for Seasonal Data. 4.8 Exponential Smoothers and ARIMA Models.
5. Autoregressive Integrated Moving Average (ARIMA) Models. 5.1 Introduction. 5.2 Linear Models for Stationary Time Series. 5.3 Finite Order Moving Average (MA) Processes. 5.4 Finite Order Autoregressive Processes. 5.5 Mixed Autoregressive-Moving Average (ARMA) Processes. 5.6 Non-stationary Processes. 5.7 Time Series Model Building. 5.8 Forecasting ARIMA Processes. 5.9 Seasonal Processes. 5.10 Final Comments.
6. Transfer Function and Intervention Models. 6.1 Introduction. 6.2 Transfer Function Models. 6.3 Transfer Function-Noise Models. 6.4 Cross Correlation Function. 6.5 Model Specification. 6.6 Forecasting with Transfer Function-Noise Models. 6.7 Intervention Analysis.
7. Survey of Other Forecasting Methods. 7.1 Multivariate Time Series Models and Forecasting. 7.2 State Space Models. 7.3 ARCH and GARCH Models. 7.4 Direct Forecasting of Percentiles. 7.5 Combining Forecasts to Improve Prediction Performance. 7.6 Aggregation and Disaggregation of Forecasts. 7.7 Neural Networks and Forecasting. 7.8 Some Comments on Practical Implementation and Use of Statistical Forecasting Techniques.
Bibliography. Appendix.
Appendix A Statistical Tables. Table A.1 Cumulative Normal Distribution. Table A.2 Percentage Points of the Chi-Square Distribution. Table A.3 Percentage Points of the t Distribution. Table A.4 Percentage Points of the F Distribution. Table A.5 Critical Values of the Durbin-Watson Statistic.
Appendix B Data Sets for Exercises. Table B.1 Market Yield on U.S. Treasury Securities at 10-year Constant Maturity. Table B.2 Pharmaceutical Product Sales. Table B.3 Chemical Process Viscosity. Table B.4 U.S. Production of Blue and Gorgonzola Cheeses. Table B.5 U.S. Beverage Manufacturer Product Shipments, Unadjusted. Table B.6 Global Mean Surface Air Temperature Anomaly and Global CO2 Concentration. Table B.7 Whole Foods Market Stock Price, Daily Closing Adjusted for Splits. Table B.8 Unemployment Rate - Full-Time Labor Force, Not Seasonally Adjusted. Table B.9 International Sunspot Numbers. Table B.10 United Kingdom Airline Miles Flown. Table B.11 Champagne Sales. Table B.12 Chemical Process Yield, with Operating Temperature (Uncontrolled). Table B.13 U.S. Production of Ice Cream and Frozen Yogurt. Table B.14 Atmospheric CO2 Concentrations at Mauna Loa Observatory. Table B.15 U.S. National Violent Crime Rate. Table B.16 U.S. Gross Domestic Product. Table B.17 U.S. Total Energy Consumption. Table B.18 U.S. Coal Production. Table B.19 Arizona Drowning Rate, Children 1-4 Years Old. Table B.20 U.S. Internal Revenue Tax Refunds.
Index.
MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks
The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generates substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlation and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and discriminator produced by the GAN, using a novel anomaly score called DR-score to detect anomalies by discrimination and reconstruction. We have tested our proposed MAD-GAN using two recent datasets collected from real-world cyber-physical systems (CPS): the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results showed that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber-intrusions in these complex real-world systems.
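The abstract says the DR-score detects anomalies "by discrimination and reconstruction" but does not give a formula. The following is only a minimal sketch of one way such a combined score could look, assuming a convex combination of a reconstruction residual and a discrimination loss. The weight `lam`, the `threshold`, the function names, and the assumption that both inputs are already scaled to [0, 1] are all hypothetical and not taken from the source.

```python
def dr_score(disc_prob_real, recon_error, lam=0.5):
    """Combine discrimination and reconstruction evidence into one score.

    disc_prob_real: discriminator's probability that a window is real (0..1).
    recon_error:    reconstruction residual, assumed pre-scaled to 0..1.
    lam:            hypothetical weight between the two terms (an assumption).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    # High when the discriminator thinks the window looks fake.
    discrimination_loss = 1.0 - disc_prob_real
    return lam * recon_error + (1.0 - lam) * discrimination_loss


def flag_anomalies(disc_probs, recon_errors, lam=0.5, threshold=0.5):
    """Return one boolean per window: True where the score exceeds threshold."""
    return [
        dr_score(p, r, lam) > threshold
        for p, r in zip(disc_probs, recon_errors)
    ]
```

In this sketch a window is flagged only when the weighted evidence from both sources is high, so either signal alone needs to be strong to trigger a detection when `lam` is near 0.5.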
A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data
Nowadays, multivariate time series data are increasingly collected in various real-world systems, e.g., power plants, wearable devices, etc. Anomaly detection and diagnosis in multivariate time series refer to identifying abnormal status in certain time steps and pinpointing the root causes. Building such a system, however, is challenging since it requires not only capturing the temporal dependency in each time series, but also encoding the inter-correlations between different pairs of time series. In addition, the system should be robust to noise and provide operators with different levels of anomaly scores based upon the severity of different incidents. Despite the fact that a number of unsupervised anomaly detection algorithms have been developed, few of them can jointly address these challenges. In this paper, we propose a Multi-Scale Convolutional Recurrent Encoder-Decoder (MSCRED) to perform anomaly detection and diagnosis in multivariate time series data. Specifically, MSCRED first constructs multi-scale (resolution) signature matrices to characterize multiple levels of the system statuses in different time steps. Subsequently, given the signature matrices, a convolutional encoder is employed to encode the inter-sensor (time series) correlations and an attention-based Convolutional Long-Short Term Memory (ConvLSTM) network is developed to capture the temporal patterns. Finally, based upon the feature maps which encode the inter-sensor correlations and temporal information, a convolutional decoder is used to reconstruct the input signature matrices, and the residual signature matrices are further utilized to detect and diagnose anomalies. Extensive empirical studies based on a synthetic dataset and a real power plant dataset demonstrate that MSCRED can outperform state-of-the-art baseline methods.
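To make the idea of multi-scale signature matrices more concrete, here is a minimal sketch, assuming each matrix entry is the inner product of two sensors' most recent samples divided by the window length. That definition, the function names, and the example window lengths are assumptions for illustration; the abstract does not specify them.

```python
def signature_matrix(series, t, window):
    """Pairwise inner products of the last `window` samples ending at time t.

    series: list of n equal-length lists, one per sensor.
    Returns an n x n symmetric matrix, scaled by the window length.
    """
    if t + 1 < window:
        raise ValueError("not enough history for this window")
    segments = [s[t - window + 1 : t + 1] for s in series]
    n = len(segments)
    return [
        [
            sum(a * b for a, b in zip(segments[i], segments[j])) / window
            for j in range(n)
        ]
        for i in range(n)
    ]


def multi_scale_signatures(series, t, windows=(2, 4)):
    """One signature matrix per window length (the 'scales')."""
    return [signature_matrix(series, t, w) for w in windows]
```

Short windows react quickly to brief disturbances, while long windows summarize slower changes in how sensors move together, which is one plausible reading of "multiple levels of the system statuses".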
Multivariate Time Series Imputation with Generative Adversarial Networks
Multivariate time series usually contain a large number of missing values, which hinders the application of advanced analysis methods on multivariate time series data. Conventional approaches to addressing the challenge of missing values, including mean/zero imputation, case deletion, and matrix factorization-based imputation, are all incapable of modeling the temporal dependencies and the complex distribution of multivariate time series. In this paper, we treat the problem of missing value imputation as data generation. Inspired by the success of Generative Adversarial Networks (GAN) in image generation, we propose to learn the overall distribution of a multivariate time series dataset with GAN, which is further used to generate the missing values for each sample. Unlike image data, time series data are usually incomplete due to the nature of the data recording process. A modified Gated Recurrent Unit (GRU) is employed in GAN to model the temporal irregularity of the incomplete time series. Experiments on two multivariate time series datasets show that the proposed model outperformed the baselines in terms of accuracy of imputation. Experimental results also showed that a simple model on the imputed data can achieve state-of-the-art results on the prediction tasks, demonstrating the benefits of our model in downstream applications.
Temporal pattern attention for multivariate time series forecasting
Forecasting of multivariate time series data, for instance the prediction of electricity consumption, solar power production, and polyphonic piano pieces, has numerous valuable applications. However, complex and non-linear interdependencies between time steps and series complicate this task. To obtain accurate prediction, it is crucial to model long-term dependency in time series data, which can be achieved by recurrent neural networks (RNNs) with an attention mechanism. The typical attention mechanism reviews the information at each previous time step and selects relevant information to help generate the outputs; however, it fails to capture temporal patterns across multiple time steps. In this paper, we propose using a set of filters to extract time-invariant temporal patterns, similar to transforming time series data into its “frequency domain”. Then we propose a novel attention mechanism to select relevant time series, and use its frequency domain information for multivariate forecasting. We apply the proposed model to several real-world tasks and achieve state-of-the-art performance in almost all cases. Our source code is available at https://github.com/gantheory/TPA-LSTM.
Classification of Multivariate Time Series and Structured Data Using Constructive Induction
We present a method of constructive induction aimed at learning tasks involving multivariate time series data. Using metafeatures, the scope of attribute-value learning is expanded to domains with instances that have some kind of recurring substructure, such as strokes in handwriting recognition, or local maxima in time series data. The types of substructures are defined by the user, but are extracted automatically and are used to construct attributes. Metafeatures are applied to two real domains: sign language recognition and ECG classification. Using metafeatures we are able to generate classifiers that are either comprehensible or accurate, producing results that are comparable to hand-crafted preprocessing and comparable to human experts.
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (e.g., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios.
Imputation of missing data in time series for air pollutants
Missing data are a major concern in epidemiological studies of the health effects of environmental air pollutants. This article presents an imputation-based method that is suitable for multivariate time series data, which uses the EM algorithm under the assumption of normal distribution. Different approaches are considered for filtering the temporal component. A simulation study was performed to assess the validity and performance of the proposed method in comparison with some frequently used methods. Simulations showed that when the amount of missing data was as low as 5%, the complete data analysis yielded satisfactory results regardless of the generating mechanism of the missing data, whereas the validity began to degenerate when the proportion of missing values exceeded 10%. The proposed imputation method exhibited good accuracy and precision in different settings with respect to the patterns of missing observations. Most of the imputations obtained valid results, even under missing not at random. The methods proposed in this study are implemented as a package called mtsdi for the statistical software system R.
Multivariate Industrial Time Series with Cyber-Attack Simulation: Fault Detection Using an LSTM-based Predictive Data Model
We adopted an approach based on an LSTM neural network to monitor and detect faults in industrial multivariate time series data. To validate the approach we created a Modelica model of part of a real gasoil plant. By introducing hacks into the logic of the Modelica model, we were able to generate both the roots and causes of fault behavior in the plant. Having a self-consistent data set with labeled faults, we used an LSTM architecture with a forecasting error threshold to obtain precision and recall quality metrics. The dependency of the quality metric on the threshold level is considered. An appropriate mechanism such as "one handle" was introduced for filtering faults that are outside of the plant operator's field of interest.
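The dependency of precision and recall on the forecasting error threshold can be illustrated with a small sketch. It assumes a sample is flagged as a fault when its forecasting error exceeds the threshold, and that ground-truth fault labels are available as booleans. The function names and the convention of returning 0.0 when a ratio is undefined are assumptions, not details from the source.

```python
def precision_recall(errors, labels, threshold):
    """Precision and recall when flagging samples whose error exceeds threshold.

    errors: per-sample forecasting errors (non-negative numbers).
    labels: per-sample ground truth, True where a fault is present.
    """
    predicted = [e > threshold for e in errors]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    # Assumed convention: report 0.0 when a ratio has a zero denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def sweep(errors, labels, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    return [(t, *precision_recall(errors, labels, t)) for t in thresholds]
```

Raising the threshold typically trades recall for precision, so a sweep like this is one simple way to choose an operating point.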
Dynamic Representation of Multivariate Time Series Data
In this article we describe a procedure for representing multivariate time series data by means of interactive, computer-generated dynamic imagery with computer-music accompaniment. This innovation conveys the novel insights that dynamic imagery can provide; yet, the imagery is developed from principles that make the representation useful when examined either statically or dynamically. This is because the development of the dynamic representation is guided by the same perceptual and technical principles used in making a motion picture. The particular implementation we describe is evaluated by a formal psychophysics experiment in which we measure the threshold correlation that can be perceived in our dynamic representation, and in each of three different types of graphical portrayals.
Time Series Classification With Multivariate Convolutional Neural Network
Time series classification is an important research topic in machine learning and data mining communities, since time series data exist in many application domains. Recent studies have shown that machine learning algorithms could benefit from good feature representation, explaining why deep learning has achieved breakthrough performance in many tasks. In deep learning, the convolutional neural network (CNN) is one of the most well-known approaches, since it incorporates feature learning and classification task in a unified network architecture. Although CNN has been successfully applied to image and text domains, it is still a challenge to apply CNN to time series data. This paper proposes a tensor scheme along with a novel deep learning architecture called multivariate convolutional neural network (MVCNN) for multivariate time series classification, in which the proposed architecture considers multivariate and lag-feature characteristics. We evaluate our proposed method with the prognostics and health management (PHM) 2015 challenge data, and compare with several algorithms. The experimental results indicate that the proposed method outperforms the other alternatives using the prediction score, which is the evaluation metric used by the PHM Society 2015 data challenge. Besides performance evaluation, we provide detailed analysis about the proposed method.
Multivariate Time Series Classification by Combining Trend-Based and Value-Based Approximations
Multivariate time series data often have a very high dimensionality. Classifying such high dimensional data poses a challenge because a vast number of features can be extracted. Furthermore, the meaning of the normally intuitive term "similar to" needs to be precisely defined. Representing the time series data effectively is an essential task for decision-making activities such as prediction, clustering and classification. In this paper we propose a feature-based classification approach to classify real-world multivariate time series generated by drilling rig sensors in the oil and gas industry. Our approach encompasses two main phases: representation and classification. For the representation phase, we propose a novel representation of time series which combines trend-based and value-based approximations (we abbreviate it as TVA). It produces a compact representation of the time series which consists of symbolic strings that represent the trends and the values of each variable in the series. The TVA representation improves both the accuracy and the running time of the classification process by extracting a set of informative features suitable for common classifiers. For the classification phase, we propose a memory-based classifier which takes into account the antecedent results of the classification process. The inputs of the proposed classifier are the TVA features computed from the current segment, as well as the predicted class of the previous segment. Our experimental results on real-world multivariate time series show that our approach enables highly accurate and fast classification of multivariate time series.
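As a rough illustration of combining trend-based and value-based symbolic strings, the sketch below encodes one variable twice: once by the direction of change between consecutive samples and once by which value band each sample falls into. The alphabets (`U`/`D`/`S` and `L`/`M`/`H`), the cut points, and the function names are hypothetical choices for this example and are not the authors' TVA definition.

```python
def trend_symbols(values, tol=0.0):
    """'U' for a rise, 'D' for a fall, 'S' for steady, per consecutive pair."""
    out = []
    for prev, cur in zip(values, values[1:]):
        diff = cur - prev
        out.append("U" if diff > tol else "D" if diff < -tol else "S")
    return "".join(out)


def value_symbols(values, low, high):
    """'L' below low, 'H' above high, 'M' otherwise, per sample."""
    return "".join(
        "L" if v < low else "H" if v > high else "M" for v in values
    )


def encode(values, low, high, tol=0.0):
    """Return the (trend string, value string) pair for one variable."""
    return trend_symbols(values, tol), value_symbols(values, low, high)
```

For a multivariate series, the same encoding would be applied to each variable, and features such as symbol counts could then be fed to a standard classifier.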
Modeling of multivariate time series using hidden markov models
Vector-valued (or multivariate) time series data commonly occur in various sciences. While modeling univariate time series is well-studied, modeling of multivariate time series, especially finite-valued or categorical, has been relatively unexplored. In this dissertation, we employ hidden Markov models (HMMs) to capture temporal and multivariate dependencies in the multivariate time series data. We modularize the process of building such models by separating the modeling of temporal dependence, multivariate dependence, and non-stationary behavior. We also propose new methods of modeling multivariate dependence for categorical and real-valued data while drawing parallels between these two seemingly different types of data. Since this work is in part motivated by the problem of predicting precipitation over geographic regions from multiple weather stations, we present in detail models pertinent to this hydrological application and perform a thorough analysis of the models on data collected from a number of different geographic regions.
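For readers unfamiliar with how an HMM captures temporal dependence, the standard forward algorithm computes the likelihood of an observation sequence by summing over hidden state paths. The sketch below is a textbook version for a single categorical series, not the multivariate models of the dissertation. The parameter names `init`, `trans`, and `emit` are illustrative.

```python
def forward_likelihood(init, trans, emit, observations):
    """Probability of an observation sequence under a discrete HMM.

    init[i]:     probability of starting in hidden state i.
    trans[i][j]: probability of moving from state i to state j.
    emit[i][o]:  probability of emitting symbol o while in state i.
    observations: non-empty list of symbol indices.
    """
    n = len(init)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [init[i] * emit[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs]
            for j in range(n)
        ]
    return sum(alpha)
```

For long sequences the products underflow, so practical implementations usually rescale `alpha` at each step or work with log probabilities.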
Anomaly detection for symbolic sequences and time series data
This thesis deals with the problem of anomaly detection for sequence data. Anomaly detection has been a widely researched problem in several application domains such as system health management, intrusion detection, health-care, bio-informatics, fraud detection, and mechanical fault detection. Traditional anomaly detection techniques analyze each data instance (as a univariate or multivariate record) independently, and ignore the sequential aspect of the data. Often, anomalies in sequences can be detected only by analyzing data instances together as a sequence, and hence cannot be detected by traditional anomaly detection techniques. The problem of anomaly detection for sequence data is a rich area of research for two main reasons. First, sequences can be of different types, e.g., symbolic sequences, time series data, etc., and each type of sequence poses a unique set of problems. Second, anomalies in sequences can be defined in multiple ways and hence there are different problem formulations. In this thesis we focus on solving one particular problem formulation called semi-supervised anomaly detection. We study the problem separately for symbolic sequences, univariate time series data, and multivariate time series data. The state of the art on anomaly detection for sequences is limited and fragmented across application domains. For symbolic sequences, several techniques have been proposed within specific domains, but it is not well understood how a technique developed for one domain would perform in a completely different domain. For univariate time series data, limited techniques exist, and are only evaluated for specific domains, while for multivariate time series data, anomaly detection research is relatively untouched. This thesis has two key goals. The first goal is to develop novel anomaly detection techniques for different types of sequences which perform better than existing techniques across a variety of application domains. The second goal is to identify the best anomaly detection technique for a given application domain. By realizing the first goal, we develop a suite of anomaly detection techniques for a domain scientist to choose from, while the second goal will help the scientist to choose the technique best suited for the task. To achieve the first goal, we develop several novel anomaly detection techniques for univariate symbolic sequences, univariate time series data, and multivariate time series data. We provide extensive experimental evaluation of the proposed techniques on data sets collected across diverse domains and generated from data generators, also developed as part of this thesis. We show how the proposed techniques can be used to detect anomalies which translate to critical events in domains such as aircraft safety, intrusion detection, and patient health management. The techniques proposed in this thesis are shown to outperform existing techniques on many data sets. The technique proposed for multivariate time series data is one of the very first anomaly detection techniques that can detect complex anomalies in such data. To achieve the second goal, we study the relationship between anomaly detection techniques and the nature of the data on which they are applied. A novel analysis framework, Reference Based Analysis (RBA), is proposed that can map a given data set (of any type) into a multivariate continuous space with respect to a reference data set. We apply the RBA framework not only to visualize and understand complex data types, such as multivariate categorical data and symbolic sequence data, but also to extract data-driven features from symbolic sequences, which when used with traditional anomaly detection techniques are shown to consistently outperform the state-of-the-art anomaly detection techniques for these complex data types. Two novel techniques for symbolic sequences, WIN1D and WIN2D, are proposed using the RBA framework, which perform better than the best technique for each different data set.