Irregularity analysis in time series data
Governments and corporations now collect time series data at the finest possible level of detail, such as by location, part, product, or even individual. Most data cleaning methods assume a single known type of irregularity. This paper provides a framework for the situation in which multiple irregularities are hidden in large volumes of cross-sectional time series, and develops a data mining platform that captures these key irregularities one by one in order of importance. It attempts to automate what a data analyst does when inspecting time series graphs to clean the data, a task that becomes impractical when there are too many series to look at. Clustering is applied to group time series with similar patterns, and the principal irregular component of the dominant time series group is extracted and adjusted. The platform then iteratively clusters, extracts, and adjusts the next most significant irregular components. Finally, all of these significant irregular components are summarized in graphical form to help analysts understand the data better and faster before any analysis or modeling.
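The cluster, extract, adjust loop described above can be illustrated with a minimal sketch. It is not the paper's platform: the abstract does not specify the clustering method, how the irregular component is estimated, or how importance is measured. The sketch assumes a running-median residual as the irregular part, a greedy correlation-based grouping, the pointwise median of a group's residuals as its "principal irregular component", and total squared residual as the importance score. All function names, thresholds, and the toy data are hypothetical.

```python
from statistics import mean, median

def residual(series, window=5):
    """Series minus its running median (assumed stand-in for the irregular part)."""
    half = window // 2
    out = []
    for t in range(len(series)):
        lo, hi = max(0, t - half), min(len(series), t + half + 1)
        out.append(series[t] - median(series[lo:hi]))
    return out

def correlation(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def cluster(residuals, threshold=0.8):
    """Greedy grouping (assumption): join the first group whose seed correlates."""
    groups = []
    for key, res in residuals.items():
        for group in groups:
            if correlation(res, residuals[group[0]]) >= threshold:
                group.append(key)
                break
        else:
            groups.append([key])
    return groups

def extract_irregularities(data, rounds=3, min_energy=1.0):
    """Repeatedly find the dominant group, extract its shared irregular
    component, and subtract it from the group's series."""
    data = {k: list(v) for k, v in data.items()}  # work on a copy
    n = len(next(iter(data.values())))
    found = []
    for _ in range(rounds):
        residuals = {k: residual(v) for k, v in data.items()}

        def energy(group):
            # Importance score (assumption): total squared residual in the group.
            return sum(x * x for k in group for x in residuals[k])

        dominant = max(cluster(residuals), key=energy)
        if energy(dominant) < min_energy:
            break  # nothing significant left to extract
        component = [median(residuals[k][t] for k in dominant) for t in range(n)]
        for k in dominant:
            data[k] = [x - c for x, c in zip(data[k], component)]
        found.append((sorted(dominant), component))
    return found, data

# Toy data (hypothetical): A and B share a spike at t=4, C has a dip at t=7.
series = {
    "A": [10] * 4 + [15] + [10] * 5,
    "B": [20] * 4 + [25] + [20] * 5,
    "C": [5] * 7 + [2] + [5] * 2,
    "D": [7] * 10,
}
found, cleaned = extract_irregularities(series)
for members, component in found:
    print(members, component)
```

On this toy input the shared spike in A and B is extracted first because that group carries the most residual energy, and the dip in C is extracted in the next round. A real implementation would likely need a more robust decomposition (for example, seasonal adjustment before taking residuals) and a proper clustering algorithm.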