Paper Information - Time Series Data Mining

Time Series Data Mining

Data Mining, or Knowledge Discovery in Databases (KDD), is an important area of computer science. Its relevance stems from the enormous quantity of information produced daily by different sources, for instance the web, biological processes, finance, the aeronautic industry, retail, and telecommunications. A considerable amount of this information represents temporal events, which are typically stored in the form of time series. Several kinds of phenomena can be identified in databases of this type, namely through motif (pattern) discovery, classification, clustering, query by content, abnormality detection, and forecasting of property values. We focus particularly on time series motif discovery (Lin and Keogh 2002), also known as the extraction of recurrent patterns. These patterns are relevant because they summarise the time series of a domain and help the domain expert understand the database at hand (Ferreira et al. 2006). Figure 1 shows an example of such a pattern in the context of electroencephalogram (EEG) time series. This specific motif is detected in three different time series in the database.

Nuno Constantino Castro | Nuno Castro

[1] Jessica Lin, et al. Finding Motifs in Time Series, 2002, KDD 2002.

[2] Paulo J. Azevedo, et al. Multiresolution Motif Discovery in Time Series, 2010, SDM.

[3] Paulo J. Azevedo, et al. Mining Approximate Motifs in Time Series, 2006, Discovery Science.

[4] Eamonn J. Keogh, et al. iSAX: indexing and mining terabyte sized time series, 2008, KDD.

[5] Paulo J. Azevedo, et al. Evaluating Protein Motif Significance Measures: A Case Study on Prosite Patterns, 2007, 2007 IEEE Symposium on Computational Intelligence and Data Mining.

[6] Ambuj K. Singh, et al. GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space, 2006, Sixth International Conference on Data Mining (ICDM'06).

[7] Mireille Régnier, et al. Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules, 2007, Algorithms for Molecular Biology.

[8] Divyakant Agrawal, et al. Efficient Computation of Frequent and Top-k Elements in Data Streams, 2005, ICDT.