Paper information - A Time Series Distance Measure for Efficient Clustering of Input/Output Signals by Their Underlying Dynamics

Starting from a dataset of input/output time series generated by multiple deterministic linear dynamical systems, this letter tackles the problem of automatically clustering these time series. We propose an extension of the so-called Martin cepstral distance that allows these time series to be clustered efficiently, and apply it to simulated electrical circuit data. Traditionally, the problem is handled in one of two ways. The first class of methods employs a distance measure on time series (e.g., Euclidean, dynamic time warping) and a clustering technique (e.g., $k$-means, $k$-medoids, and hierarchical clustering) to find natural groups in the dataset. It is, however, often not clear whether these distance measures effectively take into account the specific temporal correlations in these time series. The second class of methods uses the input/output data to identify a dynamical system using an identification scheme, and then applies a model norm-based distance (e.g., $H_{2}$ and $H_\infty$) to find out which systems are similar. This, however, can be very time consuming for large amounts of long time series data. We show that the new distance measure presented in this letter performs as well as explicitly modeling every input/output pair, while remaining computationally much less complex. The complexity of calculating this distance between two time series of length $N$ is $\mathcal{O}(N\log N)$.
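To make the idea of a cepstrum-based distance on input/output pairs concrete, the following is a minimal sketch, not the authors' implementation. It assumes the classical weighted cepstral form of the Martin distance, $d^2 = \sum_{k \ge 1} k\,(c_1(k) - c_2(k))^2$, and it further assumes that the cepstrum of the underlying system can be approximated by subtracting the input power cepstrum from the output power cepstrum. The function names, the raw-periodogram spectral estimate, the truncation at `n_coeffs` coefficients, and the `1e-12` floor are all illustrative choices; a Welch-type averaged estimate could be substituted for the periodogram.

```python
import numpy as np


def power_cepstrum(x, n_coeffs):
    """Estimate power-cepstrum coefficients c(1), ..., c(n_coeffs) of a signal."""
    x = np.asarray(x, dtype=float)
    # Raw periodogram via the FFT; this is the O(N log N) step.
    spectrum = np.abs(np.fft.fft(x)) ** 2 / len(x)
    # The small floor avoids log(0); its value is an arbitrary choice for this sketch.
    log_spectrum = np.log(spectrum + 1e-12)
    # The power cepstrum is the inverse Fourier transform of the log power spectrum.
    cepstrum = np.fft.ifft(log_spectrum).real
    return cepstrum[1:n_coeffs + 1]


def io_cepstral_distance(u1, y1, u2, y2, n_coeffs=50):
    """Weighted cepstral distance between two input/output pairs (u, y)."""
    # Assumption: system cepstrum ~ output cepstrum minus input cepstrum.
    c1 = power_cepstrum(y1, n_coeffs) - power_cepstrum(u1, n_coeffs)
    c2 = power_cepstrum(y2, n_coeffs) - power_cepstrum(u2, n_coeffs)
    k = np.arange(1, n_coeffs + 1)
    return float(np.sqrt(np.sum(k * (c1 - c2) ** 2)))


def simulate_first_order(u, pole):
    """Hypothetical test system: y[t] = pole * y[t-1] + u[t]."""
    y = np.zeros(len(u))
    prev = 0.0
    for t in range(len(u)):
        prev = pole * prev + u[t]
        y[t] = prev
    return y


# Usage: two pairs from the same system, one pair from a different system.
rng = np.random.default_rng(0)
u_a, u_b, u_c = (rng.standard_normal(1024) for _ in range(3))
y_a = simulate_first_order(u_a, pole=0.9)
y_b = simulate_first_order(u_b, pole=0.9)
y_c = simulate_first_order(u_c, pole=-0.5)

d_same = io_cepstral_distance(u_a, y_a, u_b, y_b)
d_diff = io_cepstral_distance(u_a, y_a, u_c, y_c)
print(f"same system: {d_same:.3f}, different systems: {d_diff:.3f}")
```

A matrix of such pairwise distances could then be passed to a clustering technique such as $k$-medoids or hierarchical clustering, without identifying a model for each pair.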

Bart De Moor | Oliver Lauwers
