A data mining algorithm for automated characterisation of fluctuations in multichannel timeseries

Abstract: We present a data mining technique for the analysis of multichannel oscillatory timeseries data and show an application using poloidal arrays of magnetic sensors installed in the H-1 heliac. The procedure is highly automated and scales well to large datasets. The timeseries data is split into short time segments to provide time resolution, and each segment is represented by a singular value decomposition (SVD). By comparing power spectra of the temporal singular vectors, related singular values are grouped into subsets which define fluctuation structures. Thresholds for the normalised energy of the fluctuation structure and the normalised entropy of the SVD can be used to filter the dataset. We assume that distinct classes of fluctuations are localised in the space of phase differences Δψ(n, n+1) between each pair of nearest-neighbour channels. An expectation maximisation clustering algorithm is used to locate the distinct classes of fluctuations and assign mode numbers where possible, and a cluster tree mapping is used to visualise the results.
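The filtering and phase-difference steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the normalised energy of singular value s_i is p_i = s_i² / Σ s_j², that the normalised entropy is −Σ p_i log p_i / log N, and that the channels form a closed, evenly spaced array so that the wrapped phase differences sum to 2π times the mode number. All function names, the threshold values, and the example numbers are hypothetical. The SVD itself, the grouping by power spectra, and the expectation maximisation clustering are not shown.

```python
import cmath
import math


def normalised_energies(singular_values):
    """Assumed definition: p_i = s_i^2 / sum_j s_j^2."""
    total = sum(s * s for s in singular_values)
    return [s * s / total for s in singular_values]


def normalised_entropy(singular_values):
    """Assumed definition: H = -sum(p_i log p_i) / log N, so 0 <= H <= 1."""
    p = normalised_energies(singular_values)
    if len(p) < 2:
        return 0.0
    h = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return h / math.log(len(p))


def structure_energy(singular_values, members):
    """Normalised energy of a fluctuation structure (a subset of singular values)."""
    p = normalised_energies(singular_values)
    return sum(p[i] for i in members)


def keep_structure(singular_values, members, min_energy=0.5, max_entropy=0.7):
    """Filter on both thresholds; the default values are illustrative only."""
    return (structure_energy(singular_values, members) >= min_energy
            and normalised_entropy(singular_values) <= max_entropy)


def wrap(angle):
    """Wrap an angle into the interval [-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))


def neighbour_phase_differences(phases):
    """Wrapped phase difference between each pair of nearest-neighbour channels.

    Assumes a closed array, so the last channel is paired with the first.
    """
    n = len(phases)
    return [wrap(phases[(k + 1) % n] - phases[k]) for k in range(n)]


# Hypothetical segment: two dominant singular values grouped into one structure.
example_singular_values = [4.0, 4.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
example_members = [0, 1]

# Hypothetical channel phases for a mode number of 2 on 8 evenly spaced coils.
example_mode = 2
example_phases = [wrap(example_mode * 2 * math.pi * k / 8) for k in range(8)]
example_dpsi = neighbour_phase_differences(example_phases)
estimated_mode = round(sum(example_dpsi) / (2 * math.pi))
```

In this sketch each segment would contribute one point, the vector of wrapped phase differences, to the space in which clustering is performed. The mode-number estimate from the summed differences only holds under the closed, evenly spaced array assumption stated above.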
