Optimized data acquisition by time series clustering in OPC

How to optimize OPC group management is the most frequently asked question when integrating OPC with a SCADA system. Group management assumes that the OPC client has the information needed to partition OPC items into homogeneous OPC groups with optimal configuration parameters, such as update rate or deadband. In reality, supervised group management requires an empirical configuration, which often leads to a high group polling rate on the server and a low item update rate on the client. In this paper we propose an unsupervised OPC group management concept and algorithm that models OPC items as time series functions in order to quantify their similarities. Items are partitioned into optimal OPC groups using hierarchical clustering, which does not require the number of clusters to be known in advance, as opposed to k-means, which often produces suboptimal results and reduces the homogeneity within each group. A comparative evaluation of the unsupervised and supervised methods suggests that our approach delivers outstanding performance.
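
The grouping idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the Euclidean distance, the average-linkage rule, the `max_distance` stopping threshold, and the item names and sample values are all assumptions chosen for illustration. The sketch treats each OPC item as a series of equally spaced samples and merges the closest clusters until no pair is closer than the threshold, so the number of groups is not fixed in advance.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length sample series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, series):
    """Mean pairwise distance between the items of two clusters."""
    total = sum(euclidean(series[i], series[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_items(series, max_distance):
    """Agglomerative clustering that stops at a distance threshold.

    series: dict mapping an item name to its list of samples.
    max_distance: hypothetical tuning parameter; merging stops once the
    two closest clusters are farther apart than this value.
    """
    clusters = [[name] for name in series]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        (i, j), best = min(
            (((i, j), average_linkage(clusters[i], clusters[j], series))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best > max_distance:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Made-up item names and samples, for illustration only.
items = {
    "Tank1.Level": [10.0, 10.1, 10.2, 10.1],
    "Tank2.Level": [10.2, 10.3, 10.1, 10.0],
    "Motor1.Speed": [1500.0, 1480.0, 1520.0, 1490.0],
    "Motor2.Speed": [1495.0, 1485.0, 1510.0, 1500.0],
}
groups = cluster_items(items, max_distance=50.0)
print(groups)
```

With these sample values the two level signals end up in one group and the two speed signals in another. In practice the series would likely need normalization and a distance measure suited to the signals, and each resulting group would still need its own update rate and deadband chosen from the behaviour of its members.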
