A Higher Order Mining Approach for the Analysis of Real-World Datasets

In this study, we propose a higher order mining approach for the analysis of real-world datasets. The approach can be used to monitor the studied phenomenon and to identify deviating operational behaviour without prior knowledge about the data. It combines several data analysis techniques, including sequential pattern mining, clustering analysis, consensus clustering and the minimum spanning tree (MST). First, the patterns extracted for a given time interval are clustered to model the behavioural modes of the studied phenomenon in that interval. The clustering models of every two consecutive time intervals are then compared to determine whether the monitored behaviour has changed. When a significant difference is observed, the two models are integrated into a consensus clustering, and an MST is applied to it to identify deviating behaviours. The validity and potential of the proposed approach are demonstrated on a real-world dataset originating from a network of district heating (DH) substations. The obtained results show that our approach is capable of detecting deviating and sub-optimal behaviours of DH substations.
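The final step, applying an MST to separate deviating behaviours, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each consensus cluster is summarised by a numeric exemplar vector, that distances are Euclidean, and that a component left very small after cutting the longest MST edges counts as deviating. The function names, the `n_cuts` and `max_size` parameters, and the sample `exemplars` are all hypothetical.

```python
from itertools import combinations
from math import dist


def minimum_spanning_tree(points):
    """Kruskal's algorithm over the complete Euclidean graph on `points`."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # edge joins two different components
            parent[root_i] = root_j
            tree.append((weight, i, j))
    return tree


def small_components_after_cut(points, n_cuts=1, max_size=1):
    """Drop the `n_cuts` longest MST edges; return components of size <= max_size."""
    tree = sorted(minimum_spanning_tree(points))
    kept = tree[: max(len(tree) - n_cuts, 0)]
    neighbours = {i: set() for i in range(len(points))}
    for _, i, j in kept:
        neighbours[i].add(j)
        neighbours[j].add(i)

    seen, flagged = set(), []
    for start in range(len(points)):
        if start in seen:
            continue
        # Depth-first traversal to collect one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        if len(component) <= max_size:
            flagged.append(sorted(component))
    return flagged


# Hypothetical exemplars: four similar behavioural modes and one distant one.
exemplars = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.0)]
deviating = small_components_after_cut(exemplars, n_cuts=1, max_size=1)
```

Cutting the longest edges of an MST is one common way to split a point set into well-separated groups. In practice, the number of cuts and the size threshold would have to be chosen from the data, and the abstract does not state how the authors make that choice.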
