Outliers detection method using clustering in buildings data

To achieve energy efficiency in buildings, a large amount of raw data is recorded during building operation. This raw data is then used to analyze the performance of buildings and their components, e.g. Heating, Ventilation and Air-Conditioning (HVAC). To save time and energy, the resilience of the data must be ensured by detecting and replacing outliers (i.e. data samples that are not plausible) before detailed analysis. This paper discusses the steps involved in detecting outliers in data obtained from an absorption chiller using its On/Off state information. It also proposes a method for automatically detecting the On/Off and/or Missing Data status of the chiller, based on two-layer K-Means clustering. Once the chiller's On/Off cycle has been detected automatically, outliers are detected by applying Z-Score normalization based on the On/Off cycle state of the chiller and then clustering the outliers with the Expectation Maximization clustering algorithm. Moreover, the results of filling missing values with regression and linear interpolation for short and long periods are elaborated. All proposed methods are applied to real building data and the results are discussed.
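
As a rough illustration of two of the steps named above, the following minimal Python sketch normalizes each sample against the statistics of its own On/Off state and fills interior gaps by linear interpolation. It is not the paper's implementation. The On/Off labels are assumed to be already available (the paper derives them with two-layer K-Means), and a fixed threshold on the absolute Z-Score stands in for the Expectation Maximization clustering step. The function names, the `threshold` parameter, and the sample readings are hypothetical.

```python
from statistics import mean, pstdev


def zscores_by_state(values, states):
    """Z-score each sample against the mean/std of its own state group.

    Missing samples (None) are skipped when computing the statistics
    and are returned as None.
    """
    groups = {}
    for v, s in zip(values, states):
        if v is not None:
            groups.setdefault(s, []).append(v)
    stats = {s: (mean(g), pstdev(g)) for s, g in groups.items()}

    scores = []
    for v, s in zip(values, states):
        if v is None:
            scores.append(None)
            continue
        mu, sigma = stats[s]
        # A constant group has zero spread; treat its samples as z = 0.
        scores.append(0.0 if sigma == 0 else (v - mu) / sigma)
    return scores


def flag_outliers(values, states, threshold=3.0):
    """Flag samples whose per-state |z| exceeds a fixed threshold.

    Assumption: the fixed threshold replaces the EM clustering of
    Z-Scores described in the abstract.
    """
    scores = zscores_by_state(values, states)
    return [z is not None and abs(z) > threshold for z in scores]


def interpolate_gaps(values):
    """Linearly fill runs of None that lie between two known samples.

    Leading and trailing gaps are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical readings: eight samples while "on", four while "off".
readings = [7.0, 7.1, 6.9, 7.0, 7.2, 6.8, 7.0, 15.0, 20.0, 20.1, 19.9, 20.0]
labels = ["on"] * 8 + ["off"] * 4
flags = flag_outliers(readings, labels, threshold=2.0)
```

Because the statistics are computed per state, the reading of 15.0 is judged against the other "on" samples rather than against the pooled data, where it would sit unremarkably between the two groups. With few samples per group the attainable Z-Score is bounded, so the threshold has to be chosen with the group size in mind.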
