Comparison of integrated clustering methods for accurate and stable prediction of building energy consumption data

Clustering methods are used in modeling energy consumption for two reasons. First, clustering is often used to process data and to improve the predictive accuracy of subsequent energy models. Second, stable clusters, meaning clusters that are reproducible under non-essential changes, can be used to group, target, and interpret observed subjects. However, clustering results are known to be highly sensitive to the choice of algorithm and variables. This sensitivity can lead to misleading assessments of predictive accuracy and to misinterpretation of clusters in policymaking.
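
One way to make "reproducible under non-essential changes" concrete is to compare the cluster memberships produced by two runs, for example on the full data and on a resampled or slightly perturbed version. The sketch below scores each cluster of a reference run by its best Jaccard overlap with any cluster of an alternative run. It is only an illustration and not necessarily the procedure used in the paper; the function names and the six-building label vectors are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of observation indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def members(labels):
    """Map each cluster label to the set of observation indices it contains."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(idx)
    return groups


def clusterwise_stability(labels_ref, labels_alt):
    """For every cluster in the reference run, return its best Jaccard
    match among the clusters of the alternative run (1.0 = fully recovered).
    Both label sequences must refer to the same observations, in the same order."""
    ref, alt = members(labels_ref), members(labels_alt)
    return {
        lab: max(jaccard(m, other) for other in alt.values())
        for lab, m in ref.items()
    }


# Hypothetical cluster labels for six buildings under two clustering runs.
run_a = [0, 0, 0, 1, 1, 1]
run_b = [1, 1, 0, 0, 0, 0]
scores = clusterwise_stability(run_a, run_b)
print(scores)
```

Scores near 1 indicate a cluster that reappears almost unchanged; low scores flag clusters whose interpretation may not survive a change of algorithm, variables, or sample. Label values need not match between runs, since only memberships are compared.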
