Identification of representative buildings and building groups in urban datasets using a novel pre-processing, classification, clustering and predictive modelling approach

Abstract: The formulation of energy policies for urban building stock frequently requires evaluating the energy use of large numbers of buildings. When urban energy modelling is used as part of this process, identifying building groups and their representative buildings can play a critical role. This paper outlines a novel methodology for identifying building groups and associated representative buildings in urban datasets. The methodology combines building classification, building clustering and predictive modelling. First, multiple urban-scale datasets are collected; classification techniques and clustering algorithms are then applied to identify building clusters. Next, the representative building (medoid) of each cluster is identified. Predictive modelling is then used to expand cluster membership to buildings that were excluded from the preceding analysis. Several clustering algorithms are assessed, including K-means, hierarchical clustering (agglomerative and divisive) and partitioning around medoids. The methodology is applied to a large dataset of 13,614 mixed-use buildings in the city of Geneva, Switzerland. The results, assessed by nine validation indices, indicate the capacity of the decision support framework to identify clusters and associated representative buildings. Furthermore, subsequent predictive modelling with a random forest approach allows a larger portion of the building stock to be incorporated within the established clusters, with an overall average classification accuracy of 89%. In total, 67 representative buildings were identified in the dataset.
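The medoid step described in the abstract can be illustrated with a minimal sketch. It assumes cluster labels are already available and uses Euclidean distance on two hypothetical features (floor area and construction year); the toy data, the function names and the feature choice are illustrative assumptions, not taken from the paper. The nearest-medoid assignment at the end is a deliberately simple stand-in for the random forest model the paper uses to extend cluster membership, not a reproduction of it.

```python
import math
from collections import defaultdict


def find_medoids(features, labels):
    """Return {cluster label: index of the medoid building}.

    The medoid of a cluster is the member whose summed distance
    to all other members of the same cluster is smallest.
    """
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    medoids = {}
    for label, idxs in members.items():
        medoids[label] = min(
            idxs,
            key=lambda i: sum(math.dist(features[i], features[j]) for j in idxs),
        )
    return medoids


def assign_to_nearest_medoid(point, features, medoids):
    """Assign a previously excluded building to the cluster of its nearest medoid.

    Simplified stand-in for the paper's random forest step (assumption).
    """
    return min(medoids, key=lambda label: math.dist(point, features[medoids[label]]))


# Hypothetical toy data: (floor area in m2, construction year) per building.
features = [
    (100.0, 1950.0), (110.0, 1955.0), (130.0, 1960.0),
    (900.0, 2000.0), (950.0, 2005.0), (1000.0, 2010.0),
]
labels = [0, 0, 0, 1, 1, 1]  # cluster labels, assumed to come from a prior clustering run

medoids = find_medoids(features, labels)
new_building_cluster = assign_to_nearest_medoid((120.0, 1958.0), features, medoids)
```

In practice the features would normally be standardised before computing distances, since raw floor area and year are on very different scales.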
