A data-driven predictive model of city-scale energy use in buildings

Many cities across the United States have turned to building energy disclosure (or benchmarking) laws to encourage transparency in energy efficiency markets and to support sustainability and carbon reduction plans. In addition to enabling direct peer-to-peer comparisons, the benchmarking data published under these laws have been used by researchers and policy-makers to study the distribution and determinants of energy use in large buildings. However, these policies cover only a small subset of the building stock in a given city, and thus capture only a fraction of energy use at the urban scale. To overcome this limitation, we develop a predictive model of energy use at the building, district, and city scales using training data from energy disclosure policies and predictors from widely available property and zoning information. We use statistical models to predict the energy use of 1.1 million buildings in New York City using the physical, spatial, and energy use attributes of a subset derived from the 23,000 buildings required to report energy use data each year. Linear regression (OLS), random forest, and support vector regression (SVM) algorithms are fit to the city's energy benchmarking data and then used to predict electricity and natural gas use for every property in the city. Model accuracy is assessed and validated at the building level and zip code level using actual consumption data from calendar year 2014. We find that the OLS model performs best when generalizing to the city as a whole, and that SVM results in the lowest mean absolute error for predicting energy use within the Local Law 84 (LL84) benchmarking sample. Our median predicted electric energy use intensity is 71.2 kBtu/sf for office buildings and 31.2 kBtu/sf for residential buildings, with a mean absolute log accuracy ratio of 0.17. Building age is found to be a significant predictor of energy use, with newer buildings (particularly those built since 1991) found to have higher consumption levels than those constructed before 1930. We also find higher electric consumption in office and retail buildings, although the sign is reversed for natural gas. In general, larger buildings use less energy per square foot, while taller buildings with more stories, controlling for floor area, use more energy per square foot. Attached buildings (those with adjacent buildings and a shared party wall) are found to have lower natural gas use intensity. The results demonstrate that electricity consumption can be reliably predicted using actual data from a relatively small subset of buildings, while natural gas use presents a more complicated problem given the bimodal distribution of consumption and infrastructure availability.
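The accuracy metric quoted above, the mean absolute log accuracy ratio, can be illustrated with a minimal sketch. It assumes the log accuracy ratio is the natural logarithm of predicted over actual consumption; the abstract does not state the logarithm base, and the function name and sample values below are hypothetical, not taken from the paper.

```python
import math


def mean_abs_log_accuracy_ratio(predicted, actual):
    """Return the mean of |ln(predicted / actual)| over paired positive values.

    Assumption: natural logarithm; the source abstract does not give the base.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, a in zip(predicted, actual):
        if p <= 0 or a <= 0:
            raise ValueError("the log ratio is only defined for positive values")
        total += abs(math.log(p / a))
    return total / len(predicted)


# Hypothetical energy use intensities in kBtu/sf (illustrative only).
predicted_eui = [70.0, 30.0, 50.0]
actual_eui = [60.0, 35.0, 50.0]
score = mean_abs_log_accuracy_ratio(predicted_eui, actual_eui)
print(f"mean absolute log accuracy ratio: {score:.3f}")
```

Because the metric works on the ratio rather than the difference, over-predicting by a factor of two and under-predicting by a factor of two are penalized equally, which suits energy use values that span several orders of magnitude across a city's building stock.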
