A systematic feature selection procedure for short-term data-driven building energy forecasting model development

Abstract An accurate building energy forecasting model is key to real-time model-based control of building energy systems and to building-grid integration. Data-driven models, although they incur lower engineering cost during development, often generalize poorly because of high data dimensionality. Feature selection, the process of selecting a subset of relevant features, can reduce dimensionality, increase model interpretability, and improve model generalization. In building energy modeling research, features are often selected based on domain knowledge, and a comprehensive methodology to guide systematic feature selection when developing building energy forecasting models is lacking. This research proposes a systematic feature selection procedure for developing building energy forecasting models that integrates statistical analysis, building physics, and engineering experience. The proposed procedure has three steps: (Step 1) feature pre-processing based on domain knowledge, (Step 2) feature removal through filter methods to remove irrelevant and redundant variables, and (Step 3) feature grouping through a wrapper method to search for the best feature set. Two case studies are presented, one using simulated building data and one using real building data. The simulated data are generated from a simulation model of a medium-size office building (a DOE reference building). The real data are obtained from a medium-size campus building in Philadelphia, PA. In both cases, the energy forecasting models developed using the proposed systematic feature selection procedure are compared with models developed using other feature selection techniques. Results show that the models developed using the proposed procedure have better accuracy and generalization.
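
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the specific filter statistics, wrapper search, or forecasting model, so the sketch assumes a Pearson-correlation filter, a greedy forward wrapper, and an ordinary least-squares model scored by cross-validated RMSE. The column names, thresholds (`min_relevance`, `max_redundancy`), and synthetic data are all hypothetical.

```python
import numpy as np

def filter_features(X, y, min_relevance=0.1, max_redundancy=0.95):
    """Step 2 (filter, assumed Pearson-based): drop features weakly correlated
    with the target, then drop near-duplicates of features already kept."""
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Visit candidates from most to least relevant, skipping irrelevant ones.
    order = [j for j in np.argsort(-relevance) if relevance[j] >= min_relevance]
    kept = []
    for j in order:
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) < max_redundancy
               for k in kept):
            kept.append(int(j))
    return kept

def cv_rmse(X, y, n_folds=5):
    """k-fold cross-validated RMSE of a least-squares model (a stand-in for
    whatever forecasting model is actually being developed)."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    sq_errors = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        A_train = np.column_stack([np.ones(train.sum()), X[train]])
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        sq_errors.append((A_test @ coef - y[test_idx]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errors))))

def forward_wrapper(X, y, candidates, n_folds=5):
    """Step 3 (wrapper, assumed greedy forward search): add the feature that
    lowers cross-validated RMSE the most until no addition helps."""
    selected, best = [], np.inf
    remaining = list(candidates)
    while remaining:
        scores = {j: cv_rmse(X[:, selected + [j]], y, n_folds) for j in remaining}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break
        best = scores[j_best]
        selected.append(j_best)
        remaining.remove(j_best)
    return selected, best

# Step 1 (domain knowledge) is represented here only by the hand-chosen
# candidate columns; all names and data below are hypothetical.
names = ["outdoor_temp_C", "solar", "occupancy", "unrelated", "outdoor_temp_F"]
rng = np.random.default_rng(0)
n = 500
temp = rng.normal(20.0, 5.0, n)
solar = rng.uniform(0.0, 1.0, n)
occupancy = rng.integers(0, 2, n).astype(float)
unrelated = rng.normal(0.0, 1.0, n)
X = np.column_stack([temp, solar, occupancy, unrelated, temp * 1.8 + 32.0])
y = 2.0 * temp + 20.0 * solar + 15.0 * occupancy + rng.normal(0.0, 1.0, n)

kept = filter_features(X, y)
selected, best_rmse = forward_wrapper(X, y, kept)
print("after filter: ", [names[j] for j in kept])
print("after wrapper:", [names[j] for j in selected], "CV-RMSE:", round(best_rmse, 3))
```

In this toy data the two temperature columns carry the same information in different units, so the redundancy filter keeps only one of them before the wrapper search runs. For real time-series energy data, the fold construction and the model inside `cv_rmse` would need to match the forecasting task.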
