Supervised classification with interdependent variables to support targeted energy efficiency measures in the residential sector

Abstract

This paper presents a supervised classification model in which indicators of correlation between dependent and independent variables within each class are used to transform large-scale input data to a lower dimension without loss of recognition-relevant information. In the case study, we use consumption data recorded by the smart electricity meters of 4200 Irish dwellings, together with half-hourly outdoor temperature, to derive 12 household properties (such as type of heating, floor area, age of house, and number of inhabitants). Survey data containing the characteristics of 3500 households enables algorithm training. The results show that the presented model outperforms ordinary classifiers with regard to accuracy and temporal characteristics. The model can incorporate any kind of data affecting the energy consumption time series or, more generally, any data affecting the class-dependent variable, while minimizing the risk of the curse of dimensionality. The information gained on household characteristics enables utility companies and public bodies to target energy-efficiency measures.
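The idea of replacing a long time series with a few indicators of the dependence between consumption and temperature can be illustrated with a minimal sketch. It is not the model from the paper: the function name `interdependence_features`, the choice of a linear fit (slope, intercept, Pearson correlation) as the indicators, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def interdependence_features(consumption, temperature):
    """Reduce one household's paired series to three indicators.

    Hypothetical helper: returns the slope and intercept of a linear fit of
    consumption on temperature, plus their Pearson correlation.
    """
    consumption = np.asarray(consumption, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    # polyfit returns coefficients from highest degree down: slope, intercept
    slope, intercept = np.polyfit(temperature, consumption, deg=1)
    corr = np.corrcoef(temperature, consumption)[0, 1]
    return np.array([slope, intercept, corr])


# Synthetic data (assumed, not from the study): one week of half-hourly readings
rng = np.random.default_rng(0)
temperature = rng.uniform(-5.0, 25.0, size=48 * 7)

# A household whose load rises as it gets colder, and one that is insensitive
electric_heating = 3.0 - 0.08 * temperature + rng.normal(0.0, 0.1, temperature.size)
other_heating = 1.0 + rng.normal(0.0, 0.1, temperature.size)

features_electric = interdependence_features(electric_heating, temperature)
features_other = interdependence_features(other_heating, temperature)
```

Each household's 336 readings collapse to three numbers, which could then be passed to any standard classifier trained on the survey labels. In this toy setup the temperature-sensitive household is distinguished mainly by its strongly negative slope and correlation.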
