Predictive Customer Data Analytics - The Value of Public Statistical Data and the Geographic Model Transferability

Companies pay high prices for detailed customer information (e.g., income, household type) to gain insights and run targeted marketing campaigns. We argue that companies can use predictive analytics artifacts to derive such information from existing customer data in combination with freely available data sources, such as open government data. In this study, we apply a machine learning artifact, trained on data from 7,504 energy customers, to a specific yet highly relevant case from the utility industry and investigate two important aspects of predictive business analytics. First, we identify the sparsely available open government statistics and find that even this limited amount of open data can increase the artifact’s performance. Second, we apply the predictive models, trained on a regional customer dataset, to households in other geographic regions and observe an acceptable loss in performance. The results support the development of systems that aid managerial decision-making and predictive marketing, and they showcase the value of open data.
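The two questions raised in the abstract (does adding open data help, and how much performance is lost when a model is moved to another region) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the feature names and the `shift` parameter are hypothetical, and a nearest-centroid classifier stands in for the study's machine learning artifact, whose details are not given here. The sketch only shows the shape of the evaluation, not the study's method or results.

```python
import random


def make_region(n, shift, rng):
    """Generate synthetic households for one region (illustrative data only).

    Each row is ((consumption, regional_stat), label). `shift` moves the
    consumption distribution to mimic a region that differs from the home one.
    """
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)  # hypothetical binary household class
        consumption = rng.gauss(2.0 + 1.5 * label + shift, 1.0)
        regional_stat = rng.gauss(0.5 + 0.8 * label, 1.0)  # stand-in for an open-data feature
        rows.append(((consumption, regional_stat), label))
    return rows


def fit_centroids(rows, cols):
    """Compute the per-class mean of the selected feature columns."""
    sums, counts = {}, {}
    for features, label in rows:
        vec = [features[c] for c in cols]
        if label not in sums:
            sums[label] = [0.0] * len(cols)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features, cols):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    vec = [features[c] for c in cols]
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )


def accuracy(centroids, rows, cols):
    """Share of rows whose predicted class matches the true label."""
    hits = sum(predict(centroids, features, cols) == label for features, label in rows)
    return hits / len(rows)


def transfer_report(seed=0):
    """Train in the home region, then score a home holdout and another region."""
    rng = random.Random(seed)
    home = make_region(2000, 0.0, rng)
    train, holdout = home[:1500], home[1500:]
    other = make_region(500, 0.5, rng)  # region with shifted consumption
    report = {}
    for name, cols in (("customer_only", (0,)), ("customer_plus_open", (0, 1))):
        model = fit_centroids(train, cols)
        report[name] = {
            "home": accuracy(model, holdout, cols),
            "other": accuracy(model, other, cols),
        }
    return report


if __name__ == "__main__":
    for name, scores in transfer_report().items():
        print(name, scores)
```

Comparing the `customer_only` and `customer_plus_open` rows indicates what the extra feature contributes, and comparing `home` with `other` within a row indicates the transfer loss. With real data, the held-out region should come from actual geographic boundaries rather than a simulated shift.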
