Mining Massive Fine-Grained Behavior Data to Improve Predictive Analytics

Organizations increasingly have access to massive, fine-grained data on consumer behavior. Despite the hype over "big data" and the success of predictive analytics, only a few organizations have incorporated such fine-grained data in a non-aggregated manner into their predictive analytics. This paper examines the use of massive, fine-grained data on consumer behavior (specifically, payments to a very large set of particular merchants) to improve predictive models for targeted marketing. The paper details how using this different sort of data can substantially improve predictive performance, even in an application for which predictive analytics has been applied for years. One of the most striking results has important implications for managers considering the value of big data. Using a real-life data set of 21 million transactions by 1.2 million customers, as well as 289 other variables describing these customers, the results show that there is no appreciable improvement from moving to big data when using traditional structured data. In contrast, when using fine-grained behavior data, there continues to be substantial value to increasing the data size across the entire range of the analyses. This suggests that larger firms may have substantially more valuable data assets than smaller firms when using their transaction data for targeted marketing.
