Machine learning and AI in marketing – Connecting computing power to human insights

Abstract: Artificial intelligence (AI) agents driven by machine learning algorithms are rapidly transforming the business world and generating heightened interest from researchers. In this paper, we review machine learning methods and call for marketing research to leverage them. We provide an overview of common machine learning tasks and methods and compare them with the statistical and econometric methods that marketing researchers traditionally use. We argue that machine learning methods can process large-scale and unstructured data and have flexible model structures that yield strong predictive performance; at the same time, such methods may lack model transparency and interpretability. We discuss salient AI-driven industry trends and practices and review the still nascent academic marketing literature that uses machine learning methods. More importantly, we present a unified conceptual framework and a multi-faceted research agenda. Along five key aspects of empirical marketing research (method, data, usage, issue, and theory), we propose a number of research priorities: extending machine learning methods and using them as core components in marketing research; using the methods to extract insights from large-scale unstructured, tracking, and network data; using them in transparent fashions for descriptive, causal, and prescriptive analyses; using them to map out customer purchase journeys and develop decision-support capabilities; and connecting the methods to human insights and marketing theories. Opportunities abound for machine learning methods in marketing, and we hope our multi-faceted research agenda will inspire more work in this exciting area.
