Two New Prediction-Driven Approaches to Discrete Choice Prediction

The ability to predict consumer choices is essential to understanding the demand structure of products and services. Typical discrete choice models, which aim to explain the behavioral process leading to choice outcomes, are built on two main assumptions: that a utility function exists that represents preferences over a choice set, and that this utility function has a relatively simple and interpretable functional form with respect to the attributes of alternatives and decision makers. These assumptions yield models that are easy to interpret and that provide insights into the effects of individual variables, such as price and promotion, on consumer choices. However, such restrictive assumptions might impede the ability of theory-driven models to deliver accurate predictions and forecasts. In this article, we develop novel approaches targeted at providing more accurate choice predictions. Specifically, we propose two prediction-driven approaches: pairwise preference learning using classification techniques and ranking function learning using evolutionary computation. We compare our proposed approaches with a multiclass classification approach, as well as a standard discrete choice model. Our empirical results show that the proposed approaches achieve significantly higher choice prediction accuracy than these two benchmarks.
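
To make the idea of pairwise preference learning concrete, the following is a minimal sketch and not the authors' implementation. It assumes a logistic-regression classifier trained on attribute differences between the chosen alternative and each rejected one, and it predicts a choice by counting pairwise wins. The two attributes (price, quality), the toy choice sets, and all function names are hypothetical and serve only as illustration.

```python
import math
from itertools import combinations

def make_pairs(choice_sets):
    """Turn (alternatives, chosen_index) records into pairwise examples.

    Each example is an attribute-difference vector with label 1 when the
    first alternative of the pair was preferred, and 0 otherwise.
    """
    pairs = []
    for alternatives, chosen in choice_sets:
        winner = alternatives[chosen]
        for k, loser in enumerate(alternatives):
            if k == chosen:
                continue
            diff = [a - b for a, b in zip(winner, loser)]
            pairs.append((diff, 1))                 # winner vs. loser
            pairs.append(([-d for d in diff], 0))   # mirrored pair
    return pairs

def train_logistic(pairs, epochs=200, lr=0.1):
    """Fit logistic regression (no intercept) by stochastic gradient ascent."""
    w = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        for x, y in pairs:
            z = sum(wi * xi for wi, xi in zip(w, x))
            z = max(-30.0, min(30.0, z))            # guard against overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def predict_choice(w, alternatives):
    """Pick the alternative that wins the most pairwise comparisons."""
    wins = [0] * len(alternatives)
    for i, j in combinations(range(len(alternatives)), 2):
        diff = [a - b for a, b in zip(alternatives[i], alternatives[j])]
        score = sum(wi * di for wi, di in zip(w, diff))
        wins[i if score > 0 else j] += 1
    return max(range(len(alternatives)), key=wins.__getitem__)

# Hypothetical toy data: each alternative is [price, quality]; the chosen
# alternative in each set is the one with the largest quality minus price.
training_sets = [
    ([[3.0, 1.0], [1.0, 2.0], [2.0, 1.0]], 1),
    ([[1.0, 1.0], [2.0, 3.0], [3.0, 2.0]], 1),
    ([[2.0, 3.0], [3.0, 3.0], [1.0, 1.0]], 0),
    ([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]], 0),
    ([[3.0, 2.0], [2.0, 1.0], [1.0, 2.0]], 2),
]

w = train_logistic(make_pairs(training_sets))
new_choice_set = [[2.0, 2.0], [1.0, 3.0], [3.0, 3.0]]
predicted = predict_choice(w, new_choice_set)
print("learned weights:", w)
print("predicted choice index:", predicted)
```

With a linear classifier, counting pairwise wins selects the same alternative as ranking by a single linear score; the pairwise framing becomes more interesting with nonlinear classifiers, whose pairwise predictions need not correspond to any one utility function.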
