Investigating consumers’ store-choice behavior via hierarchical variable selection

This paper presents a store-choice model for investigating consumers’ store-choice behavior based on scanner panel data. The model enables us to evaluate the effects of consumer and product attributes not only on a consumer’s store choice but also on the purchase quantity. Moreover, we adopt a mixed-integer optimization (MIO) approach to selecting the best set of explanatory variables with which to construct the store-choice model. We devise two MIO models for hierarchical variable selection, in which the hierarchical structure of product categories is used to enhance the reliability and computational efficiency of the variable selection. We assess the effectiveness of our MIO models through computational experiments on actual scanner panel data. These experiments focus on consumers’ choice among three types of stores in Japan: convenience stores, drugstores, and (grocery) supermarkets. The computational results demonstrate that our method has several advantages over the common methods for variable selection, namely, the stepwise method and $L_1$-regularized regression. Furthermore, our analysis reveals that convenience stores are most strongly chosen for gift cards and garbage disposal permits, drugstores are most strongly chosen for products that are specific to drugstores, and supermarkets are most strongly chosen for health food products by women with families.
