Modeling large data sets in marketing

In the last two decades, marketing databases have grown significantly in both size and richness of available information. The analysis of these databases raises several information-related and statistical issues. We provide an overview of selected issues in the analysis of large data sets, focusing on two important areas: single-source databases and customer transaction databases. We discuss models that have been used to describe customer behavior in these areas. Among the issues discussed are the development of parsimonious models, estimation methods, aggregation of data, data fusion, and the optimization of customer-level profit functions. We conclude that problems related to the analysis of large databases are far from resolved and will stimulate new research in the near future.
