Isotonic single-index model for high-dimensional database marketing

While database marketers collect vast amounts of customer transaction data, using it to improve marketing decisions presents problems. Marketers seek to extract relevant information from large databases by identifying significant variables and prospective customers. In small databases, they could calibrate logistic regression models via maximum-likelihood methods to determine significant variables and assess each customer's response probability. For large databases, however, this approach becomes too computationally intensive to implement in real time, and so marketers prefer estimation methods that are scalable to high-dimensional databases. In addition, database marketing is practiced in diverse product-markets, and so marketers prefer probability models that are flexible rather than restricted to specific distributions (e.g., logistic). To incorporate scalability and flexibility, we propose isotonic single-index models for database marketing. The single-index model furnishes the first projective approximation to a general p-variate function. Its link function is order-preserving (i.e., isotonic), thus encompassing all proper distribution functions. We develop a direct approach for its estimation: we first estimate the orientation of the high-dimensional parameter vector without specifying the link function (via sliced inverse regression), and then estimate the non-decreasing link function (via isotonic regression). We illustrate its practical use by analyzing a high-dimensional customer transaction database. This approach yields dimension reduction both column- and row-wise; that is, we not only discover significant variables in a large transaction database, but also prioritize customers into a few distinct groups based on estimated response probability (to enable direct mailing of catalogs). © 2003 Elsevier B.V. All rights reserved.
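
The two-step estimation described in the abstract can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' implementation: the function names `sir_direction` and `isotonic_fit`, the simulated covariates, the logistic link used only to generate responses, and the sign convention that orients the index are all assumptions made for the example. Step 1 slices on the binary response (responders versus non-responders) and takes the leading sliced-inverse-regression direction; step 2 fits a non-decreasing link along that index with the pool-adjacent-violators algorithm.

```python
import numpy as np

def sir_direction(X, y):
    """Leading sliced inverse regression (SIR) direction, slicing on distinct y values."""
    n, p = X.shape
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Standardize covariates so that their sample covariance is the identity.
    L = np.linalg.cholesky(cov)
    Z = np.linalg.solve(L, (X - mean).T).T
    # Weighted covariance of the slice means (one slice per response value).
    M = np.zeros((p, p))
    for value in np.unique(y):
        rows = Z[y == value]
        zbar = rows.mean(axis=0)
        M += (len(rows) / n) * np.outer(zbar, zbar)
    # Leading eigenvector (eigh sorts eigenvalues in ascending order).
    _, evecs = np.linalg.eigh(M)
    b = np.linalg.solve(L.T, evecs[:, -1])  # map back to the original scale
    b /= np.linalg.norm(b)
    # Assumption: SIR fixes the direction only up to sign, so orient it
    # such that the index is positively associated with the response.
    if np.corrcoef(X @ b, y)[0, 1] < 0:
        b = -b
    return b

def isotonic_fit(y_sorted):
    """Pool-adjacent-violators: least-squares non-decreasing fit to a sequence."""
    blocks = []  # each block stores [sum of responses, count]
    for v in y_sorted:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

# Simulated data (hypothetical): 5 covariates, only the first 3 matter.
rng = np.random.default_rng(0)
n, p = 2000, 5
beta_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
X = rng.normal(size=(n, p))
prob_true = 1.0 / (1.0 + np.exp(-(X @ beta_true)))
y = (rng.random(n) < prob_true).astype(float)

# Step 1: direction of the parameter vector, without specifying the link.
b = sir_direction(X, y)

# Step 2: non-decreasing link estimated along the fitted index.
index = X @ b
order = np.argsort(index)
p_hat = np.empty(n)
p_hat[order] = isotonic_fit(y[order])

# The isotonic fit is a step function, so customers fall into groups
# that share one estimated response probability.
groups = np.unique(p_hat)
print("estimated direction:", np.round(b, 2))
print("number of customer groups:", len(groups))
```

Because the fitted link is piecewise constant, the distinct values of `p_hat` give a natural row-wise grouping of customers, while the components of `b` that are close to zero point to variables that contribute little to the index.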
