Wrapped Feature Selection for Binary Classification Bayesian Regularisation Neural Networks: A Database Marketing Application

In this paper, we seek to validate existing theory on, and develop additional insight into, repeat purchasing behaviour in a direct marketing setting by means of a case study. The case involves detecting and qualifying the most relevant RFM (Recency, Frequency and Monetary) features using a wrapped feature selection method in a neural network context. Results indicate that eliminating redundant or irrelevant features with this method significantly reduces model complexity without degrading generalisation ability. This reduction in turn makes it possible to draw marketing conclusions about the relative importance of the RFM predictor categories. The empirical findings highlight the importance of using all three RFM variables in combination to predict repeat purchase behaviour. However, the study also reveals the dominant role of the frequency variables: a model including only frequency variables still yields satisfactory classification accuracy compared to the optimally reduced model.
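The general idea of a wrapper can be illustrated with a minimal sketch. It assumes a greedy backward elimination search, which the abstract does not confirm as the procedure actually used, and it replaces the Bayesian regularisation neural network with a simple nearest-centroid classifier so that the example stays self-contained. The function names, the `tolerance` parameter, the feature names and the toy data are all hypothetical; the data are synthetic, with one column made informative by construction, and do not reproduce the data or results of the study.

```python
import random


def backward_elimination(features, evaluate, tolerance=0.0):
    """Greedy wrapper search (assumed procedure, not necessarily the paper's).

    Repeatedly drops the feature whose removal gives the highest validation
    score, and stops once every possible removal would fall more than
    `tolerance` below the best score seen so far.
    """
    selected = list(features)
    current = evaluate(selected)
    best = current
    while len(selected) > 1:
        # Score every subset obtained by removing exactly one feature.
        candidates = [
            (evaluate([f for f in selected if f != g]), g) for g in selected
        ]
        score, to_drop = max(candidates)
        if score < best - tolerance:
            break  # any further removal degrades validation performance
        selected.remove(to_drop)
        current = score
        best = max(best, score)
    return selected, current


def centroid_accuracy(train, valid, cols):
    """Stand-in model: nearest-centroid classifier restricted to `cols`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    correct = 0
    for x, y in valid:
        dist = {
            label: sum((x[c] - m) ** 2 for c, m in zip(cols, centre))
            for label, centre in centroids.items()
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(valid)


def make_toy_data(n, rng):
    """Synthetic data: 'frequency' is strongly informative by construction."""
    data = []
    for i in range(n):
        y = i % 2  # alternate labels so both classes are present
        row = {
            "recency": rng.gauss(0.5 * y, 1.0),
            "frequency": rng.gauss(4.0 * y, 1.0),
            "monetary": rng.gauss(0.5 * y, 1.0),
            "noise": rng.gauss(0.0, 1.0),  # irrelevant feature
        }
        data.append((row, y))
    return data


rng = random.Random(0)
train = make_toy_data(300, rng)
valid = make_toy_data(100, rng)
features = ["recency", "frequency", "monetary", "noise"]


def evaluate(cols):
    return centroid_accuracy(train, valid, cols)


full_score = evaluate(features)
selected, reduced_score = backward_elimination(features, evaluate)
print("full model accuracy:   ", full_score)
print("selected features:     ", selected)
print("reduced model accuracy:", reduced_score)
```

Because the search only ever calls `evaluate`, the same loop would work with any classifier, including a regularised neural network, at the cost of retraining the model once per candidate subset. With `tolerance=0.0`, the reduced model never scores below the full model on the validation set used for selection; an independent test set would still be needed to estimate generalisation ability.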
