Data Mining and Causal Modeling of Customer Behaviors

This paper shows how to apply data-mining and modeling methods to learn predictive models of customer behaviors from survey and behavioral data. The models predict the rates at which individual customers move among states, including product adds and drops and account attrition. A key insight is that classification tree algorithms from data mining can be used to test conditional independence (CI) relations among variables in large multivariate data sets. This suggests constructive techniques for (i) building causal graph models from data and (ii) using data to define the states of a dynamic transition process. The resulting models can be used to help optimize product offers, forecast demand for products, and plan marketing campaigns. We use several real data sets to illustrate how to: (a) develop predictive models from survey data and from billing data; (b) validate model assumptions by using classification trees to identify and test conditional independence relations; (c) evaluate model performance against other models (e.g., logistic regression or discriminant analysis) using cross-validation; and (d) recommend the next logical product to offer to each customer, and the best customers to target for each product, in order to maximize sales.
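
To make the conditional independence idea concrete, the following is a minimal sketch and not the algorithm used in the paper. It assumes a simple greedy tree grown with entropy-based information gain, and it uses a small constructed data set rather than real customer data. The variable names X, Y, and Z, the helper functions, and the min_gain threshold are all hypothetical. The data are built so that Z influences both X and Y, which makes X and Y marginally dependent but conditionally independent given Z. If a tree for Y may split on Z, it should then find no further use for X.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, feature, target):
    """Reduction in entropy of `target` from splitting `rows` on `feature`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def split_variables(rows, features, target, min_gain=0.01):
    """Grow a greedy tree and return the set of features it splits on.

    min_gain is an assumed stopping threshold, chosen for illustration only.
    """
    if not features or not rows:
        return set()
    gains = {f: info_gain(rows, f, target) for f in features}
    best = max(gains, key=gains.get)
    if gains[best] < min_gain:
        return set()
    used = {best}
    rest = [f for f in features if f != best]
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        used |= split_variables(subset, rest, target, min_gain)
    return used


def make_rows(z, x, y, count):
    """Return `count` identical records (constructed, not real, data)."""
    return [{"Z": z, "X": x, "Y": y} for _ in range(count)]


# Within each Z stratum the rate of Y=1 is the same for X=0 and X=1
# (0.1 when Z=0, 0.7 when Z=1), so Y is independent of X given Z.
rows = (
    make_rows(0, 0, 0, 72) + make_rows(0, 0, 1, 8)
    + make_rows(0, 1, 0, 18) + make_rows(0, 1, 1, 2)
    + make_rows(1, 0, 0, 6) + make_rows(1, 0, 1, 14)
    + make_rows(1, 1, 0, 24) + make_rows(1, 1, 1, 56)
)

print("Splits when only X is available:", split_variables(rows, ["X"], "Y"))
print("Splits when X and Z are available:", split_variables(rows, ["X", "Z"], "Y"))
```

On this constructed data the tree splits on X when X is the only candidate, but splits on Z alone once Z is available. That is the pattern expected when Y is conditionally independent of X given Z. With real, noisy data the gain within each stratum would rarely be exactly zero, so the stopping rule (here a fixed threshold) would need to be replaced by a proper statistical test or pruning criterion.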