Automated Cellular Modeling and Prediction on a Large Scale

We describe CHAMP (CHurn Analysis, Modeling, and Prediction), an automated system for modeling cellular subscriber churn, that is, predicting which customers will discontinue cellular phone service. We describe various issues related to developing and deploying this system, including automating data access from a remote data warehouse, preprocessing, feature selection, model validation, and optimization to reflect business tradeoffs. Using data from GTE's data warehouse for cellular phone customers, CHAMP is capable of developing churn models customized by region for over one hundred GTE cellular phone markets totaling over 5 million customers. Every month, churn factors are identified for each geographic region and models are updated to generate churn scores predicting who is likely to churn in the short term. Learning methods such as decision trees and genetic algorithms are used for feature selection, and a cascade neural network is used for predicting churn scores. In addition to producing churn scores, CHAMP also produces qualitative results in the form of rules and comparisons of market trends, which are disseminated through a web-based interface.
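To make the idea of optimizing churn scores against business tradeoffs concrete, the following is a minimal sketch, not CHAMP's actual procedure. It assumes a simple hypothetical payoff model: contacting a subscriber costs `cost_per_contact`, and contacting a true churner yields `value_per_saved`. The function names, parameters, and the payoff model are all illustrative assumptions that do not come from the abstract.

```python
from typing import List, Sequence, Tuple


def rank_by_score(scores: Sequence[float], churned: Sequence[int]) -> List[Tuple[float, int]]:
    """Pair each churn score with its 0/1 outcome and sort by score, highest first."""
    return sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)


def lift_at(scores: Sequence[float], churned: Sequence[int], fraction: float) -> float:
    """Churn rate in the top `fraction` of scored subscribers divided by the overall rate."""
    ranked = rank_by_score(scores, churned)
    k = max(1, round(fraction * len(ranked)))
    base_rate = sum(churned) / len(churned)
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    return top_rate / base_rate


def best_cutoff(
    scores: Sequence[float],
    churned: Sequence[int],
    value_per_saved: float,
    cost_per_contact: float,
) -> Tuple[int, float]:
    """Return (k, payoff): how many top-scored subscribers to contact to maximize payoff.

    Hypothetical payoff model (an assumption, not from the source): every contact
    costs `cost_per_contact`, and every contacted churner returns `value_per_saved`.
    """
    best_k, best_payoff = 0, 0.0
    payoff = 0.0
    for k, (_, outcome) in enumerate(rank_by_score(scores, churned), start=1):
        payoff += value_per_saved * outcome - cost_per_contact
        if payoff > best_payoff:
            best_k, best_payoff = k, payoff
    return best_k, best_payoff
```

Under this sketch, a model with higher lift concentrates churners near the top of the ranking, so the payoff-maximizing cutoff contacts fewer subscribers for the same number of churners reached. A real deployment would estimate the payoff terms from business data rather than fix them as constants.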
