Selection of Transformations of Continuous Predictors in Logistic Regression

Binary logistic regression is a machine learning tool for classification and discrimination that is widely used in business analytics and medical research. Transforming continuous predictors to improve the performance of a logistic regression model is common practice, but no systematic method for finding optimal transformations exists in the statistical or data mining literature. This paper considers the problem of selecting such transformations. The proposed method is based on the point-biserial correlation coefficient between the binary response and a continuous predictor. Several examples illustrate the method.
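As a rough illustration of the quantity involved, the sketch below computes the point-biserial correlation between a binary response and several transformed versions of one continuous predictor, then ranks the transformations by absolute correlation. The abstract does not state the selection rule, so ranking by absolute correlation is an assumption made here, not necessarily the paper's procedure. The candidate set (`CANDIDATES`), the helper names, and the toy data are likewise hypothetical.

```python
import math
from statistics import fmean, pstdev


def point_biserial(x, y):
    """Point-biserial correlation between continuous x and binary y (coded 0/1).

    Uses r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2), where s_n is the
    population standard deviation of x. This equals the Pearson correlation
    between x and y.
    """
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    n1, n0, n = len(x1), len(x0), len(x)
    s_n = pstdev(x)
    return (fmean(x1) - fmean(x0)) / s_n * math.sqrt(n1 * n0 / n**2)


# Hypothetical candidate transformations (all require positive x here).
CANDIDATES = {
    "identity": lambda v: v,
    "log": math.log,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
}


def rank_transformations(x, y, candidates=CANDIDATES):
    """Return (name, r_pb) pairs sorted by |r_pb|, largest first.

    Assumption: a larger absolute correlation marks a better candidate.
    """
    scores = {
        name: point_biserial([f(v) for v in x], y)
        for name, f in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up, right-skewed predictor with a binary response.
x = [1, 2, 3, 4, 50, 60, 200, 400]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_transformations(x, y)
for name, r in ranking:
    print(f"{name:>8}: r_pb = {r:.3f}")
```

In practice, the top-ranked candidates would still need to be checked by fitting the logistic regression with each transformed predictor and comparing fit on held-out data, since a high marginal correlation does not guarantee a better model.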