Improving upon logistic regression to predict United States Army delayed entry program (DEP) losses

We improve upon McFadden's use of logistic regression for choice analysis by investigating neural networks, support vector machines, and random forests as functional approximations. The choice is that of an Army enlisted applicant, who either honors the enlistment contract by shipping to basic combat training or does not honor the contract and becomes a DEP Loss. An Army enlisted applicant is a person who signs an active duty enlistment contract with the United States Army. The enlistment contract contains various terms, such as the length of service, the Army job (military occupational specialty (MOS)), special schooling received by the applicant, and incentives. A shipper is an Army enlisted applicant who initially honors their Army contract. A DEP Loss is an Army enlisted applicant who does not honor their Army contract. For these data, we find that both support vector machines and random forests outperform logistic regression, and that support vector machines outperform all other functional approximations considered. Performance is assessed using several metrics: error rate, type II error, and ROC curves.
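The three performance metrics named above can be made concrete with a minimal sketch. It assumes a binary coding in which 1 denotes a DEP Loss (the positive class) and 0 denotes a shipper, so that a type II error is a DEP Loss predicted to be a shipper. That coding, the 0.5 threshold, the function names, and the toy data are illustrative assumptions, not taken from the study. The ROC curve is summarized here by its area (AUC), computed as the fraction of (DEP Loss, shipper) pairs in which the DEP Loss receives the higher score.

```python
def error_rate(y_true, y_pred):
    """Fraction of applicants whose predicted class differs from the actual class."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def type_ii_error_rate(y_true, y_pred, positive=1):
    """False negative rate: share of actual positives predicted as negative."""
    false_neg = sum(1 for t, p in zip(y_true, y_pred)
                    if t == positive and p != positive)
    actual_pos = sum(1 for t in y_true if t == positive)
    return false_neg / actual_pos


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = DEP Loss, 0 = shipper.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
# Hypothetical classifier scores, read as the estimated probability of DEP Loss.
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.8, 0.3, 0.7]
# Assumed decision threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("error rate:        ", error_rate(y_true, y_pred))
print("type II error rate:", type_ii_error_rate(y_true, y_pred))
print("ROC AUC:           ", roc_auc(y_true, scores))
```

Any of the classifiers compared here (logistic regression, neural networks, support vector machines, random forests) could supply the scores; the metrics depend only on the actual labels and the model outputs.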