Two of the most widely used statistical methods for analyzing categorical outcome variables are linear discriminant analysis (LDA) and logistic regression (LR). Both are appropriate for developing linear classification models, but LDA makes more assumptions about the underlying data. Hence, LR is generally assumed to be the more flexible and more robust method when these assumptions are violated. In this paper we consider the problem of choosing between the two methods and propose some guidelines for making that choice. The methods are compared on several measures of predictive accuracy, and their performance is studied by simulation. We start with an example in which all the assumptions of LDA are satisfied and observe the impact of changes in the sample size, the covariance matrix, the Mahalanobis distance, and the direction of the difference between the group means. Next, we compare the robustness of the methods to categorisation and non-normality of the explanatory variables in a closely controlled way. We show that the results of LDA and LR are close whenever the normality assumptions are not too badly violated, and we give guidelines for recognizing these situations. We discuss the inappropriateness of LDA in all other cases.
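As a rough illustration of the baseline scenario described above (two normally distributed groups with a common covariance matrix), the following Python sketch simulates one data set and compares the test-set accuracy of the two linear rules. It is a minimal sketch under stated assumptions, not the simulation design of the paper: the function names, the sample sizes, the covariance matrix, the mean shift, and the use of plain classification accuracy as the only measure are all illustrative choices, and both classifiers are written directly in NumPy rather than taken from a statistics package.

```python
import numpy as np

def simulate_two_groups(n_per_group, mean_shift, cov, rng):
    """Draw two multivariate normal groups that share one covariance matrix."""
    p = cov.shape[0]
    x0 = rng.multivariate_normal(np.zeros(p), cov, size=n_per_group)
    x1 = rng.multivariate_normal(mean_shift, cov, size=n_per_group)
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_per_group), np.ones(n_per_group)])
    return X, y

def fit_lda(X, y):
    """Return (w, b) of the LDA rule: predict group 1 when X @ w + b > 0."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled within-group covariance estimate.
    S = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (n0 + n1 - 2)
    w = np.linalg.solve(S, m1 - m0)
    b = -0.5 * (m0 + m1) @ w + np.log(n1 / n0)
    return w, b

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit logistic regression with an intercept by Newton-Raphson steps."""
    A = np.hstack([np.ones((len(X), 1)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-A @ beta))
        weights = prob * (1.0 - prob)
        # Tiny ridge term (an assumption) keeps the Hessian invertible.
        H = A.T @ (A * weights[:, None]) + ridge * np.eye(A.shape[1])
        beta = beta + np.linalg.solve(H, A.T @ (y - prob))
    return beta[1:], beta[0]

def accuracy(w, b, X, y):
    """Share of observations classified into their true group."""
    return float(np.mean((X @ w + b > 0) == (y == 1)))

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative common covariance
shift = np.array([2.0, 0.0])               # illustrative difference in means
mahalanobis = float(np.sqrt(shift @ np.linalg.solve(cov, shift)))

X_train, y_train = simulate_two_groups(200, shift, cov, rng)
X_test, y_test = simulate_two_groups(5000, shift, cov, rng)

acc_lda = accuracy(*fit_lda(X_train, y_train), X_test, y_test)
acc_lr = accuracy(*fit_logistic(X_train, y_train), X_test, y_test)

print(f"Mahalanobis distance: {mahalanobis:.3f}")
print(f"LDA accuracy: {acc_lda:.3f}")
print(f"LR accuracy:  {acc_lr:.3f}")
```

Varying `n_per_group`, `cov`, and `shift` in this sketch corresponds loosely to the factors listed above; replacing the normal draws with categorised or skewed variables would be one way to probe the robustness question.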