Partial Separation in Logistic Discrimination

SUMMARY The problem of the existence of maximum likelihood estimates in logistic discrimination is receiving growing attention in the literature. Existence is known to depend strongly on the configuration of the observed data. Here we focus on the practically important case where the group samples form a set of clusters that are completely separated from each other, which we call 'partial separation'. The classical Fisher iris data set is a good example of this problem. A non-existence theorem is proven, and a general algorithm is developed to help distinguish separation from multicollinearity when maximum likelihood estimation diverges.