A class of logistic‐type discriminant functions

In two-group discriminant analysis, the Neyman-Pearson Lemma establishes that the receiver operating characteristic (ROC) curve for an arbitrary linear function lies nowhere above the ROC curve for the true likelihood ratio. The weighted area between these two curves can be used as a risk function for finding good discriminant functions. The weight function reflects the objective of the analysis, for example to minimise the expected cost of misclassification or to maximise the area under the ROC curve. The resulting discriminant functions can be estimated by iteratively reweighted logistic regression. We investigate some asymptotic properties in the 'near-logistic' setting, in which the covariates are assumed to have been chosen so that a linear function gives a reasonable, but not necessarily exact, approximation to the true log likelihood ratio. Some examples are discussed, including a study of medical diagnosis in breast cytology. Copyright Biometrika Trust 2002, Oxford University Press.
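The ROC ordering described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: the two normal populations, the sample size, and the helper names `empirical_auc` and `log_lr` are all assumptions made for illustration. It compares the empirical area under the ROC curve for a linear score with that for the true log likelihood ratio, in a case where the log likelihood ratio is quadratic rather than linear in the covariate.

```python
import numpy as np

def empirical_auc(scores_neg, scores_pos):
    """Mann-Whitney estimate of P(score_pos > score_neg); ties count as 1/2."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

# Hypothetical populations (an assumption for illustration only):
# group 0 is N(0, 1), group 1 is N(0, 3^2), one covariate.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(0.0, 1.0, n)
x1 = rng.normal(0.0, 3.0, n)

def log_lr(x, sd1=3.0):
    # log f1(x) - log f0(x) for N(0, sd1^2) against N(0, 1);
    # the shared normalising constant cancels.
    return -np.log(sd1) - 0.5 * x**2 / sd1**2 + 0.5 * x**2

# Linear score: the covariate itself. Both groups share the same mean,
# so a linear function carries almost no discriminating information.
auc_linear = empirical_auc(x0, x1)

# True log likelihood ratio: quadratic in x, so it separates the groups
# through their difference in spread.
auc_lr = empirical_auc(log_lr(x0), log_lr(x1))

print(f"AUC, linear score:         {auc_linear:.3f}")
print(f"AUC, log likelihood ratio: {auc_lr:.3f}")
```

In this deliberately extreme setup the linear score performs close to chance while the likelihood ratio does much better. In the 'near-logistic' setting the abstract considers, the gap between the two curves would be far smaller.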
