Estimation of disease probabilities conditioned on symptom variables

Abstract: The use of a recent, nearly assumption-free method of statistical discrimination in single-stage medical diagnosis is explored. The method applies Bayes' theorem to linearly smoothed estimates of the joint symptom probabilities. The linear smoothing filter is optimized by an empirical performance-scoring technique that uses a single sample for both development and verification. The method has four advantages: it accepts a wide variety of nonnormal symptom variables, ranging from dichotomous to seemingly continuous; it does not require the symptom variables to be independent; it can use a large number of variables simultaneously, on the order of the number of patients observed; and it handles missing observations by creating, for each incompletely observed variable, a dichotomous dummy variable that indicates the missing values. Its disadvantage is that it does not automatically identify the important variables, as the standard discriminant-function method does when it applies.
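To make the ingredients named in the abstract concrete, the following Python sketch shows one way they could fit together for dichotomous symptoms. It is an illustration under stated assumptions, not the filter described in the paper: the geometric Hamming-distance weights, the leave-one-out log score, and all function names (`add_missing_indicators`, `smoothed_likelihood`, `posterior`, `leave_one_out_score`) are hypothetical choices made here. The smoothing parameter `lam` is assumed to satisfy 0 < lam <= 1, and each disease group is assumed to contain at least two patients.

```python
import math


def add_missing_indicators(record):
    """Replace each missing value (None) with 0 and append a 0/1 dummy for it.

    Simplification: a dummy is added for every variable, not only for the
    incompletely observed ones.
    """
    out = []
    for value in record:
        out.append(0 if value is None else value)
        out.append(1 if value is None else 0)
    return tuple(out)


def hamming(a, b):
    """Number of positions at which two symptom patterns differ."""
    return sum(u != v for u, v in zip(a, b))


def smoothed_likelihood(x, patterns, lam):
    """Smoothed estimate of P(x | disease) from one disease's observed patterns.

    Assumed filter: each observed pattern y contributes lam ** hamming(x, y).
    The estimate is linear in the raw pattern frequencies and sums to 1 over
    all binary patterns of the same length.
    """
    weight = sum(lam ** hamming(x, y) for y in patterns)
    return weight / (len(patterns) * (1 + lam) ** len(x))


def posterior(x, samples, lam):
    """Bayes' theorem with priors taken from the sample disease frequencies."""
    n = sum(len(p) for p in samples.values())
    joint = {d: len(p) / n * smoothed_likelihood(x, p, lam)
             for d, p in samples.items()}
    total = sum(joint.values())
    return {d: v / total for d, v in joint.items()}


def leave_one_out_score(samples, lam):
    """Log score of each patient's true disease, with that patient held out.

    Assumed stand-in for scoring performance on the same single sample
    that is used to build the estimates.
    """
    score = 0.0
    for d, pats in samples.items():
        for i, x in enumerate(pats):
            held = dict(samples)
            held[d] = pats[:i] + pats[i + 1:]
            score += math.log(posterior(x, held, lam)[d])
    return score


# Toy data (invented for illustration): two diseases, two binary symptoms.
samples = {
    "flu": [(1, 1), (1, 0), (1, 1)],
    "cold": [(0, 0), (0, 1), (0, 0)],
}
best_lam = max([0.1, 0.3, 0.6, 1.0],
               key=lambda lam: leave_one_out_score(samples, lam))
print(best_lam, posterior((1, 1), samples, best_lam))
```

In this sketch, `lam` near 0 approaches the raw cell frequencies and `lam = 1` spreads each observation evenly over all patterns, so selecting `lam` by the held-out score trades off between the two extremes. Because the estimate works on whole symptom patterns rather than on one variable at a time, it does not assume that the symptoms are independent.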