Maximum Shannon Information Content of Diagnostic Medical Testing

The increase in Shannon information obtained by grading diagnostic test results into many outcomes, rather than simply positive or negative, was examined to determine its upper limit as the number of test outcomes is increased indefinitely. Numerical methods were employed to find the optimal locations of outcome boundaries when a single normally distributed test variable is classified into 2, 3, 4, 5, 6, 8, 14, or 20 outcome categories. In each case Shannon information was computed for values of prior probability between 0.01 and 0.99 and for distances between the means in diseased and nondiseased populations ranging from 0.5 to 5.0 standard deviations. There was an important improvement in Shannon information as the number of outcomes was increased, but the increment in information diminished rapidly with each additional category. A 20%-30% increment in information may be achieved with three outcomes instead of two. A further important increase in information occurred with four to seven outcomes, but beyond this the increment in information was negligible. The findings were similar over a wide range of prior probabilities and distances between the means. The analysis was extended to the case of multiple nonindependent tests by applying the same approach to a Fisher discriminant function incorporating such tests. It was concluded that for normally distributed test variables: grading of test results significantly improves the information content of both single and multiple tests; the information content for 8-20 outcomes represents very nearly the maximum information content of a test; there is little value in using more than five to seven test outcomes; and multiple grading should not be neglected for discriminant functions.
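
The quantity discussed above can be illustrated with a minimal sketch. It treats the Shannon information of a graded test as the mutual information, in bits, between disease status and the outcome category. Several points are assumptions for illustration only and are not taken from the study: both populations are given unit variance, the nondiseased mean is placed at 0 and the diseased mean at d, logarithms are base 2, and the cut points are hand-picked rather than numerically optimized. The function names (norm_cdf, entropy, category_probs, shannon_information) are hypothetical.

```python
import math


def norm_cdf(x, mean=0.0):
    """CDF of a normal distribution with the given mean and unit variance (assumed)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))


def entropy(probs):
    """Shannon entropy in bits; zero-probability categories contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def category_probs(cuts, mean):
    """Probability of each outcome category defined by the sorted cut points."""
    edges = [-math.inf] + sorted(cuts) + [math.inf]
    cdf = [norm_cdf(e, mean) for e in edges]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


def shannon_information(prior, d, cuts):
    """Mutual information I(D; T) = H(T) - H(T | D) in bits.

    prior: prior probability of disease.
    d:     distance between population means, in standard deviations.
    cuts:  outcome boundaries; len(cuts) + 1 outcome categories result.
    """
    p_nondiseased = category_probs(cuts, 0.0)
    p_diseased = category_probs(cuts, d)
    p_outcome = [prior * a + (1.0 - prior) * b
                 for a, b in zip(p_diseased, p_nondiseased)]
    conditional = prior * entropy(p_diseased) + (1.0 - prior) * entropy(p_nondiseased)
    return entropy(p_outcome) - conditional


if __name__ == "__main__":
    prior, d = 0.5, 2.0
    two = shannon_information(prior, d, [1.0])              # positive / negative
    four = shannon_information(prior, d, [0.0, 1.0, 2.0])   # illustrative, not optimal
    print(f"2 outcomes: {two:.4f} bits")
    print(f"4 outcomes: {four:.4f} bits")
```

Because the four-category boundaries here refine the two-category boundary, the computed information cannot decrease. The size of the gain depends on the cut points, so reproducing the reported increments would require optimizing the boundaries for each prior probability and distance between the means.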