How to read a paper: Papers that report diagnostic or screening tests

If you are new to the concept of validating diagnostic tests, the following example may help you. Ten men are awaiting trial for murder. Only three of them actually committed a murder; the other seven are innocent of any crime. A jury hears each case and finds six of the men guilty of murder. Two of the convicted are true murderers. Four men are wrongly imprisoned. One murderer walks free.

This information can be expressed in what is known as a two by two table (table 1). Note that the "truth" (whether or not the men really committed a murder) is expressed along the horizontal title row, whereas the jury's verdict (which may or may not reflect the truth) is expressed down the vertical title column.

Table 1: Two by two table showing outcome of trial for 10 men accused of murder

These figures, if they are typical, reflect several features of this particular jury: it correctly identifies two of every three true murderers; it correctly acquits three of every seven innocent men; a man it finds guilty has only a one in three chance of actually being a murderer; a man it acquits has a three in four chance of actually being innocent; and in five cases out of 10 it reaches the right verdict. These five features constitute, respectively, the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of this jury's performance.

The rest of this article considers these five features applied to diagnostic (or screening) tests when compared with a "true" diagnosis or gold standard. A sixth feature, the likelihood ratio, is introduced at the end of the article.

Our window cleaner told me that he had been feeling thirsty recently and had …
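The five measures named above can be worked out directly from the counts in the jury example. The following Python sketch is illustrative only: the function name `two_by_two_summary` and the variable names are hypothetical and do not come from the article, and the formulas are the standard definitions of each measure, with a guilty verdict treated as a "positive test" and actual guilt as the "gold standard".

```python
from fractions import Fraction

# Counts taken from the jury example: 10 men, 3 of them true murderers.
true_positive = 2   # murderers found guilty
false_positive = 4  # innocent men found guilty
false_negative = 1  # murderer acquitted
true_negative = 3   # innocent men acquitted


def two_by_two_summary(tp, fp, fn, tn):
    """Return the five standard measures for a two by two table.

    Hypothetical helper for illustration; exact fractions are used so
    the results can be read as "two in three", "three in seven", etc.
    """
    total = tp + fp + fn + tn
    return {
        # Proportion of true murderers whom the jury convicts.
        "sensitivity": Fraction(tp, tp + fn),
        # Proportion of innocent men whom the jury acquits.
        "specificity": Fraction(tn, tn + fp),
        # Chance that a convicted man is really a murderer.
        "positive predictive value": Fraction(tp, tp + fp),
        # Chance that an acquitted man is really innocent.
        "negative predictive value": Fraction(tn, tn + fn),
        # Proportion of all verdicts that are correct.
        "accuracy": Fraction(tp + tn, total),
    }


summary = two_by_two_summary(
    true_positive, false_positive, false_negative, true_negative
)

for name, value in summary.items():
    print(f"{name}: {value} ({float(value):.2f})")
```

Exact fractions are used rather than floating point numbers so that each result maps back to a statement such as "two of every three true murderers"; with only 10 cases, these proportions would of course carry wide uncertainty.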
