Bayesian Analysis of Binary Data Subject to Misclassification

This paper considers estimation of success probabilities for binary categorical data subject to misclassification errors, from the Bayesian point of view. Bross (1954) showed that sample proportions are in general biased estimates. This bias is a function of the amount of misclassification and can be substantial. Tenenbein (1970) proposed to eliminate the bias by subjecting a portion of the sample to both true and fallible classifiers, resulting in a 2 x 2 table from which the misclassification rates can be estimated. The rationale is that fallible classifiers are inexpensive relative to infallible ones. Hence, if only a part of the sample is measured by the infallible classifier, one can obtain a more efficient estimate, for a given sampling budget, than by measuring the whole sample using the infallible classifier. In many contexts, however, an infallible classifier is unavailable or prohibitively expensive. Bayesian methods then provide a useful approach for dealing with the consequent nonidentifiability problems that arise when we want to carry out inference. In this paper we treat both the single-measurement and the repeated-measurements cases (the former being a special case of the latter) from a Bayesian point of view. The posterior analyses are carried out using both Gauss-Jacobi quadrature and Gibbs sampling. Through examples it is shown that in most cases Gauss-Jacobi quadrature produces very good approximations, in terms of both accuracy and speed of computation. The Gibbs sampler requires more computation to reach the same level of accuracy as Gauss-Jacobi quadrature.
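The bias described above can be illustrated with a small calculation. The sketch below is not taken from the paper: the function names and the example rates are hypothetical, and it assumes a single fallible classifier with a fixed false-positive rate and false-negative rate. Under that assumption, the expected sample proportion is p(1 - false_neg) + (1 - p) false_pos rather than p. The last function inverts this relation when the rates are treated as known, which is a simple moment-style correction and not the Bayesian analysis the abstract refers to.

```python
def expected_observed_proportion(p, false_pos, false_neg):
    """Expected sample proportion when each unit is classified by a
    fallible classifier (hypothetical helper, not from the paper).

    p         : true success probability
    false_pos : P(classified as success | true failure)
    false_neg : P(classified as failure | true success)
    """
    return p * (1.0 - false_neg) + (1.0 - p) * false_pos


def bias(p, false_pos, false_neg):
    """Bias of the naive sample proportion as an estimate of p."""
    return expected_observed_proportion(p, false_pos, false_neg) - p


def corrected_proportion(observed, false_pos, false_neg):
    """Invert the relation above, assuming the error rates are known
    and false_pos + false_neg < 1 (an assumption of this sketch)."""
    denom = 1.0 - false_pos - false_neg
    if denom <= 0.0:
        raise ValueError("requires false_pos + false_neg < 1")
    return (observed - false_pos) / denom


if __name__ == "__main__":
    # Illustrative values only: a rare outcome with modest error rates.
    p, fp, fn = 0.10, 0.05, 0.10
    observed = expected_observed_proportion(p, fp, fn)
    print(f"expected observed proportion: {observed:.3f}")
    print(f"bias of naive estimate:       {bias(p, fp, fn):+.3f}")
    print(f"corrected proportion:         {corrected_proportion(observed, fp, fn):.3f}")
```

With these illustrative values the naive proportion overstates a true rate of 0.10 by about a third, which shows how even modest error rates can produce substantial bias when the outcome is rare. When the error rates are unknown and no infallible classifier is available, p and the two rates cannot be separated from the observed proportion alone, which is the nonidentifiability problem the abstract mentions.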
