Regression with a binary independent variable subject to errors of observation
In a recent study of the socio-economic effects of the disease bilharzia on the population of the Caribbean island of St. Lucia, Weisbrod et al. (1973) make abundant use of regression methods to analyze the various relationships of interest. A common independent variable in their work is a dummy variable that indicates the presence or absence of the disease in the person sampled (the observation unit). The nature of diagnosis for this particular disease is such that if a person is diagnosed as having the disease, he does indeed have it. However, if the diagnosis is negative there is a non-zero probability, q, that he has been diagnosed incorrectly and has the disease. It is the intent of this note to consider the effects of such an independent variable - a binary variable subject to ‘errors of classification’ - in least squares regression. The results, both for the bias in least squares parameter estimates and for the availability of alternative estimators, parallel the classical case in which both the variable and its measurement error are continuous random variables. Many details will not be repeated here in deference to the reader’s familiarity with that subject. The important practical difference between the two cases is that the information needed to obtain consistent (or nearly consistent) parameter estimates may be more readily available in the discrete case. In the study cited, for instance, extraneous information about q is available from patient medical histories and examination data. The following section of the present article contains a brief exposition of the nature of a discrete ‘classification error’. Previous authors who have treated this material include Neyman (1950), Bross (1954) and Lord and Novick (1968). Sect. 3 then takes up an analysis of the effects of including an independent variable subject to such measure
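The one-sided classification error described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article. It assumes a simple linear model y = alpha + beta * D + e with a true disease indicator D, a hypothetical miss rate `miss` for diseased persons, no false positives, and an error term e that is independent of the diagnosis given D. The function name and all parameter values are illustrative assumptions. Under these assumptions a negative diagnosis leaves probability q = p * miss / (p * miss + 1 - p) of disease, and the least squares slope on the observed indicator (which for a single binary regressor equals the difference in group means) comes out close to beta * (1 - q) rather than beta.

```python
import random


def simulate_slope(n=200_000, p=0.4, miss=0.3, alpha=1.0, beta=-2.0, seed=0):
    """Simulate a binary regressor with one-sided misclassification.

    All names and values are illustrative assumptions, not the article's:
    p    = true prevalence of the disease,
    miss = probability that a diseased person is diagnosed negative.
    Returns (least squares slope on the observed indicator, q).
    """
    rng = random.Random(seed)
    sum1 = sum0 = 0.0
    n1 = n0 = 0
    for _ in range(n):
        d = 1 if rng.random() < p else 0                 # true disease status
        # A positive diagnosis occurs only for truly diseased persons.
        x = 1 if (d == 1 and rng.random() >= miss) else 0
        y = alpha + beta * d + rng.gauss(0.0, 1.0)       # outcome depends on true status
        if x == 1:
            sum1 += y
            n1 += 1
        else:
            sum0 += y
            n0 += 1
    # With one binary regressor, the OLS slope is the difference in group means.
    slope = sum1 / n1 - sum0 / n0
    # q = P(diseased | diagnosed negative), by Bayes' rule.
    q = p * miss / (p * miss + (1.0 - p))
    return slope, q


if __name__ == "__main__":
    slope, q = simulate_slope()
    print("q:", round(q, 4))
    print("naive slope:", round(slope, 3))
    print("slope / (1 - q):", round(slope / (1.0 - q), 3))
```

Dividing the naive slope by 1 - q recovers a value close to beta in this setup, which shows how outside information about q could be used. Whether the article's alternative estimators take this form is not stated in the excerpt above.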
[1] Jerzy Neyman, et al. First course in probability and statistics, 1951.
[2] M. R. Novick, et al. Statistical Theories of Mental Test Scores, 1971.
[3] J. Neyman. First course in probability and statistics, 1951.
[4] R. Fisher. The Advanced Theory of Statistics, 1943, Nature.
[5] W. G. Cochran. Errors of Measurement in Statistics, 1968.
[6] I. Bross. Misclassification in 2 X 2 Tables, 1954.