Ray Chambers, University of Wollongong
James Chipperfield, Australian Bureau of Statistics
Walter Davis, Statistics New Zealand
Milorad Kovacevic, Statistics Canada

Abstract

Data obtained by probability linkage of administrative registers will contain errors because some linked records combine data items sourced from different individuals. If ignored, such errors can bias standard statistical analyses. In this paper we describe some approaches to eliminating this bias when parametric inference is based on the solution of an estimating equation, with an emphasis on linear and logistic regression analysis. We present simulation results that illustrate the gains from allowing for linkage error when these analyses are carried out on probabilistically linked data, together with extensions of the approach to more complex linkage situations. In particular, we explore issues that arise when sample records are linked to administrative records, and when the target of inference is the solution to the estimating equation defined by the perfectly linked data. We also describe a substantial application that uses these ideas to identify the major sources of error when modeling data obtained by probabilistically linking two successive Australian censuses.
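The bias described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the estimator developed in the paper: it assumes an exchangeable linkage error model in which each record is correctly linked with a known probability `lam` and is otherwise linked to a response drawn uniformly from the other records. Under that assumption the linked response has expectation `E_A X beta`, where `E_A` has `lam` on the diagonal and `(1 - lam) / (n - 1)` elsewhere, so regressing the linked response on `E_A X` instead of `X` removes the bias. All function names, parameter values, and the data-generating model are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_linked_data(n, lam, beta, rng):
    """Simulate y = b0 + b1*x + noise, then mislink a fraction (1 - lam) of the y values."""
    x = rng.normal(size=n)
    y = beta[0] + beta[1] * x + rng.normal(scale=0.5, size=n)
    # Pick the records to mislink (returned in random order) and shift their
    # responses cyclically, so each one receives another record's y value.
    m = int(round((1 - lam) * n))
    idx = rng.choice(n, size=m, replace=False)
    y_star = y.copy()
    y_star[idx] = y[np.roll(idx, 1)]
    return x, y_star


def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def adjusted_design(X, lam):
    """Compute E_A @ X without forming the n x n matrix E_A (assumed exchangeable model)."""
    n = X.shape[0]
    off = (1 - lam) / (n - 1)
    return (lam - off) * X + off * X.sum(axis=0, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(2009)
    lam, beta = 0.8, (1.0, 2.0)
    x, y_star = simulate_linked_data(5000, lam, beta, rng)
    X = np.column_stack([np.ones_like(x), x])
    print("naive OLS on linked data:", ols(X, y_star))
    print("adjusted for linkage error:", ols(adjusted_design(X, lam), y_star))
```

In this setup the naive slope is attenuated towards roughly `lam` times the true slope, while the adjusted fit recovers a slope close to the true value. In practice `lam` is not known and would have to be estimated, for example from a clerically reviewed audit sample, which this sketch does not attempt.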