An assessment of data quality in the Vermont-Oxford Trials Network database.

The Vermont-Oxford Trials Network is a voluntary collaborative research group of neonatologists that maintains a database for very low birthweight infants (501-1500 g). The database (1) provides core data for randomized trials, (2) serves as a resource for outcomes research in neonatology, and (3) generates quality management reports for participating sites. To assess the reliability of this database and to determine the sources of error, we reviewed 635 medical records chosen at random from among the 4341 eligible infants born at 40 participating data-generating sites during an 18-month period beginning January 1, 1990. The estimated frequencies of disagreement between the medical record and the database for each of the 10 data items studied, with the standard errors of the estimates in parentheses, were: date of birth 1.3% (0.4), date of admission 2.5% (0.6), date of discharge 8.8% (1.0), birthweight (difference > 50 g) 2.9% (0.6), location of birth (inborn or outborn) 2.1% (0.5), multiple birth 2.2% (0.5), cesarean section 2.5% (0.6), gender 2.1% (0.5), status 28 days after birth 3.4% (0.6), final status 2.9% (0.6). The overall proportions and mean values for items in the database were close to the estimated values based on the random sample of records. Of the 247 disagreements between the database and the medical records in the sample, 23 were due to data keying errors and 224 were due to errors in transcription or interpretation. The rate of data keying errors decreased from more than 50 errors per 10,000 fields to fewer than 15 errors per 10,000 fields when specific quality control procedures, including visual inspection, were instituted. Data keying errors accounted for 13.7% of all disagreements between the database and the medical record before improved data entry methods were introduced, and only 3.7% of all disagreements after they were introduced. We concluded that the Vermont-Oxford Trials Network database is reliable.
Data keying errors have been reduced by the introduction of additional quality control measures. Further reductions in database errors will require measures aimed at minimizing transcription or interpretation errors by individuals completing the data forms.
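
The standard errors quoted above can be approximated with a short calculation. The sketch below is illustrative only: it assumes simple random sampling without replacement of 635 records from 4341 and applies a finite population correction. The abstract does not state which estimator or sampling design was actually used, so the computed values need not match the reported ones exactly. The function name proportion_se and the variable names are hypothetical.

```python
import math


def proportion_se(p, n, population=None):
    """Standard error of a sample proportion p from n sampled records.

    If population is given, apply the finite population correction
    (1 - n / population). This assumes simple random sampling without
    replacement, which the abstract does not confirm.
    """
    variance = p * (1.0 - p) / n
    if population is not None:
        variance *= 1.0 - n / population
    return math.sqrt(variance)


SAMPLE_SIZE = 635   # medical records reviewed
POPULATION = 4341   # eligible infants

# Disagreement rates (percent) as reported in the abstract.
reported_rates = {
    "date of birth": 1.3,
    "date of admission": 2.5,
    "date of discharge": 8.8,
    "birthweight (difference > 50 g)": 2.9,
    "location of birth": 2.1,
    "multiple birth": 2.2,
    "cesarean section": 2.5,
    "gender": 2.1,
    "status 28 days after birth": 3.4,
    "final status": 2.9,
}

for item, pct in reported_rates.items():
    se = proportion_se(pct / 100.0, SAMPLE_SIZE, POPULATION)
    print(f"{item}: {pct:.1f}% (approx. SE {100.0 * se:.2f} percentage points)")

# Share of all sampled disagreements attributed to data keying,
# using the counts given in the abstract (23 of 247).
keying_share = 23 / 247
print(f"keying errors: {100.0 * keying_share:.1f}% of all disagreements")
```

The correction factor shrinks each standard error slightly because the sample covers a non-trivial fraction of the eligible infants; omitting the population argument gives the uncorrected estimate for comparison.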