Data quality in the Norwegian dairy herd recording system: agreement between the national database and disease recording on farm.

The majority of herds in Norway participate in the national dairy herd recording system. For disease events, this involves transferring information registered on farm, using individual cow health cards (CHC), to the central cattle database (CCD). Before data from such a database are used, they should be validated to describe their quality, but this is rarely done. In this study, diagnostic events from CHC and CCD from 74 dairy herds were compared. Events in 2008 from female cattle with a minimum age of 1 yr were included (n=1,738). Discrepancies between the 2 data sources and data quality were assessed by evaluating agreement between events on CHC and in CCD, calculating completeness and correctness for the CCD, and fitting a multivariable regression model for agreement (1/0). The agreement evaluation described the concordance between the 2 data sources, whereas the calculations of completeness and correctness depended on a reference data source assumed to be more reliable. Completeness of the CCD was defined as the proportion of diagnostic events on the CHC that was recorded therein. Correctness was defined as the proportion of the CCD events that was also recorded on the CHC with the same date and diagnostic code. The agreement was up to 87.5%, with the majority of disagreement caused by events on the CHC that were not reported to the CCD (between 10 and 12% of all events). Completeness of the CCD was regarded as high, between 0.87 and 0.88, and correctness as excellent, between 0.97 and 0.98. The multivariable regression model identified 4 factors that increased the odds of diagnostic events being in agreement between CHC and CCD: the event occurring during the 305-d lactation period; the herd size being 75 cows or less; the event occurring during the spring, summer, or winter rather than autumn; and the diagnostic code for the disease event being preprinted on the CHC, requiring a simple check mark as opposed to writing a 3-digit code. The model found a high degree of clustering within herd. In conclusion, disease data in the Norwegian national database for dairy cows are valid to use for epidemiologic research, having in particular an excellent correctness, but it is of concern that at least 10% of data are missing. The proportion of unreported data should be taken into consideration whenever data from this database are used. Anyone working to improve data transfer from farm to central databases should be aware of the reasons for the discrepancies found.
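
The completeness and correctness definitions above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the `Event` record, the toy events, and the function names are hypothetical, an event is assumed to match only when cow, date, and diagnostic code are all identical in both sources, and the agreement measure (matched events divided by all distinct events in either source) is an assumed formulation, since the abstract does not spell it out.

```python
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    """One diagnostic event (hypothetical record layout)."""
    cow_id: str
    date: str   # ISO date string, e.g. "2008-03-14"
    code: str   # diagnostic code


def _matched(chc: Iterable[Event], ccd: Iterable[Event]) -> int:
    # Assumption: a match requires identical cow, date, and code.
    return len(set(chc) & set(ccd))


def completeness(chc: Iterable[Event], ccd: Iterable[Event]) -> float:
    """Proportion of CHC events that are also recorded in the CCD."""
    chc = set(chc)
    if not chc:
        raise ValueError("no CHC events to compare against")
    return _matched(chc, ccd) / len(chc)


def correctness(chc: Iterable[Event], ccd: Iterable[Event]) -> float:
    """Proportion of CCD events also on the CHC with same date and code."""
    ccd = set(ccd)
    if not ccd:
        raise ValueError("no CCD events to evaluate")
    return _matched(chc, ccd) / len(ccd)


def agreement(chc: Iterable[Event], ccd: Iterable[Event]) -> float:
    """Assumed measure: matched events over all distinct events."""
    union = set(chc) | set(ccd)
    if not union:
        raise ValueError("no events in either source")
    return _matched(chc, ccd) / len(union)


# Toy data, invented for illustration only (not from the study).
chc = [
    Event("cow1", "2008-01-10", "304"),
    Event("cow1", "2008-06-02", "322"),
    Event("cow2", "2008-03-14", "304"),
    Event("cow3", "2008-09-30", "329"),
    Event("cow4", "2008-11-05", "304"),
]
ccd = [
    Event("cow1", "2008-01-10", "304"),
    Event("cow1", "2008-06-02", "322"),
    Event("cow2", "2008-03-14", "304"),
    Event("cow3", "2008-10-01", "329"),  # date differs from the CHC entry
]

print(f"completeness = {completeness(chc, ccd):.2f}")  # 3 of 5 CHC events
print(f"correctness  = {correctness(chc, ccd):.2f}")   # 3 of 4 CCD events
print(f"agreement    = {agreement(chc, ccd):.2f}")     # 3 of 6 distinct events
```

In this toy case one CHC event never reached the CCD and one was transferred with a different date, so the date mismatch lowers both completeness and correctness, whereas the missing event lowers completeness only.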
