Defining datasets and creating data dictionaries for quality improvement and research in chronic disease using routinely collected data: an ontology-driven approach.

BACKGROUND The burden of chronic disease is increasing, and research and quality improvement will be less effective if case finding strategies are suboptimal.

OBJECTIVE To describe an ontology-driven approach to case finding in chronic disease, and to show how this approach can be used to create a data dictionary and make the codes used in case finding transparent.

METHOD A five-step process: (1) identifying a reference coding system or terminology; (2) using an ontology-driven approach to identify cases; (3) developing metadata that can be used to identify the extracted data; (4) mapping the extracted data to the reference terminology; and (5) creating the data dictionary.

RESULTS Hypertension is presented as an exemplar. A patient with hypertension can be represented by a range of codes, including diagnostic, history and administrative codes. Metadata can link the coding system and data extraction queries to the correct data mapping and translation tool, which then maps each extracted code to the equivalent code in the reference terminology. The code extracted, the term, its domain and subdomain, and the name of the data extraction query can then be automatically grouped and published online as a readily searchable data dictionary. An online exemplar is available at www.clininf.eu/qickd-data-dictionary.html

CONCLUSION Adopting an ontology-driven approach to case finding could improve the quality of disease registers and of research based on routine data. It would offer considerable advantages over using limited datasets to define cases. This approach should be considered by those involved in research and quality improvement projects which utilise routine data.
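The mapping and grouping described in the method (steps 3 to 5) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the codes, terms, coding-system names, query name and the `REFERENCE_MAP` table are all hypothetical placeholders, and the real translation tool and reference terminology are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedCode:
    """One extracted code plus the metadata that identifies it (step 3)."""
    code: str           # code as recorded in the source system (placeholder)
    term: str           # human-readable term for the code
    domain: str         # e.g. the disease area
    subdomain: str      # e.g. diagnostic, history or administrative
    query_name: str     # name of the data extraction query (hypothetical)
    coding_system: str  # source coding system (hypothetical label)


# Hypothetical translation table standing in for the mapping tool (step 4):
# (coding system, local code) -> code in the reference terminology.
REFERENCE_MAP = {
    ("LOCAL_A", "DX001"): "REF-HT-01",
    ("LOCAL_A", "HX014"): "REF-HT-02",
    ("LOCAL_B", "AD230"): "REF-HT-03",
}


def build_data_dictionary(rows):
    """Group extracted codes by (domain, subdomain) into a dictionary (step 5)."""
    dictionary = {}
    for row in rows:
        entry = {
            "code": row.code,
            "term": row.term,
            "query_name": row.query_name,
            # None marks a code with no known mapping, so gaps stay visible.
            "reference_code": REFERENCE_MAP.get((row.coding_system, row.code)),
        }
        dictionary.setdefault((row.domain, row.subdomain), []).append(entry)
    for entries in dictionary.values():
        entries.sort(key=lambda e: e["code"])
    return dictionary


# Placeholder rows for the hypertension exemplar.
ROWS = [
    ExtractedCode("HX014", "History of hypertension", "Hypertension",
                  "History", "q_hypertension", "LOCAL_A"),
    ExtractedCode("DX001", "Hypertension diagnosis", "Hypertension",
                  "Diagnostic", "q_hypertension", "LOCAL_A"),
    ExtractedCode("AD230", "Hypertension monitoring administration",
                  "Hypertension", "Administrative", "q_hypertension", "LOCAL_B"),
    ExtractedCode("ZZ999", "Unmapped example code", "Hypertension",
                  "Diagnostic", "q_hypertension", "LOCAL_B"),
]

if __name__ == "__main__":
    for (domain, subdomain), entries in sorted(build_data_dictionary(ROWS).items()):
        print(domain, "/", subdomain)
        for e in entries:
            print("   ", e["code"], "|", e["term"], "|", e["reference_code"])
```

Keying the translation table on both the coding system and the code reflects the point that metadata selects the correct mapping for each source system; leaving unmapped codes in the output with an empty reference code is a design choice made here for transparency, not something the abstract prescribes.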
