Mining electronic health records: an additional perspective

We read with great interest the article by Jensen et al. (Mining electronic health records: towards better research applications and clinical care. Nature Reviews Genetics 13, 395–405) (Ref. 1). This well-written Review summarizes a large, complex and topical subject. To augment the article, and Table 1 in particular, we would like to point out that one of the earliest and most successful research databases to integrate diverse data sources with electronic health records (EHRs) is the Utah Population Database (UPDB) at the University of Utah, USA. Its earliest success was the identification of families with a high incidence of breast cancer; this research led to the discovery of the breast cancer genes BRCA1 and BRCA2 (Refs 2–5). A crucial component of the UPDB, and one that allowed it to probe genetic inheritance long before gene sequencing was widely available, was the linking of diverse data sources with family pedigrees. These pedigrees were originally supplied by the Utah Genealogical Society and were later updated by probabilistic matching with vital records (such as birth, death and marriage certificates) from the Utah Department of Health.

[1] M. H. Skolnick, et al. Chromosome 17q linkage studies of 18 Utah breast cancer kindreds, 1993, American Journal of Human Genetics.

[2] D. Khanna, et al. Heritability of vasculopathy, autoimmune disease, and fibrosis in systemic sclerosis: a population-based study, 2010, Arthritis and Rheumatism.

[3] Ningli Wang, et al. Using the Utah Population Database to assess familial risk of primary open angle glaucoma, 2010, Vision Research.

[4] R. Cawthon, et al. A Genome-Wide Study Replicates Linkage of 3p22-24 to Extreme Longevity in Humans and Identifies Possible Additional Loci, 2012, PLoS ONE.

[5] J. Rommens, et al. The complete BRCA2 gene and mutations in chromosome 13q-linked kindreds, 1996, Nature Genetics.

[6] L. Cannon-Albright, et al. Evidence of an Inherited Predisposition for Cervical Spondylotic Myelopathy, 2012, Spine.

[7] L. Cannon-Albright, et al. Identification of Six Loci Associated With Pelvic Organ Prolapse Using Genome-Wide Association Analysis, 2011, Obstetrics and Gynecology.

[8] S. Brunak, et al. Mining electronic health records: towards better research applications and clinical care, 2012, Nature Reviews Genetics.

[9] Scott L. DuVall, et al. Evaluation of record linkage between a large healthcare provider and the Utah Population Database, 2012, J. Am. Medical Informatics Assoc.

[10] Scott L. DuVall, et al. The Impact of a Growing Minority Population on Identification of Duplicate Records in an Enterprise Data Warehouse, 2010, MedInfo.

[11] Marc S. Williams, et al. Inflammatory bowel disease aggregation in Utah kindreds, 2011, Inflammatory Bowel Diseases.

[12] M. King, et al. Haplotype and phenotype analysis of nine recurrent BRCA2 mutations in 111 families: results of an international study, 1998, American Journal of Human Genetics.

[13] M. Skolnick, et al. A large kindred with 17q-linked breast and ovarian cancer: genetic, phenotypic, and genealogical analysis, 1994, Journal of the National Cancer Institute.