A Metadata Extraction Approach for Clinical Case Reports to Enable Advanced Understanding of Biomedical Concepts

Clinical case reports (CCRs) are a valuable means of sharing observations and insights in medicine. These documents vary in form, and their content includes descriptions of numerous novel disease presentations and treatments. To date, the text within CCRs remains largely unstructured, so considerable human and computational effort is required to make it useful for in-depth analysis. In this protocol, we describe methods for identifying metadata corresponding to specific biomedical concepts frequently observed within CCRs. We provide a metadata template as a guide for document annotation, recognizing that structure may be imposed on CCRs through a combination of manual and automated effort. The approach presented here is suited to organizing concept-related text from a large literature corpus (e.g., thousands of CCRs) but may be readily adapted to more focused tasks or small sets of reports. The resulting structured text data include sufficient semantic context to support a variety of subsequent text analysis workflows: meta-analyses to determine how to maximize CCR detail, epidemiological studies of rare diseases, and the development of models of medical language may all become more feasible and manageable with structured text data.
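
As a rough illustration of what a metadata template for annotation could look like in practice, the following minimal Python sketch stores concept-related text spans alongside document-level identifiers. It is not the template provided with the protocol: the class names, field names, and concept labels (`symptom`, `treatment`) are hypothetical and chosen only for illustration, and the example sentence is invented.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ConceptAnnotation:
    """One concept-related text span copied verbatim from a report."""
    concept: str  # hypothetical category label, e.g. "symptom"
    text: str     # the annotated phrase exactly as it appears
    start: int    # character offset where the phrase begins
    end: int      # character offset just past the phrase


@dataclass
class CaseReportRecord:
    """Hypothetical structured record for a single case report."""
    report_id: str
    title: str
    annotations: List[ConceptAnnotation] = field(default_factory=list)

    def annotate(self, full_text: str, concept: str,
                 phrase: str) -> Optional[ConceptAnnotation]:
        """Record the first occurrence of phrase; return None if absent."""
        start = full_text.find(phrase)
        if start == -1:
            return None
        annotation = ConceptAnnotation(concept, phrase, start,
                                       start + len(phrase))
        self.annotations.append(annotation)
        return annotation

    def by_concept(self, concept: str) -> List[str]:
        """Return the annotated phrases filed under one concept label."""
        return [a.text for a in self.annotations if a.concept == concept]


# Invented example sentence, used only to exercise the sketch.
report_text = ("A 54-year-old man presented with fever "
               "and was treated with doxycycline.")
record = CaseReportRecord(report_id="example-001",
                          title="Hypothetical report")
record.annotate(report_text, "symptom", "fever")
record.annotate(report_text, "treatment", "doxycycline")

# Serialize the structured record for downstream text analysis.
print(json.dumps(asdict(record), indent=2))
```

Keeping character offsets next to each phrase is one possible design choice: it lets a later workflow recover the surrounding sentence as semantic context, whether the annotation was made manually or by an automated tool.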
