A Hybrid Engine for Clinical Information Extraction from Radiology Reports

Clinical researchers and practitioners need data extracted from CT scan reports, but most reports are unstructured text that is not ready for analysis. Furthermore, a lack of annotated data makes it difficult to apply natural language processing techniques that convert unstructured data into structured data. This study therefore applies an automated engine that combines topic modeling with a lexicon and syntactic rule-based approach to extract clinical information from CT scan reports. The prototype shows promising results for constructing clinical datasets for further clinical research.
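To make the lexicon and rule-based part of such an engine concrete, the following is a minimal sketch and not the authors' implementation. The lexicon entries, the size pattern, the negation cues, and the function name `extract_findings` are all hypothetical, chosen only for illustration, and the topic modeling stage is omitted entirely.

```python
import re

# Hypothetical lexicon: surface terms mapped to a normalized finding label.
FINDING_LEXICON = {
    "nodule": "nodule",
    "nodules": "nodule",
    "mass": "mass",
    "opacity": "opacity",
}

# Hypothetical syntactic rule: a number followed by a length unit (mm or cm).
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)

# Hypothetical negation cues; a real system would need a fuller list.
NEGATION_CUES = ("no ", "without ", "negative for ")


def extract_findings(report):
    """Return one structured record per lexicon finding in each sentence."""
    records = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
        lowered = sentence.lower()
        tokens = re.findall(r"[a-z]+", lowered)
        labels = {FINDING_LEXICON[t] for t in tokens if t in FINDING_LEXICON}
        if not labels:
            continue  # sentence mentions nothing from the lexicon

        # Sentence-level negation: a cue at the start or after a space.
        negated = any(
            lowered.startswith(cue) or (" " + cue) in lowered
            for cue in NEGATION_CUES
        )

        # Take the first size mention in the sentence and normalize to mm.
        size_mm = None
        match = SIZE_PATTERN.search(sentence)
        if match:
            value = float(match.group(1))
            size_mm = value * 10 if match.group(2).lower() == "cm" else value

        for label in sorted(labels):
            records.append(
                {"finding": label, "size_mm": size_mm, "negated": negated}
            )
    return records


if __name__ == "__main__":
    text = (
        "There is a 1.2 cm nodule in the right upper lobe. "
        "No pleural effusion. No mass is seen."
    )
    for record in extract_findings(text):
        print(record)
```

Each sentence is handled independently, so a size or negation is only attached to findings in the same sentence. That is a simplification; a hybrid engine of the kind described would presumably use the topic model to decide which sentences are relevant before such rules are applied.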
