NLP Automation to Read Radiological Reports to Detect the Stage of Cancer Among Lung Cancer Patients

A common challenge in the healthcare industry today is that physicians have access to massive amounts of healthcare data but have little time and lack appropriate tools to analyze it. For instance, a risk prediction model built with logistic regression could predict the probability of disease occurrence and thus help prioritize patients on the waiting list for further investigation. However, many medical reports available in current clinical practice systems are not yet ready for analysis with either statistics or machine learning because they are in unstructured text format. The complexity of medical information makes the annotation and validation of data very challenging, which creates a bottleneck for applying machine learning techniques to medical data. This study is therefore conducted to create such annotations automatically, so that the computer can read radiological reports for oncologists and mark the stage of lung cancer. The staging information is obtained with a rule-based method that follows the Tumor Node Metastasis (TNM) staging standard, combined with a deep learning technique called Long Short-Term Memory (LSTM) that extracts clinical information from the Computed Tomography (CT) text report. The empirical experiment shows promising results, with accuracy of up to 85%.
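
To make the rule-based step concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the tumor size has to be pulled out of free text with a simple regular expression (the study uses an LSTM for extraction), and it applies only the size criteria of the eighth edition TNM T descriptor, ignoring invasion, nodal involvement, and metastasis. The function names `largest_size_cm` and `t_category_by_size` are hypothetical.

```python
import re
from typing import Optional

# Matches a number followed by a unit, e.g. "2.5 cm" or "31 mm".
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)


def largest_size_cm(report: str) -> Optional[float]:
    """Return the largest measurement in the report, in cm, or None if absent.

    Assumption: the largest stated measurement is the tumor's greatest
    dimension. A real system would need to link measurements to findings.
    """
    sizes = []
    for value, unit in SIZE_PATTERN.findall(report):
        size = float(value)
        if unit.lower() == "mm":
            size /= 10.0  # convert millimetres to centimetres
        sizes.append(size)
    return max(sizes) if sizes else None


def t_category_by_size(size_cm: float) -> str:
    """Map greatest tumor dimension to a T category (size criteria only)."""
    if size_cm <= 1:
        return "T1a"
    if size_cm <= 2:
        return "T1b"
    if size_cm <= 3:
        return "T1c"
    if size_cm <= 4:
        return "T2a"
    if size_cm <= 5:
        return "T2b"
    if size_cm <= 7:
        return "T3"
    return "T4"


if __name__ == "__main__":
    text = "A 31 mm spiculated mass is seen in the right upper lobe."
    size = largest_size_cm(text)
    if size is not None:
        print(size, t_category_by_size(size))
```

A full stage would additionally require the N and M descriptors, which is where the combination of extracted clinical entities and staging rules described in the abstract comes in.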
