Finding Cervical Cancer Symptoms in Swedish Clinical Text using a Machine Learning Approach and NegEx

Detecting the early symptoms of cervical cancer is crucial for early treatment and survival. Finding these symptoms in clinical text requires Named Entity Recognition. This paper evaluates the Clinical Entity Finder, a machine-learning tool trained on annotated clinical text from a Swedish internal medicine emergency unit, on cervical cancer records. The Clinical Entity Finder identifies entities of the types body part, finding, and disorder; it is extended with negation detection using the rule-based tool NegEx to distinguish negated from non-negated entities. To measure the performance of the tools in this new domain, two physicians annotated a set of clinical notes from the health records of cervical cancer patients. The inter-annotator agreement for finding, disorder, and body part reached an average F-score of 0.677, and the Clinical Entity Finder extended with NegEx achieved an average F-score of 0.667.
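The idea behind rule-based negation detection can be illustrated with a minimal sketch. This is not the NegEx implementation or the Swedish adaptation evaluated here: the trigger words, the scope terminator, the window size, and the function names (`tokenize`, `is_negated`) are all assumptions chosen for illustration. The sketch marks an entity as negated when a negation trigger occurs within a few tokens before it and no scope-terminating word lies in between.

```python
import re

# Illustrative Swedish negation cues; an assumption for this sketch,
# NOT the trigger list shipped with NegEx or its Swedish adaptation.
PRE_NEGATION_TRIGGERS = {"ingen", "inga", "inget", "inte", "ej", "utan"}
# Hypothetical scope terminator ("men" = "but").
TERMINATORS = {"men"}
# Assumed maximum number of tokens scanned before the entity.
WINDOW = 5


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_negated(sentence, entity):
    """Return True if any occurrence of `entity` is preceded, within
    WINDOW tokens, by a negation trigger with no terminator in between."""
    tokens = tokenize(sentence)
    entity_tokens = tokenize(entity)
    if not entity_tokens:
        return False
    n = len(entity_tokens)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != entity_tokens:
            continue
        # Scan backwards from the token just before the entity.
        for i in range(start - 1, max(start - 1 - WINDOW, -1), -1):
            if tokens[i] in TERMINATORS:
                break  # a terminator closes the negation scope
            if tokens[i] in PRE_NEGATION_TRIGGERS:
                return True
    return False


if __name__ == "__main__":
    # "No fever but bleeding": only "feber" falls inside the negation scope.
    sentence = "Ingen feber men blödning."
    for entity in ("feber", "blödning"):
        print(entity, is_negated(sentence, entity))
```

In a pipeline like the one described, such a check would run on each entity returned by the entity recognizer, so that a negated finding is not counted as a symptom.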
