Symptom-based patient stratification in mental illness using clinical notes

Mental illnesses are highly heterogeneous, and their diagnoses rest on symptoms that are generally qualitative, subjective, and documented in free-text clinical notes rather than as structured data. Moreover, symptoms vary significantly within diagnostic categories and overlap substantially between them. These factors pose additional challenges for phenotyping patients with mental illness, a task that has proven difficult even for seemingly well-characterized diseases. The ability to identify more homogeneous patient groups could both strengthen precision medicine approaches to psychiatric disorders and help elucidate the underlying biological mechanisms of pathology. We describe a novel approach to deep phenotyping in mental illness in which contextual term extraction is used to identify constellations of symptoms in a cohort of patients diagnosed with schizophrenia and related disorders. We applied topic modeling and dimensionality reduction to identify groups of similar patients and evaluated the resulting clusters through visualization and interrogation of clinically interpretable weighted features. Our findings show that patients diagnosed with schizophrenia may be meaningfully stratified using symptom-based clustering.
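
As a rough illustration of what contextual term extraction can look like, the sketch below counts symptom mentions in free-text notes and skips mentions preceded by a negation cue, producing one symptom-count vector per patient. It is a minimal sketch under stated assumptions, not the method used in this work: the symptom lexicon, the negation cues, the three-token window, and the function names are all hypothetical and chosen only for the example. Vectors of this kind are the sort of input that topic modeling and dimensionality reduction could then operate on.

```python
import re
from collections import Counter

# Hypothetical symptom lexicon and negation cues; illustrative only.
SYMPTOMS = ["auditory hallucinations", "paranoia", "flat affect", "insomnia"]
NEGATION_CUES = ["no", "denies", "denied", "without", "not"]
WINDOW = 3  # assumed number of tokens before a mention to scan for a cue


def extract_symptoms(note):
    """Count affirmed (non-negated) symptom mentions in one note."""
    counts = Counter()
    # Split into sentences so a negation cue cannot reach across sentences.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for symptom in SYMPTOMS:
            words = symptom.split()
            n = len(words)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != words:
                    continue
                # Context check: look for a negation cue just before the mention.
                preceding = tokens[max(0, i - WINDOW):i]
                if not any(cue in preceding for cue in NEGATION_CUES):
                    counts[symptom] += 1
    return counts


def symptom_matrix(notes_by_patient):
    """Map each patient ID to a count vector ordered like SYMPTOMS."""
    rows = {}
    for patient_id, notes in notes_by_patient.items():
        total = Counter()
        for note in notes:
            total.update(extract_symptoms(note))
        rows[patient_id] = [total[s] for s in SYMPTOMS]
    return rows


if __name__ == "__main__":
    # Invented example notes, not real patient data.
    notes = {
        "p1": ["Patient reports auditory hallucinations and paranoia. Denies insomnia."],
        "p2": ["No auditory hallucinations. Flat affect noted; insomnia persists."],
    }
    for patient_id, row in symptom_matrix(notes).items():
        print(patient_id, dict(zip(SYMPTOMS, row)))
```

A fixed token window is a deliberately simple choice; clinical text usually needs richer context handling (for example, cues that follow the mention, or statements about family history), which this sketch does not attempt.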
