Stabilized sparse ordinal regression for medical risk stratification

The recent wide adoption of electronic medical records (EMRs) presents great opportunities and challenges for data mining. EMR data are largely temporal, often noisy, irregular and high dimensional. This paper constructs a novel ordinal regression framework for medical risk stratification from EMR data. First, a conceptual view of the EMR as a temporal image is constructed to extract a diverse set of features. Second, ordinal modeling is applied to predict cumulative or progressive risk. The challenge is to build a transparent predictive model that works with a large number of weakly predictive features and, at the same time, is stable against resampling variations. Our solution employs sparsity methods that are stabilized through domain-specific feature interaction networks. We introduce two indices that measure model stability against data resampling. Feature networks are used to generate two multivariate Gaussian priors with sparse precision matrices (the Laplacian and the Random Walk). We apply the framework to a large short-term suicide risk prediction problem and demonstrate that our methods outperform clinicians by a large margin, discover suicide risk factors that conform to mental health knowledge, and produce models with enhanced stability.
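The idea of a network-derived Gaussian prior can be illustrated with a minimal sketch. It assumes a symmetric feature adjacency matrix and the standard unnormalized graph Laplacian L = D - A, for which a zero-mean Gaussian prior with precision proportional to L adds a penalty proportional to w^T L w to the loss. The function names, the toy graph, and the Jaccard-based stability score are illustrative assumptions; they are not the paper's exact priors or its two stability indices.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of a symmetric feature graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_penalty(weights, adjacency):
    """Quadratic penalty w^T L w, equal to the sum over edges of A_ij * (w_i - w_j)^2.

    Linked features are pushed toward similar weights, which is the
    stabilizing effect a network prior is meant to provide.
    """
    weights = np.asarray(weights, dtype=float)
    return float(weights @ graph_laplacian(adjacency) @ weights)


def mean_jaccard_stability(selected_sets):
    """Average pairwise Jaccard similarity between feature sets selected
    on different resamples (a generic stability score, used here only
    as a stand-in for the paper's indices)."""
    sets = [set(s) for s in selected_sets]
    if len(sets) < 2:
        raise ValueError("need at least two feature sets to compare")
    scores = []
    for i, a in enumerate(sets):
        for b in sets[i + 1:]:
            union = a | b
            scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Toy chain graph over three hypothetical features: 0 - 1 - 2.
chain = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
```

In this sketch, equal weights on connected features incur no penalty, while a weight vector that differs across an edge is penalized by the squared difference. Because each row of L has nonzero entries only for a feature and its neighbours, the precision matrix is sparse whenever the feature network is sparse.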
