Prediction-Constrained Topic Models for Antidepressant Recommendation

Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals: faithful generative explanations of high-dimensional data and accurate prediction of associated class labels. Existing approaches fail to balance these goals because they do not properly handle a fundamental asymmetry: the intended task is always predicting labels from data, never data from labels. Our new prediction-constrained objective trains models that predict labels from held-out data well while also producing good generative likelihoods and interpretable topic-word parameters. In a case study on predicting depression medications from electronic health records, we demonstrate improved recommendations compared to previous supervised topic models and to high-dimensional logistic regression on words alone.
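The objective described above balances a generative term (how well topics explain the words) against an up-weighted discriminative term (how well per-document topic proportions predict labels). The sketch below is a minimal, illustrative toy version of such a prediction-constrained loss, not the paper's actual training procedure: the function name, the fixed-point-free use of given topic proportions, and the logistic predictor on those proportions are all simplifying assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pc_loss(doc_word_counts, topics, doc_topic_probs, labels, w, b, lam):
    """Toy prediction-constrained objective for a supervised topic model.

    doc_word_counts : (D, V) bag-of-words counts
    topics          : (K, V) topic-word probabilities (rows sum to 1)
    doc_topic_probs : (D, K) per-document topic proportions (rows sum to 1)
    labels          : (D,)   binary labels (e.g. drug success/failure)
    w, b            : logistic-regression weights on topic proportions
    lam             : multiplier that up-weights the label likelihood,
                      encoding the label-from-data prediction constraint

    Returns the loss -[ log p(x | topics) + lam * log p(y | proportions) ].
    """
    # Generative term: each word is a mixture of the document's topics.
    word_probs = doc_topic_probs @ topics              # (D, V)
    log_px = np.sum(doc_word_counts * np.log(word_probs + 1e-12))

    # Discriminative term: Bernoulli label likelihood from topic proportions.
    p = sigmoid(doc_topic_probs @ w + b)               # (D,)
    log_py = np.sum(labels * np.log(p + 1e-12)
                    + (1.0 - labels) * np.log(1.0 - p + 1e-12))

    return -(log_px + lam * log_py)
```

With `lam = 0` this reduces to an unsupervised likelihood; increasing `lam` trades generative fit for predictive accuracy, which is the asymmetric balance the abstract argues existing supervised topic models miss.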
