Distilled Wasserstein Learning for Word Embedding and Topic Modeling
Hongteng Xu | Wenlin Wang | Wei Liu | Lawrence Carin