Estimating Mutual Information Between Dense Word Embeddings
Vitalii Zhelezniak | Nils Y. Hammerla | Aleksandar Savkov