Efficient Estimation of Mutual Information for Strongly Dependent Variables

We demonstrate that a popular class of nonparametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires a number of samples that scales exponentially with the true MI. Consequently, accurately estimating the MI between two strongly dependent variables is possible only with a prohibitively large sample size. This important yet overlooked shortcoming of existing estimators stems from their implicit reliance on the local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and captures relationship strengths over many orders of magnitude. Experiments on both synthetic and real-world data demonstrate the superior performance of the proposed estimator.
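The failure mode claimed above can be reproduced with the standard Kraskov-Stögbauer-Grassberger (KSG) kNN estimator, I(X;Y) ≈ ψ(k) + ψ(N) − ⟨ψ(n_x+1) + ψ(n_y+1)⟩. The sketch below is illustrative rather than a reference implementation: the function name ksg_mi, the default k=3, and the brute-force O(N²) neighbor search are our choices, not the paper's proposed method.

```python
import numpy as np
from scipy.special import digamma

def ksg_mi(x, y, k=3):
    """KSG mutual-information estimator (algorithm 1), in nats.

    x, y: 1-D arrays of N paired samples. Brute-force O(N^2)
    neighbor search, for illustration only.
    """
    n = len(x)
    dx = np.abs(x[:, None] - x[None, :])   # pairwise |x_i - x_j|
    dy = np.abs(y[:, None] - y[None, :])   # pairwise |y_i - y_j|
    dz = np.maximum(dx, dy)                # max-norm distance in the joint space
    np.fill_diagonal(dz, np.inf)           # a point is not its own neighbor
    eps = np.sort(dz, axis=1)[:, k - 1]    # distance to the k-th nearest neighbor
    # Marginal neighbor counts strictly inside eps_i, excluding point i itself
    # (the diagonal of dx/dy is 0 < eps, so subtract 1 to remove self-counts).
    nx = np.sum(dx < eps[:, None], axis=1) - 1
    ny = np.sum(dy < eps[:, None], axis=1) - 1
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))
```

Running it on a near-deterministic bivariate Gaussian, for which the true MI has the closed form I = -0.5 * log(1 - rho^2), exposes the saturation:

```python
rng = np.random.default_rng(0)
n, rho = 1000, 0.99999999                          # nearly deterministic dependence
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
print(-0.5 * np.log(1 - rho**2))                   # true MI: about 8.86 nats
print(ksg_mi(x, y))                                # estimate caps near psi(n) - psi(k)
```

Since the k−1 nearest joint neighbors of each point lie within eps in both marginals, n_x, n_y ≥ k−1, so the KSG estimate can never exceed ψ(N) − ψ(k) ≈ log N. Matching the true value therefore requires N to grow exponentially with the MI, which is exactly the sample-complexity barrier the abstract describes.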
