Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs

We present simple and computationally efficient nonparametric estimators of Rényi entropy and mutual information based on an i.i.d. sample drawn from an unknown, absolutely continuous distribution over ℝ^d. The entropy estimator is the sum of the p-th powers of the Euclidean lengths of the edges of the 'generalized nearest-neighbor' graph of the sample; the mutual information estimator applies the same construction to the empirical copula of the sample. We prove, for the first time, the almost sure consistency of these estimators and derive upper bounds on their rates of convergence, the latter under the assumption that the density underlying the sample is Lipschitz continuous. Experiments demonstrate their usefulness in independent subspace analysis.
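
As an illustration of the construction described above, the following is a minimal sketch, using the k-nearest-neighbor graph as one concrete instance of a generalized nearest-neighbor graph. The function names, the choice of k-NN graph, and the calibration constant `gamma` are illustrative assumptions, not the authors' code: for Rényi order α ∈ (0, 1) one takes p = d(1 − α), sums |e|^p over the edges e of the graph, normalizes by γ·n^(1 − p/d), and takes a logarithm.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.neighbors import NearestNeighbors

def renyi_entropy_knn(X, alpha=0.9, k=1, gamma=1.0):
    """Sketch of a graph-based Renyi entropy estimator.

    Uses the k-NN graph as one instance of a generalized
    nearest-neighbor graph. Requires alpha in (0, 1) so that
    p = d * (1 - alpha) lies in (0, d). `gamma` stands in for the
    (generally unknown) constant depending on p, d, and the graph
    definition; here it is left as a free parameter.
    """
    n, d = X.shape
    p = d * (1.0 - alpha)
    # Query k+1 neighbors: each point's nearest neighbor is itself.
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    L_p = np.sum(dists[:, 1:] ** p)  # sum of |e|^p over the graph edges
    return np.log(L_p / (gamma * n ** (1.0 - p / d))) / (1.0 - alpha)

def empirical_copula(X):
    """Rank-transform each coordinate to (0, 1], giving the copula sample."""
    return np.apply_along_axis(rankdata, 0, X) / X.shape[0]
```

Per the abstract, mutual information is then estimated by running the same entropy estimator on the copula-transformed sample (up to the sign and normalization convention adopted for Rényi mutual information). In practice γ is unknown and must be calibrated, for example by Monte Carlo simulation on uniform samples of the same size and dimension.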
