Information measures and geometry of the hyperbolic exponential families of Poincar\'e and hyperboloid distributions

We study various information-theoretic measures and the information geometry of the Poincar\'e distributions and the related hyperboloid distributions, and prove that their statistical mixture models are universal density estimators of smooth densities in hyperbolic spaces. The Poincar\'e and the hyperboloid distributions are two types of hyperbolic probability distributions defined using different models of hyperbolic geometry. Namely, the Poincar\'e distributions form a triparametric bivariate exponential family whose sample space is the hyperbolic Poincar\'e upper-half plane and whose natural parameter space is the open 3D convex cone of two-by-two positive-definite matrices. The family of hyperboloid distributions forms another exponential family whose sample space is the forward sheet of the two-sheeted unit hyperboloid modeling hyperbolic geometry. In the first part, we prove that all $f$-divergences between Poincar\'e distributions can be expressed using three canonical terms, using Eaton's framework of maximal group invariance. We also show that the $f$-divergences between any two Poincar\'e distributions are asymmetric except when those distributions belong to the same leaf of a particular foliation of the parameter space. We report closed-form formulas for the Fisher information matrix, Shannon's differential entropy, the Kullback-Leibler divergence, and the Bhattacharyya distances between such distributions using the framework of exponential families. In the second part, we state the corresponding results for the exponential family of hyperboloid distributions by highlighting a parameter correspondence between the Poincar\'e and the hyperboloid distributions. Finally, we describe a random generator to draw variates and present two Monte Carlo methods to numerically estimate $f$-divergences between hyperbolic distributions.
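To illustrate the idea of stochastically estimating an $f$-divergence, here is a minimal Python sketch of a plain Monte Carlo estimator. It is not one of the two methods mentioned in the abstract, whose details are not given here. It assumes the convention $I_f(p:q) = \int p(x)\, f(q(x)/p(x))\, \mathrm{d}x$, and it uses 1D Gaussians as stand-in densities (an assumption made only because their Kullback-Leibler divergence has a well-known closed form to check against). The names `mc_f_divergence`, `gaussian_log_pdf`, and `gaussian_sampler` are hypothetical helpers.

```python
import numpy as np

def mc_f_divergence(f, sample_p, log_p, log_q, n=100_000, seed=0):
    """Estimate I_f(p:q) = E_p[f(q(X)/p(X))] by averaging over n draws from p."""
    rng = np.random.default_rng(seed)
    x = sample_p(rng, n)                      # draws X_1..X_n from p
    ratio = np.exp(log_q(x) - log_p(x))       # density ratio q/p, computed via logs
    return float(np.mean(f(ratio)))

# Stand-in densities (assumption): 1D Gaussians, NOT hyperbolic distributions.
def gaussian_log_pdf(mu, sigma):
    return lambda x: -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def gaussian_sampler(mu, sigma):
    return lambda rng, n: rng.normal(mu, sigma, size=n)

# Generator f(u) = -log(u) yields the Kullback-Leibler divergence KL(p:q).
def kl_generator(u):
    return -np.log(u)

mu_p, s_p, mu_q, s_q = 0.0, 1.0, 1.0, 2.0
kl_estimate = mc_f_divergence(
    kl_generator,
    gaussian_sampler(mu_p, s_p),
    gaussian_log_pdf(mu_p, s_p),
    gaussian_log_pdf(mu_q, s_q),
)
# Closed-form KL between two univariate Gaussians, used as a sanity check.
kl_exact = np.log(s_q / s_p) + (s_p**2 + (mu_p - mu_q) ** 2) / (2 * s_q**2) - 0.5
print(f"Monte Carlo: {kl_estimate:.4f}, closed form: {kl_exact:.4f}")
```

The same estimator would apply to hyperbolic distributions once a sampler for $p$ and the two log-densities are supplied. Swapping the arguments estimates $I_f(q:p)$, which in general differs from $I_f(p:q)$.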
