Warped Riemannian Metrics for Location-Scale Models

This contribution shows that warped Riemannian metrics, a class of metrics that plays a prominent role in Riemannian geometry, are also of fundamental importance in information geometry. The starting point is a new theorem: the Rao–Fisher information metric of any location-scale model defined on a Riemannian manifold is a warped Riemannian metric whenever the model is invariant under the action of some Lie group. This theorem is a valuable tool for finding the expression of the Rao–Fisher information metric of location-scale models defined on high-dimensional Riemannian manifolds, because a warped Riemannian metric is fully determined by only two functions of a single variable, irrespective of the dimension of the underlying manifold. Several original results follow from the theorem. The expression of the Rao–Fisher information metric of the Riemannian Gaussian model is provided for the first time in the literature. A generalised definition of the Mahalanobis distance is introduced, applicable to any location-scale model defined on a Riemannian manifold. The solution of the geodesic equation and an explicit construction of Riemannian Brownian motion are obtained for any Rao–Fisher information metric defined in terms of warped Riemannian metrics. Finally, a mixture of analytical and numerical computations shows that the parameter space of the von Mises–Fisher model of n-dimensional directional data, equipped with its Rao–Fisher information metric, is a Hadamard manifold, that is, a simply connected complete Riemannian manifold of negative sectional curvature, for \(n = 2,\ldots ,8\). It is hoped that upcoming work will prove this for any value of n.
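
As an illustration of the claim that two functions of a single variable determine a warped metric, the following sketch writes one common form of a warped Riemannian metric on a product \(M \times (0,\infty)\) and checks it against the familiar univariate normal model. The symbols \(\bar{x}\), \(\sigma\), \(\alpha\), \(\beta\) and \(Q\) are notation assumed here for illustration and are not taken from the abstract; the paper's own conventions may differ.

```latex
% Assumed notation: \bar{x} \in M is the location parameter, \sigma > 0 the
% scale parameter, and Q the Riemannian metric of the manifold M.
% A warped metric on M \times (0,\infty) is fixed by two functions of \sigma:
\[
  ds^2(\bar{x},\sigma) \;=\; \alpha^2(\sigma)\, d\sigma^2
  \;+\; \beta^2(\sigma)\, Q_{\bar{x}}(d\bar{x}, d\bar{x}).
\]
% Worked instance: the univariate normal model N(\mu,\sigma^2), with M = \mathbb{R}
% and Q = d\mu^2. Its Fisher information in the coordinates (\mu,\sigma) is
% diag(1/\sigma^2, 2/\sigma^2), so that
\[
  ds^2(\mu,\sigma) \;=\; \frac{2}{\sigma^2}\, d\sigma^2
  \;+\; \frac{1}{\sigma^2}\, d\mu^2,
  \qquad
  \alpha^2(\sigma) = \frac{2}{\sigma^2},
  \quad
  \beta^2(\sigma) = \frac{1}{\sigma^2}.
\]
```

In this one-dimensional example the model is invariant under translations of \(\mu\), and neither \(\alpha\) nor \(\beta\) depends on the location parameter, which is the feature that keeps the description dimension-independent.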
