Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians

Mixtures of high-dimensional Gaussian distributions have been studied extensively in statistics and learning theory. Although the total variation distance arises naturally in the sample complexity of distribution learning, obtaining tight lower bounds on it for mixtures is analytically difficult. Exploiting a connection between the total variation distance and the characteristic function of the mixture, we provide fairly tight functional approximations. These enable us to derive new lower bounds on the total variation distance between pairs of two-component Gaussian mixtures that have a shared covariance matrix.
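As a concrete illustration of the quantity being bounded, the sketch below numerically approximates the total variation distance TV(p, q) = (1/2) ∫ |p(x) − q(x)| dx between two one-dimensional two-component Gaussian mixtures with a shared variance. This is a generic numerical check, not the paper's method; the function names and the grid resolution are illustrative choices.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_pdf(x, weights, means, sigma):
    """Density of a Gaussian mixture with shared standard deviation sigma."""
    return sum(w * gaussian_pdf(x, m, sigma) for w, m in zip(weights, means))

def tv_distance(weights_p, means_p, weights_q, means_q, sigma, n_points=200_001):
    """Riemann-sum approximation of TV(p, q) = (1/2) * integral of |p - q|."""
    all_means = list(means_p) + list(means_q)
    lo = min(all_means) - 8 * sigma   # 8 sigma beyond the extreme means captures
    hi = max(all_means) + 8 * sigma   # essentially all of the mass of both mixtures
    grid = np.linspace(lo, hi, n_points)
    p = mixture_pdf(grid, weights_p, means_p, sigma)
    q = mixture_pdf(grid, weights_q, means_q, sigma)
    return 0.5 * np.sum(np.abs(p - q)) * (grid[1] - grid[0])

# Identical mixtures have TV distance 0; mixtures separated by many standard
# deviations have TV distance approaching 1.
same = tv_distance([0.5, 0.5], [0.0, 1.0], [0.5, 0.5], [0.0, 1.0], sigma=1.0)
far = tv_distance([0.5, 0.5], [0.0, 1.0], [0.5, 0.5], [100.0, 101.0], sigma=1.0)
print(same, far)
```

Such a numerical estimate is useful for sanity-checking analytic lower bounds in low dimensions, though it does not scale to the high-dimensional setting the paper addresses.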
