Instance Based Approximations to Profile Maximum Likelihood

We provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation. The algorithm matches the best previously known efficient algorithms for computing approximate PML distributions and improves on them when the number of distinct observed frequencies in the given instance is small. We achieve this result by exploiting new sparsity structure in approximate PML distributions and by providing a new matrix rounding algorithm of independent interest. Leveraging this result, we obtain the first provable, computationally efficient implementation of PseudoPML, a general framework for estimating a broad class of symmetric properties. We also obtain efficient PML-based estimators for distributions with small profile entropy, a natural instance-based complexity measure. Finally, we provide a simpler and more practical PseudoPML implementation that matches the best-known theoretical guarantees of such an estimator, and we evaluate this method empirically.
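To make the instance-dependent quantity concrete, the following minimal sketch computes the profile of a sample and its number of distinct observed frequencies. It assumes the standard definition from the PML literature (the profile records, for each frequency j, how many distinct symbols occur exactly j times), which the abstract itself does not spell out. The function names `profile` and `num_distinct_frequencies` are hypothetical and are not taken from the paper; the sketch illustrates the input statistic only, not the paper's algorithm.

```python
from collections import Counter


def profile(sample):
    """Map each observed frequency j to the number of distinct symbols seen exactly j times.

    Assumed (standard) definition of a profile; labels of symbols are discarded.
    """
    symbol_counts = Counter(sample)                # symbol -> number of occurrences
    return dict(Counter(symbol_counts.values()))   # frequency -> number of symbols


def num_distinct_frequencies(sample):
    """Number of distinct observed frequencies in the sample."""
    return len(profile(sample))


if __name__ == "__main__":
    sample = list("abracadabra")  # a:5, b:2, r:2, c:1, d:1
    print(profile(sample))
    print(num_distinct_frequencies(sample))
```

Because the profile ignores symbol labels, relabeling the symbols of a sample leaves it unchanged, which is why it is the natural statistic for symmetric properties. A sample of length n can have at most on the order of sqrt(n) distinct frequencies, and many instances have far fewer.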
