Paper information - Nonparametric Finite Mixture Models with Possible Shape Constraints: A Cubic Newton Approach


We explore computational aspects of maximum likelihood estimation of the mixture proportions of a nonparametric finite mixture model, a convex optimization problem with old roots in statistics and a key member of the modern data analysis toolkit. Motivated by problems in shape-constrained inference, we consider structured variants of this problem with additional convex polyhedral constraints. We propose a new cubic regularized Newton method for this problem and present novel worst-case and local computational guarantees for our algorithm. We extend earlier work by Nesterov and Polyak to the case of a self-concordant objective with polyhedral constraints, such as the ones considered herein. We propose a Frank-Wolfe method to solve the cubic regularized Newton subproblem and derive efficient solutions for the linear optimization oracles that may be of independent interest. In the particular case of Gaussian mixtures without shape constraints, we derive bounds on how well the finite mixture problem approximates the infinite-dimensional Kiefer-Wolfowitz maximum likelihood estimator. Experiments on synthetic and real datasets suggest that our proposed algorithms exhibit improved runtimes and scalability over existing benchmarks.
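To make the underlying optimization problem concrete, the following is a minimal sketch of the unconstrained case: estimating mixture proportions over the probability simplex by minimizing the negative average log-likelihood. It uses a plain Frank-Wolfe iteration applied directly to the likelihood, not the cubic regularized Newton method the abstract proposes. The grid of component means (`centers`), the unit variance, the step-size rule, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def mixture_nll(w, B):
    """Negative average log-likelihood f(w) = -mean_i log((B w)_i)."""
    return -np.mean(np.log(B @ w))

def frank_wolfe_simplex(B, n_iter=500):
    """Minimize mixture_nll over the probability simplex with plain Frank-Wolfe."""
    n, m = B.shape
    w = np.full(m, 1.0 / m)               # start at the uniform weights
    for k in range(n_iter):
        # Gradient of f at w: grad_j = -mean_i B_ij / (B w)_i
        grad = -(B / (B @ w)[:, None]).mean(axis=0)
        j = np.argmin(grad)               # linear oracle on the simplex: best vertex e_j
        step = 2.0 / (k + 3.0)            # open-loop step size (an assumed, common choice)
        w = (1.0 - step) * w              # move toward the vertex e_j
        w[j] += step
    return w

# Synthetic data: a two-component Gaussian mixture with unit variance.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(2, 1, 700)])

# Hypothetical fixed grid of component means; B[i, j] is the density of x[i] under component j.
centers = np.linspace(-4, 4, 9)
B = np.exp(-0.5 * (x[:, None] - centers[None, :]) ** 2) / np.sqrt(2 * np.pi)

w_hat = frank_wolfe_simplex(B)
print(np.round(w_hat, 3))
```

On the simplex the linear optimization oracle reduces to picking the coordinate with the smallest gradient entry, which is why each iteration is cheap; polyhedral shape constraints would replace this oracle with a different linear program.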

[1] Bodhisattva Sen, et al. Editorial: Special Issue on "Nonparametric Inference Under Shape Constraints", 2018, Statistical Science.

[2] Mihai Anitescu, et al. A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions Using Sequential Quadratic Programming, 2018, Journal of Computational and Graphical Statistics.

[3] Roger Koenker, et al. REBayes: An R Package for Empirical Bayes Mixture Methods, 2017.

[4] Martin Jaggi, et al. On the Global Linear Convergence of Frank-Wolfe Optimization Variants, 2015, NIPS.

[5] I. Johnstone, et al. Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences, 2004, math/0410088.

[6] R. Olshen, et al. Proceedings of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, 1985.

[7] P. Dvurechensky, et al. Self-concordant analysis of Frank-Wolfe algorithms, 2020, ICML.

[8] Lawrence D. Brown, et al. Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means, 2009, 0908.1712.

[9] Volkan Cevher, et al. Composite self-concordant minimization, 2013, J. Mach. Learn. Res.

[10] Xiaosheng Mu, et al. Log-concavity of a mixture of beta distributions, 2013, 1312.2166.

[11] J. Kiefer, et al. Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters, 1956.

[12] R. Koenker, et al. Convex optimization, shape constraints, compound decisions, and empirical Bayes rules, 2013.

[13] Quoc Tran-Dinh, et al. Generalized self-concordant functions: a recipe for Newton-type methods, 2017, Mathematical Programming.

[14] Philip Wolfe, et al. An algorithm for quadratic programming, 1956.