Graphics Processing Units and High-Dimensional Optimization

This paper discusses the potential of graphics processing units (GPUs) in high-dimensional optimization problems. A single GPU card with hundreds of arithmetic cores can be inserted in a personal computer and can dramatically accelerate many statistical algorithms. To exploit these devices fully, optimization algorithms should reduce to multiple parallel tasks, each accessing a limited amount of data. These criteria favor EM and MM algorithms that separate parameters and data. To a lesser extent, block relaxation and coordinate descent and ascent also qualify. We demonstrate the utility of GPUs in nonnegative matrix factorization, PET image reconstruction, and multidimensional scaling. Speedups of 100-fold can easily be attained. Over the next decade, GPUs will fundamentally alter the landscape of computational statistics. It is time for more statisticians to get on board.
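To make the idea of separated parameters concrete, here is a minimal CPU-only sketch of nonnegative matrix factorization using the well-known multiplicative updates of Lee and Seung for the squared-error loss. It is not the paper's GPU implementation. The function names (`matmul`, `update`, `nmf`), the random initialization, and the small `eps` guard are illustrative assumptions. The point is that, within one update, every entry of a factor is computed from the previous iterate alone, so each entry could in principle be assigned to its own GPU thread.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def loss(x, v, w):
    """Squared Frobenius distance between x and the product v w."""
    vw = matmul(v, w)
    return sum((xe - ae) ** 2 for xr, ar in zip(x, vw) for xe, ae in zip(xr, ar))


def update(factor, numer, denom, eps=1e-12):
    """Entrywise multiplicative update: factor * numer / denom.

    Each entry depends only on its own old value and the matching entries of
    numer and denom, so all entries can be updated independently. The eps
    guard against division by zero is an assumption of this sketch.
    """
    return [
        [f * n / (d + eps) for f, n, d in zip(fr, nr, dr)]
        for fr, nr, dr in zip(factor, numer, denom)
    ]


def nmf(x, rank, iters=200, seed=0):
    """Approximate a nonnegative p-by-q matrix x by v (p-by-rank) times w (rank-by-q)."""
    rng = random.Random(seed)
    p, q = len(x), len(x[0])
    # Strictly positive starting values keep every iterate nonnegative.
    v = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(p)]
    w = [[rng.random() + 0.1 for _ in range(q)] for _ in range(rank)]
    history = [loss(x, v, w)]
    for _ in range(iters):
        wt = transpose(w)
        v = update(v, matmul(x, wt), matmul(matmul(v, w), wt))
        vt = transpose(v)
        w = update(w, matmul(vt, x), matmul(matmul(vt, v), w))
        history.append(loss(x, v, w))
    return v, w, history
```

Calling `nmf(x, rank=2)` on a small nonnegative matrix returns the two factors and the loss after each iteration. In a GPU port, the matrix products and the entrywise `update` step would be the natural units to parallelize.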
