Speeding Up Latent Dirichlet Allocation with Parallelization and Pipeline Strategies

Previous methods of distributed Gibbs sampling for latent Dirichlet allocation (LDA) run into either a memory or a communication bottleneck. To improve scalability, this chapter\(^\dagger\) presents two strategies: (1) parallelization, which carefully assigns documents to processors based on word locality, and (2) pipelining, which masks communication behind computation through a pipeline scheme. In addition, we employ a scheduling algorithm to ensure load balancing both spatially (among machines) and temporally. Experiments show that our strategies significantly reduce the unparallelizable communication bottleneck and achieve good load balancing, thereby improving LDA's scalability.
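
The pipelining strategy can be illustrated with a small sketch: while one bundle of words is being Gibbs-sampled on the CPU, the topic counts needed by the next bundle are fetched in the background, so communication is hidden behind computation. This is only a minimal illustration under stated assumptions, not the chapter's actual implementation; the helper names (fetch_topic_counts, sample_bundle, push_updates) are hypothetical stand-ins for the fetch/compute/update stages.

```python
from concurrent.futures import ThreadPoolExecutor

def pipelined_gibbs_pass(word_bundles, fetch_topic_counts, sample_bundle, push_updates):
    """One sampling pass over word bundles, overlapping communication with computation.

    word_bundles: ordered groups of words assigned to this worker (hypothetical input).
    fetch_topic_counts / sample_bundle / push_updates: placeholder callables for the
    fetch, Gibbs-sampling, and update stages of the pipeline.
    """
    if not word_bundles:
        return
    with ThreadPoolExecutor(max_workers=2) as io:
        # Kick off the fetch for the first bundle before sampling starts.
        pending = io.submit(fetch_topic_counts, word_bundles[0])
        for i, bundle in enumerate(word_bundles):
            counts = pending.result()                  # wait for this bundle's counts
            if i + 1 < len(word_bundles):
                # Prefetch the next bundle's counts while this one is sampled.
                pending = io.submit(fetch_topic_counts, word_bundles[i + 1])
            deltas = sample_bundle(bundle, counts)     # CPU-bound Gibbs sampling
            io.submit(push_updates, deltas)            # send count updates asynchronously
    # Leaving the "with" block waits for any outstanding fetches and updates to finish.
```

With two background workers, the outgoing update for one bundle can overlap the prefetch for the next, so sampling only stalls when a fetch takes longer than the computation it is hidden behind.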
