Scalable Data Clustering using GPU Clusters

The computational demands of multivariate clustering grow rapidly, so processing large data sets, such as those found in flow cytometry, is very time-consuming on a single CPU. Fortunately, these techniques lend themselves naturally to large-scale parallel processing. To address the computational demands, graphics processing units, specifically NVIDIA's CUDA framework and Tesla architecture, were investigated as a low-cost, high-performance platform for a number of clustering algorithms. C-means and Expectation Maximization with Gaussian mixture models were implemented using the CUDA framework. The implementations use a hybrid of CUDA, OpenMP, and MPI to scale to many GPUs on multiple nodes in a high-performance computing environment. This framework is envisioned as part of a larger cloud-based workflow service where biologists can apply multiple algorithms and parameter sweeps to their data sets and quickly receive a thorough set of results that can be further analyzed by experts. Improvements over previous GPU-accelerated implementations range from 1.42x to 21x for C-means and from 3.72x to 5.65x for the Gaussian mixture model on non-trivial data sets. On flow cytometry files, a single NVIDIA GTX 260 achieves average speedups of 90x for C-means and 74x for the Gaussian mixture model compared to optimized C code running on a single core of a modern Intel CPU. On the TeraGrid "Lincoln" high-performance cluster at NCSA, C-means achieves 42% parallel efficiency and a CPU speedup of 4794x with 128 Tesla C1060 GPUs. The Gaussian mixture model achieves 72% parallel efficiency and a CPU speedup of 6286x. Copyright © 2011 John Wiley & Sons, Ltd.
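
The cluster figures above can be related to one another with a short worked calculation. This is a sketch under an assumption that the abstract does not state: that parallel efficiency is defined as $E = S_N / (N \cdot S_1)$, where $S_N$ is the speedup over the CPU baseline with $N$ GPUs and $S_1$ is the speedup with a single GPU of the same type. The symbols $E$, $S_N$, $S_1$, and $N$ are introduced here for illustration only.

```latex
% Assumed definition (not stated in the abstract): E = S_N / (N * S_1)
\[
  S_1 = \frac{S_N}{N \cdot E}
\]
% C-means: S_N = 4794, N = 128, E = 0.42
\[
  S_1^{\text{C-means}} \approx \frac{4794}{128 \times 0.42}
                       = \frac{4794}{53.76} \approx 89
\]
% Gaussian mixture model: S_N = 6286, N = 128, E = 0.72
\[
  S_1^{\text{GMM}} \approx \frac{6286}{128 \times 0.72}
                   = \frac{6286}{92.16} \approx 68
\]
```

Under that assumed definition, the implied single-GPU speedups on the Tesla C1060 would be roughly 89x and 68x. These values are derived from the reported figures, not measurements, and they need not equal the 90x and 74x averages reported for the GTX 260, which is a different GPU evaluated on flow cytometry files.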
