Towards an Efficient Multi-way Factorization of Multi-dimensional Big Data across a GPU Cluster

Examining massive multi-dimensional data by extracting its embedded multi-way factors has long been an important problem across many disciplines. As both the scale and the dimensionality of the data under analysis grow rapidly, the challenge is to capture the dynamics of large-scale tensors without introducing significant distortion during factorization in sophisticated applications. A massively parallel computing framework, H-PARAFAC, has been developed to enable Parallel Factor Analysis (PARAFAC) of massive tensors based on a "divide-and-conquer" theory (a modified alternating least squares approach). The hierarchical framework combines a coarse-grained model that coordinates the processing of sub-tensors with a fine-grained parallel model that computes each sub-tensor and fuses the resulting sub-factors. Experiments on a GPU cluster indicate that (1) the proposed method removes the limit on the size of data that can be factorized, and (2) it dramatically outperforms its traditional counterparts in both scalability and efficiency; e.g., the runtime increases linearly as the data volume increases on the order of n^3.
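For orientation, the alternating least squares (ALS) idea that PARAFAC builds on can be sketched for a small three-way tensor. This is a minimal single-machine NumPy sketch of plain ALS, not the H-PARAFAC framework: it does not partition the tensor, fuse sub-factors, or use a GPU. The function names (`khatri_rao`, `unfold`, `parafac_als`, `reconstruct`) and the parameters `rank`, `n_iter`, and `seed` are assumptions made for illustration.

```python
import numpy as np

def khatri_rao(B, C):
    # Column-wise Kronecker product; row index is j * K + k, shape (J*K, R).
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def unfold(X, mode):
    # Mode-n unfolding: the chosen mode becomes the rows, the remaining
    # modes are flattened in their original (C) order.
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def parafac_als(X, rank, n_iter=100, seed=0):
    # Plain ALS for a 3-way tensor (illustrative, hypothetical helper).
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Fix the other two factors and solve a linear least-squares
            # problem for the factor of the current mode.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)  # equals kr.T @ kr
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    # Sum of rank-one terms: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r].
    A, B, C = factors
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

Each inner step updates one factor matrix while the other two are held fixed, so the reconstruction error cannot increase from one update to the next. A divide-and-conquer scheme in the spirit of the abstract would apply a step like this to sub-tensors and then combine the sub-factors; that fusion step is not shown here.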
