ALO-NMF: Accelerated Locality-Optimized Non-negative Matrix Factorization

Non-negative Matrix Factorization (NMF) is a key kernel for unsupervised dimension reduction, used in a wide range of applications including graph mining, recommender systems, and natural language processing. Because applications that perform repeated NMF are compute-intensive, several parallel implementations have been developed. However, existing parallel NMF algorithms have not addressed data locality optimizations, which are critical for high performance because data movement costs greatly exceed the cost of arithmetic and logic operations on current computer systems. In this paper, we present a novel optimization method for a parallel NMF algorithm based on the HALS (Hierarchical Alternating Least Squares) scheme that incorporates algorithmic transformations to enhance data locality. We develop efficient realizations of the algorithm on multi-core CPUs and GPUs, demonstrating a new Accelerated Locality-Optimized NMF (ALO-NMF) that achieves up to 2.29x lower data movement cost and up to 4.45x speedup over existing state-of-the-art parallel NMF algorithms.
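
For readers unfamiliar with the HALS scheme named above, the following is a minimal NumPy sketch of the standard, textbook HALS updates for A ≈ WH. It is not the locality-optimized variant proposed in this paper, and it is neither parallel nor tuned for data movement. The function name hals_nmf, its parameters, the random initialization, and the small eps floor are assumptions made purely for illustration.

```python
import numpy as np

def hals_nmf(A, rank, iters=100, eps=1e-12, seed=0):
    """Plain HALS sketch: factor non-negative A (m x n) as W (m x rank) @ H (rank x n)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        # Update H one row at a time, holding W fixed.
        WtA = W.T @ A          # rank x n
        WtW = W.T @ W          # rank x rank
        for j in range(rank):
            grad = WtA[j, :] - WtW[j, :] @ H
            H[j, :] = np.maximum(eps, H[j, :] + grad / max(WtW[j, j], eps))
        # Update W one column at a time, holding H fixed.
        AHt = A @ H.T          # m x rank
        HHt = H @ H.T          # rank x rank
        for j in range(rank):
            grad = AHt[:, j] - W @ HHt[:, j]
            W[:, j] = np.maximum(eps, W[:, j] + grad / max(HHt[j, j], eps))
    return W, H

# Hypothetical usage on a small synthetic matrix of exact rank 3.
rng = np.random.default_rng(1)
A = rng.random((20, 3)) @ rng.random((3, 15))
W, H = hals_nmf(A, rank=3)
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
```

Each inner loop sweeps over every row of H (or column of W) and re-reads the full factor matrix on each step. That repeated traffic is the kind of data movement a locality-oriented reformulation would aim to reduce.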
