An AO-ADMM approach to constraining PARAFAC2 on all modes

Analyzing multi-way measurements that vary across one mode of the dataset is a challenge in fields such as data mining, neuroscience, and chemometrics. For example, measurements may evolve over time or have unaligned time profiles. The PARAFAC2 model has been used successfully to analyze such data by allowing the underlying factor matrices in one mode (i.e., the evolving mode) to change across slices. The traditional approach to fitting a PARAFAC2 model is an alternating least squares-based algorithm, which handles the constant cross-product constraint of the PARAFAC2 model by estimating the evolving factor matrices implicitly. This implicit estimation makes it difficult to impose regularization on these factor matrices, and there is currently no algorithm that flexibly imposes such regularization with general penalty functions and hard constraints. To address this challenge and avoid the implicit estimation, we propose an algorithm for fitting PARAFAC2 based on alternating optimization with the alternating direction method of multipliers (AO-ADMM). With numerical experiments on simulated data, we show that the proposed PARAFAC2 AO-ADMM approach allows for flexible constraints, recovers the underlying patterns accurately, and is computationally efficient compared to the state of the art. We also apply our model to a real-world chromatography dataset and show that constraining the evolving mode improves the interpretability of the extracted patterns.
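The constant cross-product constraint mentioned above can be illustrated with a small simulation. The sketch below is not the proposed AO-ADMM algorithm; it only builds synthetic slices under the usual PARAFAC2 formulation, assumed here to be X_k = A D_k B_k^T with B_k = P_k ΔB and P_k having orthonormal columns. The symbol names, the helper `simulate_parafac2`, and all sizes are illustrative assumptions, not taken from the text above.

```python
import numpy as np

def simulate_parafac2(n_slices=4, n_rows=5, n_cols=6, rank=3, seed=0):
    """Build synthetic slices X_k = A diag(c_k) B_k^T (assumed notation)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_rows, rank))            # shared factor matrix
    C = rng.uniform(0.5, 1.5, size=(n_slices, rank))   # row k holds diag(D_k)
    delta_B = rng.standard_normal((rank, rank))        # shared across slices
    B_list = []
    for _ in range(n_slices):
        # Reduced QR of a random matrix gives P_k with orthonormal columns.
        P_k, _ = np.linalg.qr(rng.standard_normal((n_cols, rank)))
        B_list.append(P_k @ delta_B)                   # evolving factor B_k
    X_list = [A @ np.diag(C[k]) @ B_list[k].T for k in range(n_slices)]
    return A, B_list, C, X_list

A, B_list, C, X_list = simulate_parafac2()

# B_k^T B_k should be the same matrix for every slice k ...
cross_products = [B_k.T @ B_k for B_k in B_list]
constant = all(np.allclose(cp, cross_products[0]) for cp in cross_products)

# ... even though the evolving factor matrices themselves differ.
differ = not np.allclose(B_list[0], B_list[1])

print("constant cross product:", constant)
print("evolving factors differ:", differ)
```

In this construction the B_k are generated explicitly, which is only possible because the data are simulated; when fitting real data, the constraint has to be enforced by the algorithm, which is the difficulty the abstract describes.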
