Tensor Factorization for Heterogeneous and Spatio-temporal Data

We observe many kinds of events in our daily lives and try to represent them in a simple manner. The relational model is one of the de facto systems for storing such events digitally, because it represents data more flexibly than data models with strict predefined structures. Discovering interpretable patterns in relational data is a traditional way to understand the latent characteristics or trends of the phenomena under study. These patterns can be used to recover missing records that have not been observed in the relational data, and leveraging the latent trends yields better predictions of future events than using the observed data directly. In practical data analysis, however, relational data are often sparse and noisy: the data contain far fewer records than the total number of possible attribute combinations, and the records that are present are corrupted by noise. Pattern extraction methods may then produce undesired results because too little information is available. In this dissertation, we attempt to solve the sparseness and noise problems by developing data analysis methods that adopt side information to discover interpretable and useful patterns. The dissertation addresses three essential challenges: (1) developing a method that merges heterogeneous information from associated relational data to extract patterns, (2) leveraging spatio-temporal structures, which can be represented as graphs or groups of data attributes, to encourage latent patterns to be smooth, and (3) learning dependencies among attributes in a data-driven manner and modifying the spatio-temporal structures to capture patterns suited to prediction problems. To tackle these challenges, we propose tensor factorization methods that extract interpretable patterns by decomposing a tensor into latent factors.
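As a rough illustration of what decomposing a tensor into latent factors can look like, the sketch below fits a non-negative CP (PARAFAC) model to a three-way array using standard multiplicative updates for the squared error. It is not the method proposed in this dissertation: it uses no side information and no spatio-temporal regularization, and the function names (`khatri_rao`, `unfold`, `nonneg_cp`, `reconstruct`), the rank, and the iteration count are all illustrative assumptions.

```python
import numpy as np


def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R) -> (J*K x R)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)


def unfold(X, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest (C order)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def nonneg_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP factorization of a 3-way tensor (illustrative sketch).

    Each factor matrix is updated in turn with a multiplicative rule for the
    squared reconstruction error, which keeps all entries non-negative.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) + 0.1 for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Remaining factors, in the same order as the unfolded columns.
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(X, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps  # eps avoids division by zero
            factors[mode] *= numer / denom
    return factors


def reconstruct(factors):
    """Rebuild the tensor as a sum of rank-one outer products."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)


if __name__ == "__main__":
    # Synthetic non-negative tensor with a known rank-2 structure.
    rng = np.random.default_rng(1)
    truth = [rng.random((n, 2)) for n in (6, 5, 4)]
    X = reconstruct(truth)
    est = nonneg_cp(X, rank=2)
    rel_err = np.linalg.norm(X - reconstruct(est)) / np.linalg.norm(X)
    print("relative reconstruction error:", rel_err)
```

Each column of a factor matrix can be read as one latent pattern along that mode (for example users, locations, or time steps); non-negativity is what makes such columns interpretable as additive parts.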
For the first challenge, we develop Non-negative Multiple Matrix Factorization (NMMF), which decomposes a target matrix and auxiliary matrices simultaneously. As an exten-
