Scalable Low-Rank Autoregressive Tensor Learning for Spatiotemporal Traffic Data Imputation

The missing value problem in spatiotemporal traffic data has long been a challenging topic, in particular for large-scale and high-dimensional data with complex missing mechanisms and diverse degrees of missingness. Recent studies based on the tensor nuclear norm have demonstrated the superiority of tensor learning in imputation tasks by effectively characterizing the complex correlations/dependencies in spatiotemporal data. However, despite the promising results, these approaches do not scale well to large tensors. In this paper, we focus on addressing the missing data imputation problem for large-scale spatiotemporal traffic data. To achieve both high accuracy and efficiency, we develop a scalable autoregressive tensor learning model, Low-Tubal-Rank Autoregressive Tensor Completion (LATC-Tubal), based on the existing framework of Low-Rank Autoregressive Tensor Completion (LATC), which is well suited to spatiotemporal traffic data characterized by a multidimensional structure of location $\times$ time of day $\times$ day. In particular, the proposed LATC-Tubal model involves a scalable tensor nuclear norm minimization scheme that integrates a linear unitary transformation. As a result, the tensor nuclear norm minimization can be solved by singular value thresholding on the transformed matrix of each day, while the day-to-day correlation is effectively preserved by the unitary transform matrix. In the experiments, we consider two large-scale 5-minute traffic speed data sets collected by the California PeMS system with 11160 sensors. We compare LATC-Tubal with state-of-the-art baseline models and find that LATC-Tubal can achieve competitive accuracy at a significantly lower computational cost. In addition, LATC-Tubal can also benefit other tasks in modeling large-scale spatiotemporal traffic data, such as network-level traffic forecasting.
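
The thresholding step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`day_mode_unitary`, `svt`, `transformed_svt`), the threshold `tau`, and the choice of building the unitary matrix from the eigenvectors of the day-mode Gram matrix are all assumptions made for illustration. The sketch only shows the general pattern: apply a unitary transform along the day dimension, run singular value thresholding on each transformed location $\times$ time-of-day matrix, then invert the transform.

```python
import numpy as np


def day_mode_unitary(tensor):
    """Return an orthogonal (days x days) matrix built from the data.

    Assumption for illustration: use the eigenvectors of the Gram matrix of
    the day-mode unfolding. Any orthogonal/unitary matrix could be used.
    """
    n_days = tensor.shape[2]
    unfold = tensor.reshape(-1, n_days)        # rows: (location, time of day)
    gram = unfold.T @ unfold                   # symmetric, days x days
    _, eigvecs = np.linalg.eigh(gram)          # columns are orthonormal
    return eigvecs.T                           # rows are orthonormal


def svt(mat, tau):
    """Singular value thresholding: shrink singular values by tau."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (u * s) @ vt


def transformed_svt(tensor, q, tau):
    """Apply SVT to each day slice in the transformed domain.

    tensor: array of shape (locations, times_of_day, days)
    q:      orthogonal matrix of shape (days, days)
    tau:    non-negative shrinkage threshold (hypothetical parameter)
    """
    # Transform along the day dimension: X_hat[:, :, l] = sum_k X[:, :, k] q[l, k]
    transformed = np.einsum("ijk,lk->ijl", tensor, q)
    shrunk = np.empty_like(transformed)
    for d in range(transformed.shape[2]):
        shrunk[:, :, d] = svt(transformed[:, :, d], tau)
    # Inverse transform; valid because q is orthogonal (q.T @ q = I)
    return np.einsum("ijl,lk->ijk", shrunk, q)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((6, 8, 4))         # toy data, not PeMS
    q = day_mode_unitary(x)
    x_low = transformed_svt(x, q, tau=1.0)
    print(x_low.shape)
```

Each slice is thresholded independently, so the cost is one small SVD per day rather than an SVD of a large unfolded matrix, which is the scalability argument made in the abstract.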
