Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise

We study the problem of identifying anomalies in a low-rank matrix observed with sub-exponential noise, motivated by applications in retail and inventory management. State-of-the-art approaches to anomaly detection in low-rank matrices fall short in this setting because they require non-anomalous entries to be observed with vanishingly small noise, which is not the case in our problem or, indeed, in many applications. So motivated, we propose a conceptually simple entrywise approach to anomaly detection in low-rank matrices. Our approach accommodates a general class of probabilistic anomaly models. We extend recent work on entrywise error guarantees for matrix completion, establishing such guarantees for sub-exponential matrices in which, in addition to some entries being missing, a fraction of entries are corrupted by an anomaly model that is itself unknown. Viewing anomaly detection as a classification task, we achieve the minimax-optimal detection rate up to logarithmic factors; to the best of our knowledge, ours is the first approach to do so. Using data from a large consumer goods retailer, we show that our approach provides significant improvements over incumbent approaches to anomaly detection.
