Non‐negative residual matrix factorization: problem definition, fast solutions, and applications

Matrix factorization is a powerful tool for finding graph patterns such as communities and anomalies. A recent trend is to improve the usability of the discovered graph patterns by encoding interpretation-friendly properties (e.g., non-negativity, sparseness) in the factorization. Most, if not all, of these methods are tailored to the task of community detection. We propose NrMF, a non-negative residual matrix factorization framework that aims to improve interpretability for graph anomaly detection. We present two optimization formulations and their corresponding solutions. Our method can naturally capture abnormal behaviors on graphs. We further generalize it to admit sparsity constraints on the residual matrix. We analyze the effectiveness and efficiency of the proposed algorithms, showing that they (i) lead to a local optimum and (ii) scale to large graphs. Experimental results on several data sets validate both their effectiveness and their efficiency. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 5: 3–15, 2012. (Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053. It is continuing through participation in the Anomaly Detection at Multiple Scales (ADAMS) program sponsored by the U.S. Defense Advanced Research Projects Agency (DARPA) under Agreement Number W911NF-11-C-0200. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.)
