Community Trend Outlier Detection Using Soft Temporal Pattern Mining

Numerous applications, such as bank transactions, road traffic, and news feeds, generate temporal datasets in which data evolves continuously. To understand the temporal behavior and characteristics of a dataset and its elements, we need effective tools that can capture the evolution of objects. In this paper, we introduce a novel problem in evolutionary behavior discovery. Given a series of snapshots of a temporal dataset, each of which consists of evolving communities, our goal is to find objects that evolve in a dramatically different way from the other members of their community. We define such objects as community trend outliers. The problem is challenging because evolutionary patterns are hidden deep within noisy evolving datasets, which makes it difficult to distinguish anomalous objects from normal ones. We propose an effective two-step procedure to detect community trend outliers. In the first step, we model the normal evolutionary behavior of communities across time using soft patterns discovered from the dataset. In the second step, we propose measures that evaluate the chance that an object deviates from these normal evolutionary patterns. Experimental results on both synthetic and real datasets show that the proposed approach is highly effective in discovering interesting community trend outliers.
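The two-step idea can be illustrated with a deliberately simplified sketch. It is not the soft pattern mining procedure of the paper: it assumes each object carries a soft community-membership distribution at every snapshot, approximates the "normal trend" by the mean trajectory of objects that start in the same community, and scores each object by its average L1 distance from that trend. All names (`trajectories`, `mean_trajectories`, `outlier_scores`) and the toy data are hypothetical.

```python
from collections import defaultdict

def dominant(dist):
    """Index of the community with the largest membership weight."""
    return max(range(len(dist)), key=lambda k: dist[k])

def mean_trajectories(trajectories):
    """Step 1 (simplified stand-in for pattern discovery): average the soft
    memberships of all objects that start in the same dominant community."""
    groups = defaultdict(list)
    for traj in trajectories.values():
        groups[dominant(traj[0])].append(traj)
    trends = {}
    for g, members in groups.items():
        n = len(members)
        trends[g] = [
            [sum(m[t][k] for m in members) / n for k in range(len(members[0][t]))]
            for t in range(len(members[0]))
        ]
    return trends

def outlier_scores(trajectories):
    """Step 2 (simplified): mean per-snapshot L1 distance between an object's
    trajectory and the trend of the community it started in."""
    trends = mean_trajectories(trajectories)
    scores = {}
    for obj, traj in trajectories.items():
        trend = trends[dominant(traj[0])]
        total = sum(abs(a - b) for p, q in zip(traj, trend) for a, b in zip(p, q))
        scores[obj] = total / len(traj)
    return scores

# Toy data (made up): four objects, three snapshots, two communities.
# Objects a, b, c stay in community 0; object d drifts to community 1.
trajectories = {
    "a": [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1]],
    "b": [[0.8, 0.2], [0.9, 0.1], [0.8, 0.2]],
    "c": [[0.9, 0.1], [0.8, 0.2], [0.8, 0.2]],
    "d": [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]],
}
scores = outlier_scores(trajectories)
top = max(scores, key=scores.get)
print(top, scores)
```

In this toy example the drifting object `d` receives the highest score. A mean trajectory is itself pulled toward the outlier, which is one reason a more robust notion of normal pattern, such as the soft patterns described above, is preferable in practice.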
