TargetVue: Visual Analysis of Anomalous User Behaviors in Online Communication Systems

Users with anomalous behaviors in online communication systems (e.g., email and social media platforms) are potential threats to society. Automated anomaly detection based on advanced machine learning techniques has been developed to combat this issue; challenges remain, though, due to the difficulty of obtaining proper ground truth for model training and evaluation. Therefore, substantial human judgment on the automated analysis results is often required to better adjust the performance of anomaly detection. Unfortunately, techniques that allow users to understand the analysis results more efficiently, to make a confident judgment about anomalies, and to explore data in their context are still lacking. In this paper, we propose a novel visual analysis system, TargetVue, which detects anomalous users via an unsupervised learning model and visualizes the behaviors of suspicious users in behavior-rich context through novel visualization designs and multiple coordinated contextual views. In particular, TargetVue incorporates three new ego-centric glyphs that visually summarize a user's behaviors, effectively presenting the user's communication activities, features, and social interactions. An efficient layout method is proposed to place these glyphs on a triangle grid, which captures similarities among users and facilitates comparisons of behaviors of different users. We demonstrate the power of TargetVue through its application in a social bot detection challenge using Twitter data, a case study based on email records, and an interview with expert users. Our evaluation shows that TargetVue is beneficial to the detection of users with anomalous communication behaviors.
