InVEST: Intelligent visual email search and triage

Abstract: Large email data sets are often the focus of criminal and civil investigations, and the extraordinary size of many of these collections makes the investigator's task daunting. These sets usually contain many emails that are irrelevant to an investigation, forcing investigators to comb through them manually to find the relevant ones, a process that is costly in both time and money. Our work offers an interactive visual analytic alternative to this manually intensive search for evidence. To aid the investigative process, we combine intelligent preprocessing, a context-aware visual search, and a results display that presents an integrated view of the diverse information contained within emails. This allows an investigator to reduce the number of emails that must be viewed in detail, without the tedious manual search-and-comb process currently required.
