Data Loss Prevention Using Document Semantic Signature

Protecting data and detecting and preventing insider threats are significant steps that organizations should take to strengthen their internal security. Data loss prevention (DLP) is an emerging mechanism that organizations use to detect and block unauthorized data transfers. Existing DLP approaches, however, face several practical challenges that limit their effectiveness. In this chapter, we present a new DLP approach that extracts and analyzes the semantics of document content and addresses many of these challenges. We introduce the notion of a document semantic signature, a summarized representation of a document's semantics. Experiments on a public dataset show that the semantic signature can be used to detect data leaks, with encouraging results: an average false positive rate (FPR) of 6.71% and an average detection rate (DR) of 84.47%.
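For readers unfamiliar with the two reported metrics, the sketch below computes them from confusion-matrix counts using their standard definitions: DR = TP / (TP + FN) and FPR = FP / (FP + TN). The function name `detection_metrics` and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the chapter's experiments, and the chapter may aggregate its averages differently.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return (detection_rate, false_positive_rate) as fractions.

    tp: leaked documents correctly flagged
    fn: leaked documents missed
    fp: benign documents wrongly flagged
    tn: benign documents correctly passed
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative sample")
    detection_rate = tp / (tp + fn)        # also called recall or true positive rate
    false_positive_rate = fp / (fp + tn)
    return detection_rate, false_positive_rate


# Hypothetical counts, for illustration only (not the chapter's data).
dr, fpr = detection_metrics(tp=80, fn=20, fp=5, tn=95)
print(f"DR = {dr:.2%}, FPR = {fpr:.2%}")  # prints "DR = 80.00%, FPR = 5.00%"
```

A high DR together with a low FPR means that most leaks are caught while few legitimate transfers are blocked, which is the trade-off a DLP system has to balance.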
