Detecting Intruders by User File Access Patterns

Our society faces a growing threat from data breaches in which confidential information is stolen from computer servers. To steal data, hackers must first gain entry into the targeted systems, and commercial off-the-shelf intrusion detection systems are unable to defend against these intruders effectively. This research uses cyber behavior analytics to study and report how anomalous behavior compares to normal behavior. In this paper, we present machine learning methods that detect intruders from the file access patterns within a user file directory. We propose a set of behavioral features that describe a user's file access patterns in a file system. We validate the effectiveness of these features by conducting experiments on an existing file system dataset with four classification algorithms. To limit false alarms, we trained and tested the classifiers by optimizing their performance within the lower range of the false positive rate. The results of our experiments show that our approach detected intruders with an F1 score of 0.94 and a false positive rate of less than 3%.
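
The abstract does not spell out how performance was optimized in the low false-positive range, so the following is only a minimal sketch of one plausible reading: choose the decision threshold that maximizes F1 while keeping the false positive rate at or below a cap. The function names, the cap parameter `max_fpr`, and the toy labels and scores are all assumptions made for illustration; they are not taken from the paper or its dataset.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = intruder."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN); defined as 0.0 when there are no negatives."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def best_threshold_under_fpr(y_true, scores, max_fpr):
    """Hypothetical helper: scan candidate thresholds and keep the one with
    the highest F1 among those whose FPR does not exceed max_fpr.
    Returns (threshold, f1, fpr), or None if no threshold satisfies the cap."""
    best = None
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        tp, fp, fn, tn = confusion_counts(y_true, y_pred)
        fpr = false_positive_rate(fp, tn)
        if fpr > max_fpr:
            continue  # too many false alarms at this threshold
        f1 = f1_score(tp, fp, fn)
        if best is None or f1 > best[1]:
            best = (thr, f1, fpr)
    return best


# Toy data (made up): 1 = intruder session, 0 = legitimate user session.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.4]

strict = best_threshold_under_fpr(y_true, scores, max_fpr=0.0)
relaxed = best_threshold_under_fpr(y_true, scores, max_fpr=0.25)
print(strict, relaxed)
```

Tightening the cap trades detection for fewer false alarms: in this toy data, the strict setting picks a higher threshold and misses one intruder, while the relaxed setting catches all four at the cost of one false alarm.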
