Anomaly detection in smart grids with imbalanced data methods

Research on anomaly-based intrusion detection within smart grids is a recent topic that only a few researchers have investigated so far. Thus, little experience is available on how to address the problem of detecting anomalies in smart grids. Another problem emerges when one tries to use common pattern recognition approaches. Because the data in such systems is typically highly imbalanced (there are many more normal instances than attack instances), the misclassification rate for the attack, or minority, class is often high. To study this issue, this paper investigates the use of resampling techniques for intrusion detection inside a hierarchical, three-layer smart grid communication system using a relatively new data set called ADFA-LD, which includes contemporary attacks and is well known for evaluating the performance of anomaly-based intrusion detection systems. Results compare the performance of typical and resampling-based techniques, demonstrating that the use of resampling leads to improved detection of attacks within a smart grid communication system.
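To make the idea of resampling concrete, the following is a minimal sketch of random oversampling, one of the simplest resampling techniques. The abstract does not say which resampling methods the paper evaluates, so this is an illustrative assumption rather than the authors' method. The function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (illustrative helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        # Indices of the original rows that belong to this class.
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        # Add copies until this class reaches the majority count.
        for _ in range(target - n):
            j = rng.choice(idx)
            out_x.append(features[j])
            out_y.append(cls)
    return out_x, out_y


# Toy data (made up): 8 "normal" rows and 2 "attack" rows.
X = [[i, i % 3] for i in range(10)]
y = ["normal"] * 8 + ["attack"] * 2

X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)
```

In practice, resampling of this kind is usually applied to the training split only, so that evaluation still reflects the original class imbalance.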
