deepBF: Malicious URL detection using Learned Bloom Filter and Evolutionary Deep Learning

Malicious URL detection is an emerging research area, driven by the continuous modernization of systems such as edge computing. In this article, we present a novel malicious URL detection technique called deepBF (deep learning and Bloom Filter). deepBF has two parts. First, we propose a learned Bloom Filter built on a 2-dimensional Bloom Filter. We experimentally select the best non-cryptographic string hash function and then derive a modified version of it for deepBF by introducing biases into the hashing method. The modified string hash function is compared with other variants of non-cryptographic string hash functions. deepBF is also compared with other filters, namely the counting Bloom Filter, the filter of Kirsch et al., and the Cuckoo Filter, across several use cases that expose the strengths and weaknesses of each filter. Second, we propose a malicious URL detection mechanism using deepBF. An evolutionary convolutional neural network is trained and tested on malicious URL datasets to identify malicious URLs, and its output is tested in deepBF for accuracy. Our experimental evaluation supports several conclusions, which are presented in the article.
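As an illustration of the general learned Bloom Filter idea referred to above, the following is a minimal sketch and not the deepBF design itself. It assumes the common arrangement from the literature in which a classifier answers first and a backup Bloom Filter stores the known members the classifier misses. The names `BloomFilter`, `LearnedFilter`, and `toy_score` are hypothetical; `toy_score` is a keyword rule standing in for a trained network; the filter is one-dimensional rather than 2-dimensional; and SHA-256 is used only for brevity, whereas the article studies non-cryptographic string hash functions.

```python
import hashlib


class BloomFilter:
    """Plain one-dimensional Bloom Filter (illustrative, not the 2-D variant)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two base hashes (double hashing).
        # SHA-256 is an assumption made for brevity, not the article's choice.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def toy_score(url):
    # Hypothetical stand-in for a trained classifier's maliciousness score.
    return 1.0 if "login-verify" in url else 0.0


class LearnedFilter:
    """Classifier first, backup Bloom Filter for the members it misses."""

    def __init__(self, known_malicious, score=toy_score, threshold=0.5):
        self.score = score
        self.threshold = threshold
        self.backup = BloomFilter()
        for url in known_malicious:
            # Only URLs the classifier fails to flag go into the backup filter.
            if self.score(url) < self.threshold:
                self.backup.add(url)

    def is_malicious(self, url):
        return self.score(url) >= self.threshold or url in self.backup
```

With this arrangement every URL in the known-malicious set is reported as malicious, either by the classifier or by the backup filter, so the structure keeps the no-false-negative property of a Bloom Filter; URLs outside the set may still be reported as false positives by either component.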
