A Convolution-Based System for Malicious URLs Detection

Because web services are essential to daily life, cybersecurity is increasingly important in the digital world. Malicious Uniform Resource Locators (URLs) are a common and serious threat to cybersecurity: they host unsolicited content and lure unsuspecting users into scams that lead to theft of private information, monetary loss, and malware installation. Detecting such threats is therefore imperative. However, traditional blacklist-based approaches to malicious URL detection are easily bypassed and cannot detect newly generated malicious URLs. In this paper, we propose a novel malicious URL detection method based on a deep learning model to protect against web attacks. Specifically, we first use an auto-encoder to represent URLs. The represented URLs are then fed into a proposed composite neural network for detection. To evaluate the proposed system, we conducted extensive experiments on the HTTP CSIC2010 dataset and on a dataset we collected; the experimental results show the effectiveness of the proposed approach.
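As a rough illustration of the general idea of representing a URL numerically and applying convolution to it, the following minimal sketch encodes a URL character by character and computes convolutional features with max-over-time pooling. It is not the authors' model: plain one-hot encoding stands in for the auto-encoder representation, the kernels are random rather than trained, and the names `ALPHABET`, `encode_url`, `one_hot`, and `conv1d_max_pool`, along with all sizes, are assumptions made for this example.

```python
import numpy as np

# Hypothetical character vocabulary; index 0 is reserved for padding/unknown.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
VOCAB_SIZE = len(ALPHABET) + 1


def encode_url(url, max_len=64):
    """Map a URL to a fixed-length vector of character indices."""
    ids = [CHAR_TO_INDEX.get(ch, 0) for ch in url.lower()[:max_len]]
    ids += [0] * (max_len - len(ids))  # right-pad with the padding index
    return np.array(ids, dtype=np.int64)


def one_hot(ids, vocab_size):
    """Turn character indices into a (length, vocab_size) one-hot matrix."""
    out = np.zeros((len(ids), vocab_size), dtype=np.float32)
    out[np.arange(len(ids)), ids] = 1.0
    return out


def conv1d_max_pool(x, kernels):
    """Slide each kernel over the sequence, apply ReLU, then max-pool over time.

    x has shape (length, vocab_size); kernels has shape
    (n_filters, width, vocab_size). Returns one feature per filter.
    """
    n_filters, width, _ = kernels.shape
    n_windows = x.shape[0] - width + 1
    features = np.empty((n_filters, n_windows), dtype=np.float32)
    for start in range(n_windows):
        window = x[start:start + width]
        features[:, start] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    features = np.maximum(features, 0.0)  # ReLU
    return features.max(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Untrained, random kernels: for shape illustration only.
    kernels = rng.normal(size=(8, 3, VOCAB_SIZE)).astype(np.float32)
    x = one_hot(encode_url("http://example.com/index.php?id=1"), VOCAB_SIZE)
    print(conv1d_max_pool(x, kernels).shape)
```

In a trained detector, the pooled feature vector would be passed to a classifier; here it only shows how a variable-length URL becomes a fixed-length feature vector.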
