Web robot detection with semi-supervised learning method

A web robot is an automated information-gathering program. Web robots have caused many problems, such as information leakage, resource consumption, and network security threats, so it is necessary to effectively detect and control web access that originates from them. After summarizing the existing categories of web robot detection methods, we propose a new detection method based on a semi-supervised support vector machine. Experiments on the same test data set show that the new method outperforms other robot detection methods.
