Hybrid Classification of WEB Trojan Exploiting Small Volume of Labeled Data Vectors

This paper introduces a Stacked Denoising Autoencoder (SdAE), an unsupervised deep neural network, combined with a typical Back Propagation (BP) Artificial Neural Network (ANN), capable of efficiently detecting WEB Trojan malware. Several researchers in the literature employ Machine Learning (ML) to detect WEB Trojans. The data used in this paper come from a WEB security gateway, where tagged data are scarcer than unlabeled data. Based on the literature, simple Supervised Learning (SULE) is not efficient enough for this task. The algorithm proposed herein is hybrid. It first employs Unsupervised Learning (UNLE) based on the SdAE to pre-train the network on the data, one layer at a time, which results in more robust feature vectors. Then, in the fine-tuning process, minor adjustments are made through SULE based on a BP ANN. The proposed approach ensures that the developed model can still perform accurately even when the training data set contains only a small number of tagged data vectors. This research verifies that the hybrid Deep Learning approach to WEB Trojan detection outperforms other common classification methods.
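The two-stage procedure summarized above (unsupervised layer-wise pre-training with denoising autoencoders, followed by supervised fine-tuning with back propagation) can be illustrated with a minimal NumPy sketch. This is not the implementation used in the paper. The synthetic data, the masking noise, the squared reconstruction error, the sigmoid units, the layer sizes, and every function name below are assumptions chosen only for illustration, with inputs assumed to be scaled to [0, 1].

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_dae_layer(X, n_hidden, rng, noise=0.3, lr=0.5, epochs=50):
    """Train one denoising autoencoder layer; return its encoder weights and bias."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))       # encoder weights
    b = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_in))   # decoder weights (discarded later)
    c = np.zeros(n_in)
    for _ in range(epochs):
        mask = rng.random(X.shape) > noise           # masking noise (assumed noise type)
        X_noisy = X * mask
        H = sigmoid(X_noisy @ W + b)
        R = sigmoid(H @ W_dec + c)
        # Gradient of the squared error between the reconstruction and the CLEAN input.
        dR = (R - X) * R * (1 - R) / len(X)
        dH = (dR @ W_dec.T) * H * (1 - H)
        W_dec -= lr * H.T @ dR
        c -= lr * dR.sum(axis=0)
        W -= lr * X_noisy.T @ dH
        b -= lr * dH.sum(axis=0)
    return W, b

def pretrain_stack(X_unlabeled, layer_sizes, rng):
    """Greedy pre-training: each layer is trained on the codes of the previous one."""
    layers, H = [], X_unlabeled
    for n_hidden in layer_sizes:
        W, b = pretrain_dae_layer(H, n_hidden, rng)
        layers.append([W, b])
        H = sigmoid(H @ W + b)
    return layers

def forward(layers, X):
    """Return the activations of every layer, starting with the input itself."""
    acts = [X]
    for W, b in layers:
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def fine_tune(layers, X, y, rng, lr=0.5, epochs=500):
    """Add a sigmoid output unit and adjust all weights by back propagation."""
    layers = [[W.copy(), b.copy()] for W, b in layers]
    n_top = layers[-1][0].shape[1]
    layers.append([rng.normal(0.0, 0.1, (n_top, 1)), np.zeros(1)])
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        acts = forward(layers, X)
        delta = (acts[-1] - y) / len(X)              # cross-entropy loss, sigmoid output
        for i in reversed(range(len(layers))):
            W, b = layers[i]
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:                                # propagate the error to the layer below
                delta = (delta @ W.T) * acts[i] * (1 - acts[i])
            layers[i][0] = W - lr * grad_W
            layers[i][1] = b - lr * grad_b
    return layers

# Synthetic stand-in data: many unlabeled vectors, few labeled ones.
rng = np.random.default_rng(0)
X_unlabeled = rng.random((500, 20))
X_labeled = rng.random((40, 20))
y_labeled = (X_labeled[:, :5].sum(axis=1) > 2.5).astype(float)

pretrained = pretrain_stack(X_unlabeled, [12, 6], rng)          # unsupervised stage
model = fine_tune(pretrained, X_labeled, y_labeled, rng)        # supervised stage
probs = forward(model, X_labeled)[-1].ravel()                   # Trojan probability per vector
```

In this sketch the unlabeled vectors shape the encoder weights, and the small labeled set is used only in the fine-tuning stage, which mirrors the motivation stated above. The decoder weights of each autoencoder are discarded after pre-training because only the encoders are stacked into the classifier.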