Robust malware detection with Dual-Lane AdaBoost

AdaBoost, an effective algorithm that combines weak learners into a strong one, has been applied in a wide range of fields. Traditional AdaBoost operates in the supervised learning setting, so its performance suffers when only a limited number of labeled instances are available. In this paper, we propose a novel Dual-Lane AdaBoost algorithm, which introduces semi-supervised learning into AdaBoost. On the one hand, each weak learner passes the weights on the labeled instances to the subsequent one. On the other hand, unlabeled instances predicted with high confidence are recommended from one weak learner to the next. From the perspective of information flow, this establishes a dual-lane path between the weak learners. In this way, both the labeled and the unlabeled instances are explored and exploited. Consequently, the integrated strong learner can be substantially improved. Experimental results on a malware dataset demonstrate the effectiveness of the proposed algorithm.
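
The two lanes described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper, and several details are assumptions made only for illustration: the weak learners are decision stumps, labels are in {-1, +1}, confidence is measured as the distance of an unlabeled instance from the stump threshold, and recommended instances enter the training set with the current mean weight. The names `fit_stump`, `stump_predict`, `dual_lane_adaboost`, `predict`, and the parameter `margin` are hypothetical.

```python
import math

def fit_stump(X, y, w):
    """Pick the (feature, threshold, sign) stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for s in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (s if x[j] >= t else -s) != yi)
                if err < best_err:
                    best, best_err = (j, t, s), err
    return best, best_err

def stump_predict(stump, x):
    j, t, s = stump
    return s if x[j] >= t else -s

def dual_lane_adaboost(X_lab, y_lab, X_unl, rounds=5, margin=1.0):
    X, y = [list(x) for x in X_lab], list(y_lab)
    pool = [list(x) for x in X_unl]          # unlabeled instances not yet used
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))

        # Lane 1: standard AdaBoost reweighting, passed to the next learner.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, x))
             for x, yi, wi in zip(X, y, w)]

        # Lane 2 (assumed confidence rule): recommend unlabeled instances that
        # lie at least `margin` away from the stump threshold, pseudo-labeled
        # by the current weak learner.
        j, t, _ = stump
        confident = [x for x in pool if abs(x[j] - t) >= margin]
        pool = [x for x in pool if abs(x[j] - t) < margin]
        mean_w = sum(w) / len(w)             # assumed weight for new instances
        for x in confident:
            X.append(x)
            y.append(stump_predict(stump, x))
            w.append(mean_w)

        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote of all weak learners."""
    return [1 if sum(a * stump_predict(s, x) for a, s in ensemble) >= 0 else -1
            for x in X]

# Toy usage on one-dimensional data (illustrative values only).
model = dual_lane_adaboost([[0.0], [1.0], [4.0], [5.0]], [-1, -1, 1, 1],
                           [[-1.0], [0.5], [2.4], [6.0]], rounds=3, margin=2.0)
print(predict(model, [[0.2], [5.5]]))
```

The distance-based confidence rule is chosen here only because it is simple for stumps; any weak learner that exposes a confidence score could play the same role in the second lane.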