X-ANOVA ranked features for Android malware analysis

The proposed framework is a static analysis approach for classifying Android malware. From each Android .apk file, three distinct feature types are extracted, namely (a) opcodes, (b) methods and (c) permissions. To rank features whose variance differs strongly between the malware and benign training sets, conventional Analysis of Variance (ANOVA) was modified, and a novel technique referred to as X-ANOVA is proposed. Specifically, X-ANOVA is used to reduce the dimensionality of the large feature space in order to minimize classification error and the processing overhead incurred during the learning phase. The accuracy of the proposed system is computed using three classifiers (J48, AdaBoostM1, RandomForest), and the performance is compared with a voted classification approach. Using independent classifiers, an overall accuracy of 88.30% is obtained with opcodes, 87.81% with methods and 90.47% with permissions as features. Using the voted classification approach, accuracies of 88.27% and 87.53% are obtained for opcodes and methods respectively, while an improved accuracy of 90.63% is obtained with permissions. These initial results are promising and demonstrate that the proposed approach can be used to assist mobile antivirus tools.
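To make the ranking idea concrete, the following is a minimal sketch of feature ranking with the conventional one-way ANOVA F-statistic over two classes (malware and benign). It is not X-ANOVA itself, since the modification is not specified here. The function names `anova_f_score` and `rank_features`, the `top_k` parameter, and the toy permission data are all assumptions made for illustration.

```python
from statistics import mean


def anova_f_score(malware_values, benign_values):
    """Conventional one-way ANOVA F-statistic for one feature, two classes.

    Illustrative only: this is standard ANOVA, not the X-ANOVA variant.
    """
    groups = [malware_values, benign_values]
    all_values = malware_values + benign_values
    grand_mean = mean(all_values)

    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of samples around their own class mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    if ss_within == 0:
        # Perfectly separated (or constant) feature: avoid division by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def rank_features(malware, benign, top_k):
    """Return the top_k feature names ordered by descending F-score.

    `malware` and `benign` map a feature name to its per-sample values.
    """
    scores = {name: anova_f_score(malware[name], benign[name]) for name in malware}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: 1 if the app requests the permission, else 0.
malware = {"SEND_SMS": [1, 1, 1, 0], "INTERNET": [1, 0, 1, 0]}
benign = {"SEND_SMS": [0, 0, 0, 1], "INTERNET": [1, 0, 1, 0]}
print(rank_features(malware, benign, top_k=1))
```

Keeping only the highest-scoring features is what shrinks the feature space before the classifiers are trained; the cutoff `top_k` would need to be tuned empirically.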