Malware detection on Android smartphones using keywords vector and SVM

With the spread of smartphones, a growing amount of mobile malware has appeared, especially on popular platforms such as Android, where it can harm users' information. Effectively detecting new malware and variants of existing malware remains a difficult problem. Unlike traditional feature extraction methods, which operate on binary programs, this paper presents a feature extraction method that operates on Java source code. The method uses the Keywords Correlation Distance to compute the correlation between key code elements in Android malware source code, such as API calls, Android permissions, common parameters, and common keywords. An SVM is then trained on these features so that the system can adapt to new malicious samples and detect both new and existing malware. The method differs from conventional approaches, which are based on the textual context of the code: it combines the characteristics of the malware categories with those of the operating environment to record the behavior of the malicious software. Experiments show that the method is efficient and effective in detecting malware on the Android platform.
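The pipeline described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not define the Keywords Correlation Distance, so the `keyword_distance_features` helper, its closeness formula (based on the smallest token gap between two keywords), the `KEYWORDS` list, and the two toy samples are all hypothetical. The classifier is a small linear SVM trained by subgradient descent on the hinge loss, standing in for whichever SVM implementation the authors used.

```python
import re
from itertools import combinations

# Hypothetical keyword list mixing API calls and permission names.
KEYWORDS = ["sendTextMessage", "getDeviceId", "SEND_SMS", "READ_CONTACTS"]

def tokenize(source):
    """Split Java source text into identifier-like tokens."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)

def keyword_distance_features(source, keywords=KEYWORDS, max_gap=50):
    """Return one closeness value in [0, 1] per keyword pair.

    Assumed definition: closeness = 1 - min_gap / max_gap, where min_gap
    is the smallest token distance between occurrences of the two
    keywords. The value is 0 if either keyword is absent or the gap
    is max_gap or larger.
    """
    tokens = tokenize(source)
    positions = {k: [i for i, t in enumerate(tokens) if t == k] for k in keywords}
    features = []
    for a, b in combinations(keywords, 2):
        gaps = [abs(i - j) for i in positions[a] for j in positions[b]]
        gap = min(gaps) if gaps else max_gap
        features.append(max(0.0, 1.0 - gap / max_gap))
    return features

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Train a linear SVM by subgradient descent on the hinge loss.

    Labels in y must be +1 (malicious) or -1 (benign).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularisation shrinks the weights on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                # Sample is inside the margin: move the boundary away from it.
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b

def predict(model, x):
    """Return +1 (malicious) or -1 (benign) for feature vector x."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy samples, invented for illustration only.
malicious_src = (
    "String id = tm.getDeviceId(); "
    "sms.sendTextMessage(num, null, id, null, null); // SEND_SMS"
)
benign_src = "int total = 0; for (int i = 0; i < n; i++) { total += i; }"

X = [keyword_distance_features(malicious_src), keyword_distance_features(benign_src)]
y = [1, -1]
model = train_linear_svm(X, y)
print(predict(model, X[0]), predict(model, X[1]))
```

Pairwise closeness is used here instead of plain keyword counts because the abstract stresses correlation between keywords. A real system would need a much larger keyword set, many labelled samples, and a held-out test set.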