Discriminating Unknown Software Using Distance Model

Crypto-ransomware is a class of malware that encrypts its victims’ data and returns the decryption key only in exchange for a ransom. In previous work, we designed a solution able to detect any ciphering of files using a statistical estimator. Once encryption is detected, a pop-up asks the user to confirm whether the operation is allowed or not. To improve our tool, automation is needed. In this paper, we present an anomaly detection mechanism that determines whether a suspected group of threads is authorized cryptographic software or malicious code. The effectiveness of our solution in correctly distinguishing between valid programs and ransomware is evaluated using a string analysis. The tf-idf metric is used to choose the most pertinent features. The distance between a candidate software and a vector representing the allowed cryptographic software is measured. If the distance exceeds a threshold, the suspected process is flagged as ransomware. We have evaluated our approach with samples provided by open databases and executed on our bare-metal platform.
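The pipeline summarized above (tf-idf feature selection over strings, a reference vector for allowed cryptographic software, and a distance threshold) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration rather than a detail taken from this work: the function names, the smoothed idf formula, the choice of cosine distance, the use of a centroid as the reference vector, and the threshold value passed by the caller are all hypothetical.

```python
import math
from collections import Counter


def fit_idf(docs):
    """Smoothed inverse document frequency per term (assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tf_idf(doc, idf, features):
    """tf-idf vector of one string list, restricted to the selected features."""
    counts = Counter(doc)
    total = len(doc) or 1  # avoid division by zero on an empty string list
    return [counts[t] / total * idf[t] for t in features]


def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def build_profile(allowed_docs, k):
    """Keep the k terms with the highest summed tf-idf over the allowed
    software, then use the centroid of the allowed vectors as reference."""
    idf = fit_idf(allowed_docs)
    score = Counter()
    for doc in allowed_docs:
        for term, count in Counter(doc).items():
            score[term] += count / len(doc) * idf[term]
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    features = [term for term, _ in ranked[:k]]
    vectors = [tf_idf(doc, idf, features) for doc in allowed_docs]
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
    return idf, features, centroid


def is_flagged(doc, profile, threshold):
    """Flag the candidate when its distance to the reference exceeds the threshold."""
    idf, features, centroid = profile
    return cosine_distance(tf_idf(doc, idf, features), centroid) > threshold
```

In this sketch, each program is represented by the list of strings extracted from it. A profile would be built once from the allowed cryptographic software with `build_profile`, and each suspected process would then be checked with `is_flagged`. A candidate sharing none of the selected strings gets a distance of 1.0 and is flagged for any threshold below that value.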