Malware is one of the most serious threats on the Internet. Although countermeasures have been developed, many users are still infected. Detecting and blocking communication from infected hosts on the network side would effectively mitigate the threat. Doing so requires information about the destinations and payloads of malware communication, which is usually obtained through dynamic analysis. Because some malware, such as bots and downloaders, requires Internet access, the dynamic analysis environment must be connected to the Internet. Recent malware communicates with remote hosts over HTTP not only for command and control (C&C) and malware downloading but also for attacks. For secure dynamic analysis in an Internet-connected environment, it is therefore necessary to determine whether a destination is used for C&C or malware downloading and to allow connections only to such servers. We propose an Internet-connected dynamic analysis system that controls HTTP communication by using a search engine. To control HTTP connections, we built a support vector machine classifier based on the assumption that sites used for purposes such as C&C or malware downloading are harder to find and have lower backlink counts than benign sites. Our classifier, trained on popular URLs and URLs obtained from malware analysis, achieves 99.69% cross-validation accuracy. We also evaluated the classifier on other known popular benign sites, and all of them were classified as benign. This evaluation confirms that the classifier can distinguish benign sites, which indicates that the proposed system is effective for safe analysis in an Internet-connected environment.
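The control idea can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: the two features (log-scaled search hit count and backlink count), the toy training values, and all function names are hypothetical, and a small hand-rolled linear SVM trained by subgradient descent on the hinge loss stands in for whatever SVM implementation and kernel the authors actually used. It only shows how a destination that is hard to find and has few backlinks could be treated as a likely C&C or download server and allowed, while a well-known site is blocked.

```python
import math

def features(search_hits, backlinks):
    """Log-scale the raw counts (hypothetical feature choice)."""
    return [math.log10(search_hits + 1), math.log10(backlinks + 1)]

def train_linear_svm(samples, labels, lam=0.01, eta=0.01, epochs=2000):
    """Soft-margin linear SVM via subgradient descent on the hinge loss.

    labels: +1 for benign sites, -1 for C&C / download sites.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization step: shrink the weights slightly.
            w = [wi * (1.0 - eta * lam) for wi in w]
            # Hinge-loss step: move toward samples inside the margin.
            if margin < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

# Made-up (search_hits, backlinks) pairs, for illustration only.
BENIGN = [(1_000_000, 50_000), (500_000, 10_000),
          (20_000_000, 300_000), (80_000, 2_000)]
SUSPICIOUS = [(0, 0), (3, 0), (12, 1), (0, 2)]

X = [features(h, k) for h, k in BENIGN + SUSPICIOUS]
Y = [1] * len(BENIGN) + [-1] * len(SUSPICIOUS)
W, B = train_linear_svm(X, Y)

def is_cc_or_download(search_hits, backlinks):
    """True when the destination falls on the non-benign side of the SVM."""
    x = features(search_hits, backlinks)
    return sum(wi * xi for wi, xi in zip(W, x)) + B < 0

def allow_connection(search_hits, backlinks):
    """Allow only destinations that look like C&C or download servers,
    so that traffic toward benign sites (possible attacks) is blocked."""
    return is_cc_or_download(search_hits, backlinks)
```

In this sketch, a call such as `allow_connection(1, 0)` represents an obscure destination and `allow_connection(10_000_000, 100_000)` a popular one. In a real system the counts would come from search engine queries, which are not modeled here.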