MATATABI: Multi-layer Threat Analysis Platform with Hadoop

Threat detection and analysis are indispensable in today's cyberspace, but state-of-the-art threat detection still covers only specific aspects of modern malicious activity because the information available for analysis is limited. Measuring and collecting various types of data, from traffic information to human behavior, at different vantage points over a long duration appears helpful for inspecting threats in depth, but this approach faces scalability issues: as the amount of collected data grows, the analysis requires more computational resources. In this paper, we report our experience operating a Hadoop-based platform, called MATATABI, for threat detection, and present micro-benchmarks with four different data-processing backends in typical use cases such as log data and packet trace analysis. The benchmarks demonstrate the performance advantages of distributed computation. Our extensive use cases of analysis modules showcase the potential benefit of deploying our threat analysis platform.
