Security data visualization

The objective of this paper is to provide guidelines on information security data visualization, together with a repeatable process and examples for visualizing (communicating) information security data. Security data visualization can be used in many areas of information security. Security metrics, security monitoring, anomaly detection, forensics, and malware analysis are examples where it can play a vital role and make us better security professionals. Security data visualization also plays a key role in emerging fields such as data science, machine learning, and exploratory data analytics. Because it has so many uses, the paper is divided into two parts to cover the key aspects.

The first part is communicating value. The well-known proverb “a picture is worth a thousand words” (Piqua Leader-Dispatch, One Look Is Worth A Thousand Words, 1913, p. 2) captures the idea. The problem with traditional metrics is that numbers and tables can be daunting, and details are easily missed. Visualizing the data lets the security team highlight its salient points and tell a story with it. Information security is becoming a common topic in boardroom discussions, so it is increasingly important to communicate the value of information security to business leaders.

The second part is finding anomalies using security data visualization. One of the key strengths of security teams is their access to enterprise log data, metadata, network traffic data, and NetFlow data. The challenge is isolating the bad actors from legitimate traffic. The human mind, by evolution, is adept at identifying visual patterns and anomalies. Security professionals can therefore benefit from visualizing enterprise data to identify patterns and find anomalies that help isolate events which might indicate compromise.
Hopefully some of the examples will help generate more ideas in this space and will be a valuable resource for all information security practitioners. Once security professionals understand how to use security data visualization, it will open a whole new world, and this knowledge of security data science may significantly improve information security tasks.

Security Data Visualization pingbalaji@gmail.com

1.0 Introduction

Security data visualization can be used in many areas of information security. Security metrics, security monitoring, anomaly detection, forensics, and malware analysis are examples where security data visualization can play a vital role and make us better security professionals. Until now, security professionals have been able to get by with Microsoft Excel and similar tools, without in-depth knowledge of security data visualization. However, security data visualization is becoming extremely important because of big data, machine learning, and exploratory data analytics. Given the volume of data involved in big data, it is practically impossible to find anomalies using traditional methods. The first thing to do after a statistical computation is to understand the data visually. Recent generations of SIEM log collection and correlation solutions use big data analytics, and security data visualization plays a vital part in analyzing that data. The field of data science is evolving at a rapid pace, and data visualization is an important component of it.

Botnet Visualization

Microsoft’s Digital Crimes Unit tapped The Office for Creative Research (OCR), a multidisciplinary digital design group based in New York, to come up with new ways of looking at one particular threat: botnets, the global networks of infected computers that cyber criminals enlist to do their bidding. OCR came up with a prototype tool called Specimen Box.
Specimen Box offers many views, including a live display of botnet activity “which can be used to analyze botnet data” ("#005: The Sight and Sound of CyberCrime", o-c-r.org, 2014, para. 3).

Reverse Engineering

Security data visualization is used more and more in reverse engineering. “In this engaging TED (TED is a platform for ideas worth spreading http://www.ted.com/) talk, Chris Domas shows how researchers use pattern recognition and reverse engineering (and pull a few allnighters) using visualization to understand a chunk of binary code whose purpose and contents they don't know.” (Domas, C. (n.d.). The 1s and 0s behind cyber warfare. Retrieved December 15, 2014, from http://www.ted.com/talks/chris_domas_the_1s_and_0s_behind_cyber_warfare, para. 1)

Currently, information security practitioners are just scratching the surface in this area. Further examples of security data visualization magic are captured in Appendix A, to inspire curiosity and awe in security practitioners and to encourage them to use the full potential of security data visualization in their day-to-day information security jobs. Hopefully some of the examples will help generate more ideas in this space, and security data visualization will become a valuable skill for all information security practitioners. Once security practitioners understand how to use security data visualization, it will open a whole new world, and this knowledge of security data science may significantly improve information security tasks.

2.0 Security Data Visualization Skills

Data science and security visualization require the skills described in the data science Venn diagram: the space where hacking skills, statistical knowledge, and domain knowledge meet. (Conway, D. (n.d.). The Data Science Venn Diagram.
Retrieved November 29, 2014, from http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram)

Substantive Expertise – This is the security domain knowledge that enables the security practitioner to understand the data, determine what is expected, and find anomalies or metrics from a visualization.

Hacking Skills – In data science terms, these are the skills required to work with massive amounts of data that must be acquired, cleaned, and sanitized.

Math & Statistics Knowledge – This knowledge is critical for choosing which tools to use and for understanding the spread and other characteristics of the data in order to derive insight from it.

Security practitioners will typically already be comfortable with domain knowledge and hacking skills. Statistics is the area that security practitioners most need to learn, both to gain insight from data and to ask the right questions that lead to the right security visualization. One resource for statistical knowledge is the free online course “Data to Insight: An Introduction to Data Analysis” ("Data to Insight: An Introduction to Data Analysis The University of Auckland FutureLearn", 2014). This course is a hands-on introduction to statistical data analysis that emphasizes fundamental concepts and practical skills. It also introduces the tool iNZight.

One aspect Google looks at when recruiting engineers is their knowledge of statistics and probability. The reason might be that it needs people who understand the basics of deriving value from data. Statistics and probability, combined with machine learning and artificial intelligence, are used to make data-driven predictions in many fields. The field of data science is evolving at a rapid pace, and data visualization is an important component of it. These techniques will soon be applied to the information security field for better identification of bad actors.
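As a small illustration of the statistics knowledge described above (understanding the spread of the data before drawing conclusions from it), the following Python sketch summarizes a series with its quartiles, flags values that fall outside Tukey's fences, and prints a crude text bar chart. The data set, the function name `tukey_outliers`, and the multiplier of 1.5 are assumptions chosen for illustration; they are not taken from this paper's own examples, and a real analysis would use a proper plotting tool.

```python
import statistics

# Hypothetical data: failed-login counts per hour for one host (illustrative only).
failed_logins = [3, 5, 4, 6, 2, 5, 4, 3, 48, 5, 4, 6]


def tukey_outliers(values, k=1.5):
    """Return (q1, median, q3, outliers) using Tukey's fences.

    A value is flagged when it lies more than k * IQR below the first
    quartile or above the third quartile.
    """
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    return q1, median, q3, outliers


q1, median, q3, outliers = tukey_outliers(failed_logins)
print(f"median={median}, IQR={q3 - q1}, outliers={outliers}")

# Crude text bar chart: one row per hour, with flagged hours marked.
for hour, count in enumerate(failed_logins):
    marker = "  <-- unusual" if count in outliers else ""
    print(f"{hour:02d} {'#' * count}{marker}")
```

Even in this tiny example the bar for the unusual hour stands out far more quickly than the same number does in a table, which is the motivation for pairing basic statistics with visualization.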
One of the most important advantages of data visualization is that the resources for learning its key concepts are publicly available. There are numerous free Coursera and edX courses on data visualization, and there is extensive material about the R project, with numerous examples from various experts. With enough dedicated time, security analysts can learn data visualization tools and R without much difficulty.

The key advantage for security analysts is that they have access to security data such as security metrics data, network traffic data, malware indicators of compromise, and much more. Security domain expertise is very important before starting data visualization: starting with the right question and applying domain expertise is what makes it possible to get good output from data visualization. By using data visualization techniques on security data, security analysts can gain valuable insights into metrics and anomaly detection. Hopefully these insights can make security practitioners' jobs easier.

3.0 Security Data Visualization Process

At a very high level, the security visualization process consists of the five steps below:

(Security Data Visualization process)

The key steps involved in visualization are:

Step 1 – Visualization Goals
Step 2 – Data Preparation phase
Step 3 – Exploration phase
Step 4 – Visualization phase
Step 5 – Feedback and fine-tune

Visualization Goals