Defining the Malice Space with Natural Language Processing Techniques

An important step toward cyber security is understanding the attack space, or malice space, of a system: the various sequences of actions that could be used to exploit that system. For this purpose, the cyber security community has developed techniques such as attack trees [1]. Even commodity devices can have large and complex malice spaces that are difficult to define. Formally representing large, complex spaces (e.g., the space of English sentences) is a central concern in linguistics and natural language processing. We show that techniques developed for natural language processing can be applied to cyber security, where they offer significant advantages over the techniques currently used to define the malice space.