Regex-Based Linkography Abstraction Refinement for Information Security

Linkographs have been used to model the behavioral patterns of creative professionals. More recently, they have been applied to cyber security to study the behavioral patterns of remote attackers of cyber systems. We propose a human-supervised algorithm that refines the abstractions used for linkographic analysis of common attack patterns. The refinement algorithm attempts to maximize the accuracy of computer-derived linkographs by optimally merging and splitting abstraction classes, which are represented as regular expressions (regexes). We first describe an algorithm that selects and performs a globally optimal merge of two abstraction classes. We then describe a counterpart algorithm that selects a single abstraction class and splits it into two. We cast a regex as a conjunction of disjunctions and refine it by adding and removing conjunctive and disjunctive elements. We also show how the Stoer-Wagner algorithm, normally used to find minimum cuts of graphs, can be used to partition a set of elements into two optimal subsets.
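To illustrate the last point, the following is a minimal sketch of how a minimum cut can split a set of elements into two subsets. It is not the paper's implementation: the function name `stoer_wagner`, the dict-of-dicts graph encoding, and the idea that an edge weight measures how strongly two elements belong together are all assumptions made for this example, as are the element names and weights in the usage lines.

```python
def stoer_wagner(weights):
    """Global minimum cut of an undirected weighted graph (Stoer-Wagner).

    `weights` is a symmetric dict of dicts {u: {v: w}} with no self-loops.
    Returns (cut_weight, side_a, side_b), where the two sides partition the nodes.
    The encoding is an assumption of this sketch, not taken from the paper.
    """
    # Work on a copy so that merging nodes does not mutate the caller's graph.
    graph = {u: dict(nbrs) for u, nbrs in weights.items()}
    # Each (super-)node remembers which original nodes were merged into it.
    members = {u: {u} for u in graph}
    all_nodes = set(graph)
    best_weight, best_side = float("inf"), set()

    while len(graph) > 1:
        # Minimum cut phase: grow a set from an arbitrary start node by
        # repeatedly adding the node most tightly connected to the set.
        start = next(iter(graph))
        added = [start]
        attach = {v: graph[start].get(v, 0) for v in graph if v != start}
        while attach:
            nxt = max(attach, key=attach.get)
            cut_of_phase = attach.pop(nxt)
            added.append(nxt)
            for v, w in graph[nxt].items():
                if v in attach:
                    attach[v] += w
        s, t = added[-2], added[-1]

        # The cut that isolates the last-added node t is the cut of this phase.
        if cut_of_phase < best_weight:
            best_weight, best_side = cut_of_phase, set(members[t])

        # Merge t into s, summing the weights of parallel edges.
        for v, w in graph[t].items():
            if v != s:
                graph[s][v] = graph[s].get(v, 0) + w
                graph[v][s] = graph[s][v]
            del graph[v][t]
        del graph[t]
        members[s] |= members.pop(t)

    return best_weight, best_side, all_nodes - best_side


# Hypothetical elements of one abstraction class with made-up affinity weights.
affinity = {
    "ls":   {"dir": 5},
    "dir":  {"ls": 5, "wget": 1},
    "wget": {"dir": 1, "curl": 4},
    "curl": {"wget": 4},
}
cut_weight, side_a, side_b = stoer_wagner(affinity)
```

In this toy graph the cheapest cut removes only the weak `dir`/`wget` edge, so the elements fall into the groups `{ls, dir}` and `{wget, curl}`. How the weights should be derived for real abstraction classes is left open here.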
