Eliminating Input-Based Attacks by Deriving Automated Encoders and Decoders from Context-Free Grammars

Software systems today communicate via a number of complex languages, which is often the cause of security vulnerabilities such as arbitrary code execution or injections. Injections such as cross-site scripting are widely known from textual languages like HTML and JSON, which are steadily gaining popularity. These systems use parsers to read input and unparsers to write output, and it is at these points that such vulnerabilities arise. Correct parsing and unparsing of messages is therefore of the utmost importance when developing secure and reliable systems. Part of the challenge developers face is to correctly encode data during unparsing and decode it during parsing. This paper presents McHammerCoder, an (un)parser and encoding generator supporting textual and binary languages. The generated (un)parsers automatically apply the generated encoding, which is derived from the language's grammar. Manually defining and applying an encoding is therefore not required to effectively prevent injections when using McHammerCoder. By specifying the communication language within a grammar, developers obtain correct input and output handling code for their custom language from McHammerCoder.
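The idea of deriving an encoding from a grammar can be illustrated with a minimal sketch. It is not McHammerCoder's implementation or API: the toy grammar, the percent-style escape scheme, and all names (GRAMMAR_KEYWORDS, derive_encoding, unparse, parse) are assumptions chosen for illustration. The sketch collects the characters that carry structural meaning in the grammar, escapes them whenever data is unparsed, and reverses the escaping during parsing.

```python
# Hypothetical toy grammar (an assumption for this sketch, not from the paper):
#   record := pair (";" pair)*
#   pair   := TEXT "=" TEXT
GRAMMAR_KEYWORDS = [";", "="]  # terminals that carry structural meaning
ESCAPE = "%"                   # assumed escape character


def derive_encoding(keywords, escape=ESCAPE):
    """Map every structural character (and the escape itself) to an escape sequence."""
    reserved = {ch for kw in keywords for ch in kw} | {escape}
    return {ch: f"{escape}{ord(ch):02X}" for ch in reserved}


ENCODING = derive_encoding(GRAMMAR_KEYWORDS)


def encode(text):
    """Replace reserved characters so data can never be read as structure."""
    return "".join(ENCODING.get(ch, ch) for ch in text)


def decode(text):
    """Reverse encode(); raises ValueError on a malformed escape sequence."""
    out, i = [], 0
    while i < len(text):
        if text[i] == ESCAPE:
            out.append(chr(int(text[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def unparse(pairs):
    """Write key/value pairs as a message, encoding every data field."""
    return ";".join(f"{encode(k)}={encode(v)}" for k, v in pairs)


def parse(message):
    """Read a message back into key/value pairs, decoding every data field."""
    if not message:
        return []
    pairs = []
    for chunk in message.split(";"):
        key, value = chunk.split("=")  # fails loudly if the structure is invalid
        pairs.append((decode(key), decode(value)))
    return pairs
```

With this sketch, unparse([("user", "alice;admin=true"), ("note", "100%")]) yields user=alice%3Badmin%3Dtrue;note=100%25, and parse returns the original pairs. The attempted injection of a second admin=true pair stays inside the value of user because its delimiters were escaped before they reached the message.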
