Binary Analysis for Autonomous Hacking: Invited Abstract

Despite the rise of interpreted languages and the World Wide Web, binary analysis has remained the focus of much research in computer security. There are several reasons for this. First, interpreted languages are either interpreted by binary programs or Just-In-Time compiled down to binary code. Second, "core" OSconstructs and performance-critical applications are still writtenin languages (usually, C or C++) that compile down to binary code. Third, the rise of the Internet of Things is powered by devices that are, in general, very resource-constrained. Without cycles to waste on interpretation or Just-In-Time compilation, the firmware of these devices tends to be written in languages (again, usually C) that compile to binary. Unfortunately, many of these languages provide few security guarantees, often leading to vulnerabilities. For example, buffer overflows stubbornly remain as one of the most common discovered software flaws despite efforts to develop technologies to mitigate such vulnerabilities. Worse, the wider class of memory corruption vulnerabilities", the vast majority of which also stem from the use of unsafe languages, make up a substantial portion of the most common vulnerabilities. This problem is not limited to software on general-purpose computing devices: remotely exploitable vulnerabilities have been discovered in devices ranging from smart locks, to pacemakers, to automobiles. However, finding vulnerabilities in binaries and generating patches that fix exploitable flaws is challenging because of the lack of high-level abstractions, such as type information and control ow constructs. Current approaches provide tools to support the manual analysis of binaries, but are far from being completely automated solutions to the vulnerability analysis of binary programs. To foster research in automated binary analysis, in October of 2013, DARPA announced the DARPA Cyber Grand Challenge (CGC). Like DARPA Grand Challenges in other fields (such as robotics and autonomous vehicles), the CGC pits teams from around the world against each other in a competition in which the participants are autonomous systems. During the CGC competition, these systems must identify, exploit, and patch vulnerabilities in binary programs, without any human in the loop. Millions of dollars in prize money were announced: the top 7 teams to complete the CGC Qualifying Event (held in June, 2015) received 750,000 USD, and the top 3 teams in the CGC Final Event (held in August, 2016) will receive 2,000,000 USD, 1,000,000 USD, and 750,000 USD, respectively. The Shellphish hacking team is one of the qualified teams. This talk presents some insights into the field of automated binary analysis exploitation and patching, gained through the participation in the CGC competition. In addition, the talk provides a discussion of the use of competitions to foster both research and education, based on the experience in designing and running a large-scale live security hacking competition (called the iCTF) for the past 13 years.