FADE: A programmable filtering accelerator for instruction-grain monitoring

Instruction-grain monitoring is a powerful approach that enables a wide spectrum of bug-finding tools. Because existing software approaches incur prohibitive runtime overhead, researchers have focused on hardware support for instruction-grain monitoring. A recurring theme in recent work is the use of hardware-assisted filtering to elide costly software analysis. This work generalizes and extends prior point solutions into a programmable filtering accelerator that offers both flexibility and at-speed event filtering. The pipelined microarchitecture of the accelerator affords a peak filtering rate of one application event per cycle, which suffices to keep up with an aggressive out-of-order (OoO) core running the monitored application. A unique feature of the proposed design is its ability to dynamically resolve dependencies between unfilterable events and subsequent events, eliminating data-dependent stalls and maximizing the accelerator's performance. Our evaluation results show a monitoring slowdown of just 1.2-1.8x across a diverse set of monitoring tools.
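To make the idea of event filtering concrete, the following is a minimal software sketch and not the accelerator's actual design. It assumes a hypothetical memory-checking tool in which loads and stores to currently allocated addresses need no software analysis, while allocation events and invalid accesses must be forwarded to the software monitor. The names `Event` and `filter_events`, and the event kinds, are illustrative assumptions that do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str  # hypothetical event kinds: "load", "store", "malloc", "free"
    addr: int


def filter_events(events):
    """Return (number of filtered events, list of events sent to software)."""
    allocated = set()   # per-address metadata: is this address allocated?
    to_software = []    # events the filter cannot elide
    filtered = 0
    for ev in events:
        if ev.kind in ("load", "store") and ev.addr in allocated:
            # Common case: a valid access needs no software analysis.
            filtered += 1
            continue
        # Unfilterable event: allocation events update the metadata that
        # later events are checked against; invalid accesses are reported.
        if ev.kind == "malloc":
            allocated.add(ev.addr)
        elif ev.kind == "free":
            allocated.discard(ev.addr)
        to_software.append(ev)
    return filtered, to_software


trace = [
    Event("malloc", 0x10),
    Event("load", 0x10),
    Event("store", 0x10),
    Event("free", 0x10),
    Event("load", 0x10),  # access after free: forwarded to software
]
filtered, to_software = filter_events(trace)
```

In this sketch, whether a load is filterable depends on metadata written by an earlier unfilterable event (the `malloc` or `free`). This is the kind of dependence between unfilterable and subsequent events that the abstract refers to, although the sketch resolves it sequentially rather than in a pipeline.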
