High-performance parallel accelerator for flexible and efficient run-time monitoring

This paper proposes Harmoni, a high-performance hardware accelerator architecture that supports a broad range of run-time monitoring and bookkeeping functions. Unlike custom hardware, which offers very little configurability once fabricated, Harmoni is highly configurable: a wide range of hardware monitoring and bookkeeping functions can be added dynamically to a processing core after the chip has been fabricated. Harmoni achieves much higher efficiency than software implementations and previously proposed monitoring platforms because its design closely matches the common characteristics of run-time monitoring functions that are based on the notion of tagging. We implemented an RTL prototype of Harmoni and evaluated it with several example monitoring functions for security and programmability. The prototype demonstrates that the architecture can support monitoring functions with widely different characteristics. Harmoni occupies moderate silicon area, sustains very high throughput, and incurs low overheads on monitored programs.
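To make the notion of tagging concrete, the following is a minimal software sketch of one tag-based monitoring function, a one-bit taint tracker. It is an illustration only and not Harmoni's design: the class `TagMonitor`, its event-handler names, and the single-bit tag policy are all assumptions introduced here. The general pattern it shows is that each register and memory word carries a tag, each monitored instruction updates tags from its operands' tags, and certain instructions trigger a check.

```python
from dataclasses import dataclass, field


@dataclass
class TagMonitor:
    """Hypothetical one-bit taint tracker (illustrative, not Harmoni's design)."""

    reg_tags: dict = field(default_factory=dict)  # register name -> tag bit
    mem_tags: dict = field(default_factory=dict)  # memory address -> tag bit

    def on_input(self, addr):
        # Data arriving from an untrusted source is tagged as tainted.
        self.mem_tags[addr] = 1

    def on_load(self, rd, addr):
        # A load copies the tag of the memory word into the destination register.
        self.reg_tags[rd] = self.mem_tags.get(addr, 0)

    def on_alu(self, rd, rs1, rs2):
        # An ALU result is tainted if either source operand is tainted.
        self.reg_tags[rd] = self.reg_tags.get(rs1, 0) | self.reg_tags.get(rs2, 0)

    def on_store(self, rs, addr):
        # A store copies the tag of the source register to the memory word.
        self.mem_tags[addr] = self.reg_tags.get(rs, 0)

    def on_indirect_jump(self, rs):
        # Check: return True only if the jump target register is untainted.
        return self.reg_tags.get(rs, 0) == 0


# Usage: tainted input flows through a load and an add into a jump target.
mon = TagMonitor()
mon.on_input(0x1000)
mon.on_load("r1", 0x1000)
mon.on_alu("r2", "r1", "r3")
jump_via_r2_allowed = mon.on_indirect_jump("r2")
jump_via_r3_allowed = mon.on_indirect_jump("r3")
```

In this sketch, the jump through `r2` is rejected because its value derives from tainted input, while the jump through the untouched `r3` is allowed. Other monitoring functions would follow the same read-tags, update-tags, check structure with different tag widths and update rules.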
