FlexSig: Implementing flexible hardware signatures

With the advent of chip multiprocessors, new programming paradigms and execution techniques have been developed to make parallel programming easier, more efficient, and more reliable. These improvements usually require hardware support to avoid slowing down the system. Signatures based on Bloom filters are widely used as such hardware support in chip multiprocessors: they appear in Transactional Memory, thread-level speculation, parallel debugging, deterministic replay, and other tools and applications. The main limitation of hardware signatures is their lack of flexibility: signatures designed with a fixed configuration, tailored to the requirements of a specific tool or application, are likely to fit other requirements poorly. This paper proposes a new hardware signature organization called Flexible Signatures (FlexSig). FlexSig can dynamically change both the resources assigned to a given signature and the number of signatures in the system by redistributing the available hardware resources according to system requirements. This provides more flexibility than traditional fixed-resource signatures based on Bloom filters while maintaining a low false positive rate. We evaluate FlexSig against signatures based on parallel Bloom filters and conclude that it achieves a lower false positive rate than conventional parallel Bloom filters in most cases, owing to its ability to use all the signature resources available.
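
For readers unfamiliar with the baseline, the following is a minimal software sketch of a conventional parallel Bloom filter signature, the fixed-resource organization that FlexSig is compared against. It is not the FlexSig design and not the authors' implementation. The class name, the default sizes, and the use of a software hash (BLAKE2b) in place of the simple hardware hash functions typically used in signatures are all assumptions made for illustration. The helper at the end evaluates the standard analytical false positive estimate for a filter split into equal banks.

```python
import hashlib


class ParallelBloomSignature:
    """Illustrative parallel Bloom filter: one bit array (bank) per hash function.

    Hypothetical sketch; sizes and hash choice are assumptions, not from the paper.
    """

    def __init__(self, total_bits=1024, num_banks=4):
        if total_bits % num_banks != 0:
            raise ValueError("total_bits must be divisible by num_banks")
        self.num_banks = num_banks
        self.bank_bits = total_bits // num_banks
        # Each bank is stored as a Python int used as a bit mask.
        self.banks = [0] * num_banks

    def _index(self, bank, address):
        # Software stand-in for a per-bank hardware hash function.
        data = f"{bank}:{address}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.bank_bits

    def insert(self, address):
        # Set exactly one bit in every bank.
        for bank in range(self.num_banks):
            self.banks[bank] |= 1 << self._index(bank, address)

    def may_contain(self, address):
        # True if the address was inserted, or on a false positive.
        return all(
            (self.banks[bank] >> self._index(bank, address)) & 1
            for bank in range(self.num_banks)
        )

    def clear(self):
        self.banks = [0] * self.num_banks


def expected_false_positive_rate(total_bits, num_banks, inserted):
    """Analytical estimate: (1 - (1 - 1/b)^n)^k with b bits per bank."""
    bank_bits = total_bits // num_banks
    p_bit_set = 1.0 - (1.0 - 1.0 / bank_bits) ** inserted
    return p_bit_set ** num_banks


if __name__ == "__main__":
    sig = ParallelBloomSignature(total_bits=1024, num_banks=4)
    for addr in range(0, 64 * 50, 64):  # 50 cache-line-aligned addresses
        sig.insert(addr)
    print(sig.may_contain(0))
    print(expected_false_positive_rate(1024, 4, 50))
```

Because the bank count and bank size are fixed at construction, a signature sized for one workload cannot lend unused bits to another. That is the kind of rigidity the abstract attributes to fixed-resource signatures.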
