Parallelize the Runtime Checks -- Not the Application

Sequential and parallel applications are both prone to security and dependability bugs. Compilers can reduce the impact of these bugs by instrumenting the generated code with runtime checks. These checks can severely degrade an application's performance. For instance, our measurements show that compiler-generated array-bounds checks can increase the application’s runtime by 20x. To make compiler-generated checks practical, the slowdowns they introduce must be reduced. Single-thread performance is expected to grow only slowly. To compensate for these single-thread slowdowns, it is therefore desirable to parallelize checked applications, their runtime checks, or both. In this paper, we provide evidence that parallelizing the runtime checks is more promising than parallelizing the application. To this end, we compare two frameworks that take different parallelization approaches: (1) parallelizing the application together with the runtime checks, and (2) parallelizing only the runtime checks. We focus on two frameworks that were developed in our group.

Tanger. The first implementation is the compiler extension Tanger [2] for software transactional memory (STM). Tanger places every memory access inside a transaction under the control of the TinySTM++ library [3]. To apply this approach, an application must be parallelizable with transactional memory.

ParExC. ParExC (Parallel Execution Checking) parallelizes the runtime checks but not the application itself. It is therefore also suited to applications that are difficult to parallelize. ParExC executes the application without any runtime checks in the so-called predictor process. The predictor’s execution is partitioned into epochs. Each epoch is replayed by an executor process, this time with the runtime checks enabled. ParExC scales by running the executors in parallel with each other and with the predictor.
Like Speck [5], ParExC lets the application execute quickly by speculating that the runtime checks succeed. Speck parallelizes checks using dynamic binary instrumentation. Unlike Speck, ParExC parallelizes compiler-generated checks at compile time.

Roadmap. In Sections 2 and 3, we introduce the two approaches in more detail. Our experiments in Section 4 show that, for a checker with heavy runtime overhead, the specialized ParExC approach scales better than applications instrumented by Tanger when at most 8 cores are available.
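
The predictor/executor scheme described above can be illustrated with a minimal sketch. This is not ParExC's implementation: the toy heap, the `LAYOUT` table, and the functions `run_ops`, `replay_epoch`, and `check_in_parallel` are hypothetical names introduced only for illustration, and threads stand in for the predictor and executor processes. The sketch assumes the application is a deterministic list of array stores, so that replaying an epoch from a snapshot of its start state reproduces the unchecked run.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy model (assumption): two arrays laid out back to back in one flat heap,
# so an out-of-bounds store silently hits a neighbour, as in unchecked C code.
LAYOUT = {"a": (0, 4), "b": (4, 4)}  # array name -> (base offset, length)
HEAP_SIZE = 8


def run_ops(heap, ops, checked):
    """Apply (array, index, value) stores to heap; validate indices if checked."""
    for name, index, value in ops:
        base, length = LAYOUT[name]
        if checked and not 0 <= index < length:
            raise IndexError(f"{name}[{index}] out of bounds (length {length})")
        # Wrap the address so the toy heap access itself is always valid.
        heap[(base + index) % len(heap)] = value


def replay_epoch(chunk, snapshot):
    """Executor: replay one epoch from its start state, with checks enabled."""
    try:
        run_ops(snapshot, chunk, checked=True)
        return None
    except IndexError as err:
        return str(err)


def check_in_parallel(ops, epoch_len=2, workers=4):
    """Predictor: run unchecked, handing each epoch to an executor for checking."""
    heap = [0] * HEAP_SIZE
    futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(ops), epoch_len):
            chunk = ops[start:start + epoch_len]
            # Snapshot the state at the epoch boundary, start the checked
            # replay, then keep going without waiting for its result.
            futures.append(pool.submit(replay_epoch, chunk, list(heap)))
            run_ops(heap, chunk, checked=False)
    # Report the first epoch whose checked replay failed, if any.
    for epoch_index, future in enumerate(futures):
        error = future.result()
        if error is not None:
            return epoch_index, error
    return None
```

Calling `check_in_parallel` with a list of in-bounds stores returns `None`; if some store is out of bounds, it returns the index of the first failing epoch together with the check's message. The predictor never waits for a check, which mirrors the speculation on check success described above. A real system would additionally have to discard or roll back the work the predictor did after a failed epoch.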

[1]  Kunle Olukotun, et al. STAMP: Stanford Transactional Applications for Multi-Processing, 2008, IEEE International Symposium on Workload Characterization.

[2]  Jason Flinn, et al. Parallelizing security checks on commodity hardware, 2008, ASPLOS.

[3]  Vikram S. Adve, et al. LLVM: a compilation framework for lifelong program analysis & transformation, 2004, International Symposium on Code Generation and Optimization (CGO).

[4]  Srikanth Kandula, et al. Flashback: A Lightweight Extension for Rollback and Deterministic Replay for Software Debugging, 2004, USENIX Annual Technical Conference, General Track.

[5]  Torvald Riegel, et al. Dynamic performance tuning of word-based software transactional memory, 2008, PPoPP.

[6]  Torvald Riegel, et al. Transactifying Applications Using an Open Compiler Framework, 2007.