Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation

Finite-State Machine (FSM) applications are important in many domains. But FSM computation is inherently sequential, making such applications notoriously difficult to parallelize. Most prior methods address the problem by speculating on simple heuristics, which offers limited applicability and inconsistent speedups. This paper provides some principled understanding of FSM parallelization and offers the first disciplined way to exploit application-specific information to inform speculation for parallelization. Through a series of rigorous analyses, it presents a probabilistic model that captures the relations between speculative executions and the properties of the target FSM and its inputs. Based on this formulation, it proposes two model-based speculation schemes that automatically customize themselves with suitable configurations to maximize the parallelization benefits. This rigorous treatment yields near-linear speedup on applications that state-of-the-art techniques can barely accelerate.
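To make the idea of speculative FSM parallelization concrete, the following is a minimal sketch of the general technique, not the model-based schemes proposed in the paper. It assumes a simple lookback heuristic of the kind the abstract attributes to prior work: the start state of each chunk is guessed by running the FSM over the last few symbols of the preceding chunk. The names `run_fsm`, `speculative_run`, `DIV3`, `n_chunks`, and `lookback` are illustrative and do not come from the source.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical example FSM: state = (binary number read so far) mod 3.
# DIV3[state][bit] gives the next state.
DIV3 = [[0, 1], [2, 0], [1, 2]]


def run_fsm(table, state, symbols):
    """Run the FSM sequentially from `state` and return the final state."""
    for s in symbols:
        state = table[state][s]
    return state


def speculative_run(table, start, symbols, n_chunks=4, lookback=2):
    """Process `symbols` in chunks with guessed start states.

    Returns (final_state, number_of_chunks_re_executed).
    """
    size = max(1, -(-len(symbols) // n_chunks))  # ceiling division
    chunks = [symbols[i:i + size] for i in range(0, len(symbols), size)]
    if not chunks:
        return start, 0

    # Speculation (assumed heuristic): guess each chunk's start state by
    # running the FSM from `start` over the tail of the previous chunk.
    guesses = [start]
    for i in range(1, len(chunks)):
        guesses.append(run_fsm(table, start, chunks[i - 1][-lookback:]))

    # Speculative execution: every chunk runs independently of the others.
    # Threads only illustrate the structure here; CPython threads do not
    # give a real speedup for this CPU-bound loop.
    with ThreadPoolExecutor() as pool:
        ends = list(pool.map(lambda gc: run_fsm(table, gc[0], gc[1]),
                             zip(guesses, chunks)))

    # Validation: walk the chunks in order; a chunk whose guessed start
    # state was wrong is re-executed from the true state.
    state, reruns = start, 0
    for guess, chunk, end in zip(guesses, chunks, ends):
        if guess == state:
            state = end
        else:
            state = run_fsm(table, state, chunk)
            reruns += 1
    return state, reruns


bits = [int(b) for b in bin(1234567)[2:]]
final_state, reruns = speculative_run(DIV3, 0, bits)
print(final_state, reruns)
```

Every mis-speculated chunk falls back to sequential re-execution, so the benefit depends on how often the guesses are right. That accuracy in turn depends on the FSM and its inputs, which is the relationship the paper's probabilistic model is meant to capture.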
