Skewed pipelining for parallel Simulink simulations

Modern automotive and aerospace embedded applications require very high-performance simulations that can produce new values every microsecond. Such simulations must now rely on the scalable performance of multi-core systems rather than on faster clock frequencies, so novel parallelization techniques are needed to meet the industrial simulation demands that are essential to the development of safety-critical systems. The Simulink formalism is the de facto industrial standard, but current state-of-the-art simulation and code generation techniques fail to fully exploit the parallelism of modern multi-core systems. In particular, closed-loop and dynamic system simulations are very difficult to parallelize because of their loop-carried dependencies. In this paper we introduce a novel skewed pipelining technique that overcomes these difficulties and allows Simulink applications with loop-carried dependencies to be executed concurrently on multi-core systems. By delaying the forwarding of values for a few iterations, we can break some data dependencies and coarsen the granularity of programs. This improves concurrency and reduces the high cost of inter-processor communication. Implementation studies on a commodity multi-core system demonstrate the viability of our method: with 2, 3, and 4 processors it achieves speedups of 1.72×, 2.38×, and 3.33×, respectively, over uniprocessor execution.
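The idea of delaying forwarded values can be illustrated with a small sketch. It is not the paper's implementation: the two-block loop (`controller`, `plant`), their coefficients, and the choice of a feedback delay equal to twice the batch size are all assumptions made for illustration. The sketch runs sequentially and only shows that, once the feedback value is delayed, the work in each pipeline slot touches disjoint data and could therefore be placed on separate cores.

```python
def controller(y):
    # Hypothetical proportional controller driving y toward 1.0.
    return 0.5 * (1.0 - y)


def plant(y_prev, u):
    # Hypothetical first-order plant model.
    return 0.9 * y_prev + 0.1 * u


def simulate_delayed(steps, delay):
    """Reference loop: the controller sees y from `delay` iterations ago."""
    y = [0.0] * steps
    u = [0.0] * steps
    for k in range(steps):
        y_seen = y[k - delay] if k >= delay else 0.0
        u[k] = controller(y_seen)
        y_prev = y[k - 1] if k >= 1 else 0.0
        y[k] = plant(y_prev, u[k])
    return y


def simulate_skewed(steps, batch):
    """Same computation with delay = 2 * batch, organised as pipeline slots.

    In slot s, controller batch s reads only y values written by plant
    batch s - 2, while plant batch s - 1 reads only u values written by
    controller batch s - 1. The two batches in a slot are independent.
    """
    delay = 2 * batch  # assumed skew; large enough to decouple the stages
    y = [0.0] * steps
    u = [0.0] * steps
    n_batches = (steps + batch - 1) // batch

    def run_controller(j):
        for k in range(j * batch, min((j + 1) * batch, steps)):
            y_seen = y[k - delay] if k >= delay else 0.0
            u[k] = controller(y_seen)

    def run_plant(j):
        for k in range(j * batch, min((j + 1) * batch, steps)):
            y_prev = y[k - 1] if k >= 1 else 0.0
            y[k] = plant(y_prev, u[k])

    for slot in range(n_batches + 1):
        # These two calls could run concurrently on two cores; they are
        # executed one after the other here to keep the sketch simple.
        if slot < n_batches:
            run_controller(slot)
        if slot >= 1:
            run_plant(slot - 1)
    return y


if __name__ == "__main__":
    skewed = simulate_skewed(steps=20, batch=3)
    reference = simulate_delayed(steps=20, delay=6)
    print(skewed == reference)
```

Under these assumptions, each stage exchanges data once per batch rather than once per iteration, which is the granularity coarsening the abstract describes. The price is that the simulated loop now contains a feedback delay, so the result differs from the undelayed loop (`simulate_delayed(steps, 1)`).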
