Buffer minimization in pipelined SDF scheduling on multi-core platforms

With the increasing number of cores available on modern processors, it is imperative to solve the problem of mapping and scheduling a synchronous dataflow (SDF) graph onto a multi-core platform. Such a solution should not only meet the performance constraint but also minimize resource usage. In this paper, we consider the pipeline scheduling problem for an acyclic SDF graph on a given number of cores, with the goal of minimizing the total buffer size while meeting a throughput constraint. We propose a two-level heuristic algorithm for this problem. The inner level finds the optimal buffer size for a given topological order of the input task graph; the outer level explores the space of topological orders by perturbing the current order to reduce the buffer size. We compare the proposed algorithm with an enumeration algorithm, which generates optimal solutions for small graphs, and with a greedy algorithm, which scales to large graphs. The experimental results show that the two-level heuristic achieves near-optimal solutions: relative to the enumeration algorithm, buffer size increases by only 0.8% on average at a much shorter runtime, and relative to the greedy algorithm, buffer usage is 38.8% lower on average.
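The two-level structure described above can be illustrated with a minimal Python sketch. The abstract does not specify the inner-level formulation or the perturbation rule, so everything below is an assumption made for illustration: `inner_cost` uses a stand-in cost model (split the order into at most `cores` contiguous pipeline stages, each with load at most `period`, and charge every edge its token count times the number of stage boundaries it crosses), and `perturb` swaps one adjacent pair of unconnected tasks so the order stays topological. The names `inner_cost`, `perturb`, `two_level`, `load`, and `period` are hypothetical and not taken from the paper.

```python
import random

INF = float("inf")

def inner_cost(order, edges, load, cores, period):
    """Stand-in inner level (assumed cost model, not the paper's):
    minimum total buffer cost for a fixed topological order."""
    n = len(order)
    pos = {v: i for i, v in enumerate(order)}
    # cut[i] = tokens on edges crossing the boundary before position i
    cut = [0] * (n + 1)
    for (u, v), tokens in edges.items():
        for i in range(pos[u] + 1, pos[v] + 1):
            cut[i] += tokens
    # best[j][k] = min cost to place the first j tasks into exactly k stages
    best = [[INF] * (cores + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for k in range(1, cores + 1):
        for j in range(1, n + 1):
            stage_load = 0
            for i in range(j - 1, -1, -1):  # last stage covers positions i..j-1
                stage_load += load[order[i]]
                if stage_load > period:     # throughput constraint violated
                    break
                if best[i][k - 1] < INF:
                    best[j][k] = min(best[j][k], best[i][k - 1] + cut[i])
    return min(best[n][1:])

def perturb(order, edges, rng):
    """Swap one random adjacent pair with no edge between them,
    which keeps the result a valid topological order."""
    order = list(order)
    swappable = [i for i in range(len(order) - 1)
                 if (order[i], order[i + 1]) not in edges]
    if swappable:
        i = rng.choice(swappable)
        order[i], order[i + 1] = order[i + 1], order[i]
    return order

def two_level(order, edges, load, cores, period, iters=200, seed=0):
    """Outer level: keep a perturbed order whenever it is no worse."""
    rng = random.Random(seed)
    best = list(order)
    best_cost = inner_cost(best, edges, load, cores, period)
    for _ in range(iters):
        cand = perturb(best, edges, rng)
        cost = inner_cost(cand, edges, load, cores, period)
        if cost <= best_cost:
            best, best_cost = cand, cost
    return best, best_cost

# Toy graph: edge (u, v) -> tokens buffered per stage boundary crossed
edges = {("a", "b"): 1, ("a", "c"): 10, ("b", "d"): 1, ("c", "d"): 1}
load = {"a": 1, "b": 1, "c": 1, "d": 1}
print(two_level(["a", "b", "c", "d"], edges, load, cores=2, period=2))
```

In this toy instance the initial order forces the heavy edge from `a` to `c` across the stage boundary; moving `c` next to `a` keeps that edge inside one stage. A real implementation would replace `inner_cost` with the paper's inner-level optimization and could use a richer acceptance rule in the outer loop.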
