On the Design of Minimal-Cost Pipeline Systems Satisfying Hard/Soft Real-Time Constraints

Pipeline systems provide high throughput for applications by overlapping the execution of tasks. In heterogeneous architectures, two basic issues arise in the design of application-specific pipelines: which type of functional unit should execute each task, and where to place buffers. As applications grow more complex, pipeline design faces a range of problems. One of the most challenging is uncertainty in execution times, which makes deterministic techniques inapplicable. In this paper, execution times are modeled as random variables. Given an application, our objective is to construct the optimal pipeline, that is, one whose total cost is minimized while the required timing constraints are satisfied with a given guaranteed probability. We first prove that the problem is NP-hard. We then present Mixed Integer Linear Programming (MILP) formulations that obtain the optimal solution. Because of the high time complexity of MILP, we also devise an efficient $(1+\varepsilon)$-approximation algorithm, where the value of $\varepsilon$ is less than 5 percent in practice. Experimental results show that our algorithms achieve significant cost reductions over existing techniques, with average reductions of up to 31.93 percent.
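To make the problem statement concrete, the following is a minimal brute-force sketch of the functional-unit assignment part only. It is not the MILP formulation or the approximation algorithm of the paper. It assumes a linear chain of tasks with a buffer between every pair of tasks (so each task is its own stage), independent execution times given as small discrete distributions, and a timing constraint of the form "every stage finishes within the period with probability at least theta". The task data, `prob_within`, and `cheapest_assignment` are all hypothetical names and values chosen for illustration.

```python
from itertools import product
from math import prod

# Hypothetical data: for each task, a list of candidate functional-unit
# types, each given as (cost, {execution_time: probability}).
TASKS = [
    [(10, {2: 0.9, 4: 0.1}), (4, {3: 0.6, 5: 0.4})],
    [(8, {1: 0.8, 3: 0.2}), (3, {2: 0.5, 4: 0.5})],
    [(6, {2: 1.0}), (2, {3: 0.7, 6: 0.3})],
]


def prob_within(dist, period):
    """Probability that one stage finishes within the period."""
    return sum(p for t, p in dist.items() if t <= period)


def cheapest_assignment(tasks, period, theta):
    """Enumerate every assignment and keep the cheapest feasible one.

    Returns (cost, choice, probability), or None if no assignment meets
    the constraint. Assumes independent stage times, so the probability
    that all stages meet the period is the product of per-stage
    probabilities. Exponential in the number of tasks: illustration only.
    """
    best = None
    for choice in product(*(range(len(opts)) for opts in tasks)):
        cost = sum(tasks[i][c][0] for i, c in enumerate(choice))
        prob = prod(prob_within(tasks[i][c][1], period)
                    for i, c in enumerate(choice))
        if prob >= theta and (best is None or cost < best[0]):
            best = (cost, choice, prob)
    return best


if __name__ == "__main__":
    # Soft constraint: meet a period of 3 with probability >= 0.4.
    print(cheapest_assignment(TASKS, period=3, theta=0.4))
    # Tighter guarantee: a more expensive assignment is required.
    print(cheapest_assignment(TASKS, period=3, theta=0.85))
```

Raising `theta` toward 1 moves the constraint from soft toward hard real-time and forces more expensive functional units, which is the cost/guarantee trade-off the abstract describes. The exhaustive enumeration is only workable for toy inputs, consistent with the NP-hardness result stated above.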
