Communication and migration energy aware task mapping for reliable multiprocessor systems

Heterogeneous multiprocessor systems-on-chip (MPSoCs) are emerging as a promising solution in deep sub-micron technology nodes to satisfy design performance and power requirements. However, shrinking transistor geometry and aggressive voltage scaling are negatively impacting the dependability of these MPSoCs by increasing the chances of failures. This paper proposes an offline (design-time) task remapping technique to minimize the communication energy and task migration overhead of an application mapped on a heterogeneous multiprocessor system for all processor fault-scenarios. The proposed technique involves two steps-(1) Communication Energy driven Design Space Exploration (CDSE) to select an initial mapping and (2) Communication energy and Migration overhead aware Task Mapping (CMTM) for different fault-scenarios. The CDSE is formulated as a Mixed Integer Quadratic Programming (MIQP) problem and solved using an energy-gradient based heuristic. The CMTM problem is solved using a fast heuristic with the starting mapping selected using CDSE step. The proposed two steps technique is compared with state-of-the-art approaches through rigorous simulations with synthetic and real application graphs. Results demonstrate that the proposed CDSE reduces design space exploration time by 99% with a maximum variation of 5% from the optimal solution obtained by solving the MIQP problem directly. Further, the proposed CMTM reduces communication energy by an average 35% and migration overhead by an average 20% for all single and double fault-scenarios as compared to the existing fault-tolerant techniques. The CMTM also achieves over 30x reductions in execution time for large problem sizes with a maximum deviation of 15% from the minimum communication energy achievable with the given application on a given architecture. For streaming multimedia applications, the proposed technique delivers 50% higher throughput per unit energy as compared to the existing approaches. A two-step methodology is proposed for mapping applications on multiprocessor systems.In first step communication energy aware design space exploration is performed.In second step, mapping is performed to minimize communication and migration overhead.A significant energy savings is reported using the proposed methodology.

[1]  Bharadwaj Veeravalli,et al.  Energy-Aware Communication and Remapping of Tasks for Reliable Multimedia Multiprocessor Systems , 2012, 2012 IEEE 18th International Conference on Parallel and Distributed Systems.

[2]  Teofilo F. Gonzalez,et al.  P-Complete Approximation Problems , 1976, J. ACM.

[3]  Wayne Wolf,et al.  Hardware-software co-design of embedded systems , 1994, Proc. IEEE.

[4]  Luca Benini,et al.  Packetized on-chip interconnect communication analysis for MPSoC , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.

[5]  Alan D. George,et al.  RapidIO for radar processing in advanced space systems , 2007, TECS.

[6]  E.A. Lee,et al.  Synchronous data flow , 1987, Proceedings of the IEEE.

[7]  Salim Hariri,et al.  Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing , 2002, IEEE Trans. Parallel Distributed Syst..

[8]  Hokeun Kim,et al.  A task remapping technique for reliable multi-core embedded systems , 2010, 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[9]  A. Singh,et al.  Fault-tolerant systems , 1990, Computer.

[10]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[11]  Cristian Constantinescu,et al.  Trends and Challenges in VLSI Circuit Reliability , 2003, IEEE Micro.

[12]  Bharadwaj Veeravalli,et al.  Communication and migration energy aware design space exploration for multicore systems with intermittent faults , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[13]  Radu Marculescu,et al.  Energy-aware communication and task scheduling for network-on-chip architectures under real-time constraints , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.

[14]  Alan Burns,et al.  A survey of hard real-time scheduling for multiprocessor systems , 2011, CSUR.

[15]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[16]  Onur Derin,et al.  Online task remapping strategies for fault-tolerant Network-on-Chip multiprocessors , 2011, Proceedings of the Fifth ACM/IEEE International Symposium.

[17]  Qiang Xu,et al.  Energy-efficient task allocation and scheduling for multi-mode MPSoCs under lifetime reliability constraint , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).

[18]  Tongquan Wei,et al.  Reliability-Driven Energy-Efficient Task Scheduling for Multiprocessor Real-Time Systems , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[19]  Niraj K. Jha,et al.  Low power system scheduling and synthesis , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).

[20]  Akash Kumar,et al.  Fault-aware task re-mapping for throughput constrained multimedia applications on NoC-based MPSoCs , 2012, 2012 23rd IEEE International Symposium on Rapid System Prototyping (RSP).

[21]  Heon Young Yeom,et al.  An efficient recovery scheme for fault-tolerant mobile computing systems , 2003, Future Gener. Comput. Syst..

[22]  Sander Stuijk,et al.  SDF^3: SDF For Free , 2006, Sixth International Conference on Application of Concurrency to System Design (ACSD'06).

[23]  Ahmed Amine Jerraya,et al.  Multiprocessor System-on-Chip (MPSoC) Technology , 2008, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[24]  Charng-Da Lu,et al.  Reliability challenges in large systems , 2006, Future Gener. Comput. Syst..

[25]  Y.-K. Kwok,et al.  Static scheduling algorithms for allocating directed task graphs to multiprocessors , 1999, CSUR.

[26]  Ahmed Amine Jerraya,et al.  Platform-based software design flow for heterogeneous MPSoC , 2008, TECS.

[27]  Yuping Zhang,et al.  Workload-balancing schedule with adaptive architecture of MPSoCs for fault tolerance , 2010, 2010 3rd International Conference on Biomedical Engineering and Informatics.

[28]  Alex Orailoglu,et al.  Predictable execution adaptivity through embedding dynamic reconfigurability into static MPSoC schedules , 2007, 2007 5th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[29]  Amit Kumar Singh,et al.  Communication-aware heuristics for run-time task mapping on NoC-based MPSoC platforms , 2010, J. Syst. Archit..

[30]  Bharadwaj Veeravalli,et al.  Design of Fast and Efficient Energy-Aware Gradient-Based Scheduling Algorithms Heterogeneous Embedded Multiprocessor Systems , 2009, IEEE Transactions on Parallel and Distributed Systems.

[31]  Bharadwaj Veeravalli,et al.  Reliability-driven task mapping for lifetime extension of networks-on-chip based multiprocessor systems , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[32]  Dakai Zhu,et al.  Reliability-Aware Energy Management for Periodic Real-Time Tasks , 2007, 13th IEEE Real Time and Embedded Technology and Applications Symposium (RTAS'07).

[33]  Jean-Marc Geib,et al.  A fault-tolerant parallel heuristic for assignment problems , 1998, Future Gener. Comput. Syst..