Workload-balancing schedule with adaptive architecture of MPSoCs for fault tolerance

With the scaling of semiconductor technology, the reliability of embedded multiprocessor systems has become a major concern for the industry. Meanwhile, the cost of communication between processors on a chip is attracting growing attention in both research and product development. However, most list scheduling heuristics assume that the processors in the system never fail. To schedule precedence graphs in a more realistic framework, this paper proposes a bus-based adaptive architecture and introduces a workload-balancing scheduling algorithm for fault tolerance. The proposed techniques balance the load among processors, tolerate the failure of one processor, and eliminate the communication cost of task migration when a processor fails. The method is evaluated by incorporating it into a well-known scheduling heuristic, and the experimental results demonstrate the usefulness of the proposed algorithm.
