Improving the CubeSat reliability thanks to a multiprocessor system using fault tolerant online scheduling

Abstract More than 21 000 objects fly in outer space and are exposed to the harsh space environment. The size of these objects varies considerably. Our research focuses on nanosatellites, such as CubeSats, which have to respect time, spatial and energy constraints. To improve CubeSat reliability under these constraints, this paper presents and evaluates two fault tolerant online scheduling algorithms: one that schedules all tasks as aperiodic (called OneOff) and one that places arriving tasks as either aperiodic or periodic tasks (called OneOff&Cyclic). Based on several scenarios, we analyse how much the performance of ordering policies (in terms of both the rejection rate and the scheduling time) is influenced by the system load and by the proportions of simple and double tasks among all tasks to be executed. The “Earliest Deadline” and “Earliest Arrival Time” ordering policies for OneOff and the “Minimum Slack” ordering policy for OneOff&Cyclic reject the fewest tasks in all tested scenarios. The paper also analyses the scheduling time to evaluate the real-time performance of the ordering policies and shows that OneOff requires less time to find a new schedule than OneOff&Cyclic. Finally, the studied algorithms were found to perform well even in a harsh environment and to provide the same availability as systems based on triple modular redundancy, with considerably lower system power consumption.
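
To make the notions of ordering policy and rejection rate concrete, the following is a minimal sketch only, not the paper's OneOff or OneOff&Cyclic algorithm. It assumes a simplified model in which each task has an arrival time, an execution time and a deadline, tasks are sorted by the chosen policy and greedily placed on the processor that can start them earliest, and fault tolerance (task copies) is ignored. The names `Task`, `POLICIES`, `schedule` and `rejection_rate`, and the definition of slack used here, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task model: all times are in arbitrary integer units."""
    name: str
    arrival: int    # earliest time the task may start
    exec_time: int  # execution time on any processor
    deadline: int   # latest allowed completion time

    @property
    def slack(self) -> int:
        # Assumed definition: spare time between arrival and deadline.
        return self.deadline - self.arrival - self.exec_time


# Ordering policies named in the abstract, expressed as sort keys (assumption).
POLICIES = {
    "earliest_deadline": lambda t: t.deadline,
    "earliest_arrival": lambda t: t.arrival,
    "minimum_slack": lambda t: t.slack,
}


def schedule(tasks, n_processors, policy):
    """Greedy placement: returns (accepted, rejected).

    accepted is a list of (task name, processor index, start time);
    rejected is a list of names of tasks that would miss their deadline.
    """
    free_at = [0] * n_processors  # time at which each processor becomes idle
    accepted, rejected = [], []
    for task in sorted(tasks, key=POLICIES[policy]):
        # Choose the processor on which this task can start the earliest.
        p = min(range(n_processors),
                key=lambda i: max(free_at[i], task.arrival))
        start = max(free_at[p], task.arrival)
        if start + task.exec_time <= task.deadline:
            free_at[p] = start + task.exec_time
            accepted.append((task.name, p, start))
        else:
            rejected.append(task.name)
    return accepted, rejected


def rejection_rate(tasks, n_processors, policy):
    """Fraction of tasks that could not be scheduled before their deadline."""
    _, rejected = schedule(tasks, n_processors, policy)
    return len(rejected) / len(tasks)


if __name__ == "__main__":
    tasks = [
        Task("A", arrival=0, exec_time=4, deadline=4),
        Task("B", arrival=0, exec_time=2, deadline=10),
        Task("C", arrival=0, exec_time=3, deadline=3),
    ]
    for policy in POLICIES:
        print(policy, rejection_rate(tasks, n_processors=1, policy=policy))
```

In this toy model the policy only changes the order in which queued tasks are tried; a fault tolerant scheduler would additionally have to place backup copies of each task, which is left out here.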
