An Asynchronous Checkpoint-Based Redundant Multithreading Architecture
暂无分享,去创建一个
Existing redundant multithreading (RMT) detects faults by comparing the result of each instruction between the master and slave threads, which can lead to huge comparison and communication overhead. To address this problem, the checkpoint-based RMT (like RVQ_F) was proposed, but in such architectures, master threads must wait for slave threads to arrive at the same position at each checkpoint, this may delay the release of resources occupied by master threads and decrease performance. This paper proposes an asynchronous checkpoint-based redundant multithreading architecture (AC-RMT), in which two context saving rooms are set aside for each thread, one for detecting faults, and the other for saving the last checkpoint used for fault restoration. Compared with RVQ_F, AC-RMT efficiently boosts performance because, by avoiding the waiting of master threads at checkpoints, resources can be released timely.
[1] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[2] Brad Calder,et al. Automatically characterizing large scale program behavior , 2002, ASPLOS X.
[3] Kanad Ghose,et al. Trade-Offs in Transient Fault Recovery Schemes for Redundant Multithreaded Processors , 2006, HiPC.
[4] Pradeep Dubey,et al. Platform 2015: Intel ® Processor and Platform Evolution for the Next Decade , 2005 .