FAMe-TM: Formal analysis methodology for task migration algorithms in Many-Core systems

Abstract: Distributed Dynamic Thermal Management (dDTM) through task migration across cores is a promising solution to the heating issues in Many-Core architectures. However, the growing number of cores, the distributed nature of dDTM, and the inherent sampling-based nature of traditional analysis techniques, such as simulation and emulation, make a complete and rigorous analysis of these task migration algorithms almost impossible. These limitations compromise the integrity of the analysis and, in the worst case, may lead to the deployment of an inefficient and inaccurate dDTM scheme on chip, which in turn can cause permanent defects in the chip due to excessive heating. Leveraging the exhaustive nature of model-checking-based verification, we propose to use a model checker to formally verify task migration algorithms. This work proposes an analysis methodology, Formal Analysis Methodology for Task Migrations (FAMe-TM), and identifies a generic set of properties for the formal verification of task-migration-based dDTM schemes. In particular, we propose an analysis flow that uses the scalable bounded model checker nuXmv to formally verify the suggested task migration properties, such as task migrations, stalls, completion, creation of hot spots, time spent in migration, and time to achieve stability. For illustration, we apply FAMe-TM to two recently proposed task-migration-based dDTM schemes: Thermal Coupling Aware (TCA-TM) dDTM and Hot Spot Reduction (HR-TM) dDTM.
