Leveraging Time Prediction and Error Compensation to Enhance the Scalability of Parallel Multi-Core Simulations

Synchronization overhead makes it challenging to scale the parallel simulation of multi-core processors. Lax synchronization schemes reduce this overhead and balance the load between synchronization points, but they introduce timing errors and degrade simulation accuracy. By observing how errors propagate, we find that the propagation paths always concentrate on certain pivotal events. Based on this observation, we design a delay-calibration mechanism to mitigate the errors. We decouple the timing and functional processes of the pivotal events and use delay prediction to connect the two. Errors are traced throughout the timing processes of the pivotal events and deducted from the predicted delays before the functional processes consume those delays. By clearing errors at successive pivotal events, the mechanism efficiently reduces deviations in simulated time. Because prediction and error deduction impose no constraints on synchronization, our approach largely preserves the scalability of lax synchronization schemes. Furthermore, our proposal is orthogonal to other parallel simulation techniques and can be used in conjunction with them. Experimental results show that error compensation improves the accuracy of lax synchronized simulations by 68 percent and achieves 97.8 percent accuracy when combined with an enhanced lax synchronization scheme.
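
The calibration step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `DelayCalibrator`, the exponentially smoothed predictor, the `alpha` parameter, and the clamping of delays at zero are all assumptions made for illustration. The sketch only shows the idea that the timing process updates a predicted delay and accumulates the traced error, while the functional process consumes the prediction with that error deducted.

```python
from dataclasses import dataclass


@dataclass
class DelayCalibrator:
    """Hypothetical calibrator for one kind of pivotal event (illustrative only)."""

    alpha: float = 0.5          # smoothing weight of the predictor (assumed)
    predicted: float = 0.0      # current delay prediction, in simulated cycles
    pending_error: float = 0.0  # error traced by the timing process, not yet deducted

    def record_timing(self, measured_delay: float, timing_error: float) -> None:
        """Timing process: refresh the prediction and accumulate the traced error."""
        self.predicted = (
            self.alpha * measured_delay + (1.0 - self.alpha) * self.predicted
        )
        self.pending_error += timing_error

    def consume(self) -> float:
        """Functional process: take the predicted delay with the error deducted."""
        # A delay cannot be negative, so any error that does not fit into
        # this delay stays pending for the next pivotal event.
        calibrated = max(0.0, self.predicted - self.pending_error)
        deducted = self.predicted - calibrated
        self.pending_error -= deducted
        return calibrated


if __name__ == "__main__":
    cal = DelayCalibrator(alpha=1.0)
    cal.record_timing(measured_delay=100.0, timing_error=8.0)
    print(cal.consume())  # predicted delay with the traced error removed
```

Neither method waits on another simulation thread, which mirrors the claim that prediction and error deduction place no constraints on synchronization.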
