Towards optimization techniques for dynamic load balancing of parallel gate level simulation

As a consequence of Moore's law, integrated circuits have grown dramatically in size, and simulation has become the major bottleneck in the circuit design process. Parallel simulation has consequently emerged as an approach that can be both fast and cost effective. In this thesis, we examine the performance of a parallel Verilog simulator, VXTW, on four large, real designs using an optimistic synchronization scheme named Time Warp. Because previous work relied on either relatively small benchmarks or synthetic circuits, these designs provide a far more realistic evaluation. Because gate level simulation has low computational granularity, and because the computational and communication loads vary throughout the course of the simulation, the performance of Time Warp can be severely degraded or even become unstable. This thesis describes dynamic load balancing algorithms that balance the computational and communication loads during the simulation. Like all load balancing algorithms, the proposed algorithms have tuning parameters that must be optimized. In addition, we use a time window to prevent the simulation from becoming too optimistic. We use learning techniques from artificial intelligence (N-armed Bandit, Multi-state Q-learning) and heuristic searches (Genetic Algorithm, Simulated Annealing) to tune the parameters of the dynamic load balancing algorithms and to determine the size of the time window. We evaluated the performance of these algorithms on open source Sparc and Leon processor designs and on two Viterbi decoder designs, and observed up to a 70% improvement in simulation time using these approaches.
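
To make the N-armed Bandit idea concrete, the following is a minimal sketch of an epsilon-greedy bandit that picks a time window size from a fixed set of candidates. It is not the VXTW implementation: the candidate sizes, the `epsilon` value, and the `toy_reward` function (a stand-in for a measured quantity such as negative simulation time per interval) are all assumptions made for illustration.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy N-armed bandit over a fixed list of candidate values."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = [0] * len(self.arms)    # pulls per arm
        self.values = [0.0] * len(self.arms)  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))
        return max(range(len(self.arms)), key=lambda i: self.values[i])

    def update(self, index, reward):
        # Incremental update of the sample mean for the chosen arm.
        self.counts[index] += 1
        self.values[index] += (reward - self.values[index]) / self.counts[index]

    def best_arm(self):
        best = max(range(len(self.arms)), key=lambda i: self.values[i])
        return self.arms[best]


def toy_reward(window):
    # Hypothetical reward: in a real simulator this would be measured,
    # for example as the negative wall-clock time of the last interval.
    # Here the (assumed) best window size is 400.
    return -abs(window - 400) / 1000.0


def tune_window(steps=200):
    # Candidate window sizes are illustrative assumptions.
    bandit = EpsilonGreedyBandit([100, 200, 400, 800, 1600])
    for _ in range(steps):
        i = bandit.select()
        bandit.update(i, toy_reward(bandit.arms[i]))
    return bandit.best_arm()


if __name__ == "__main__":
    print("selected time window:", tune_window())
```

In an actual simulator the reward would be noisy and would drift as the load changes, which is one reason a state-dependent method such as Q-learning may be preferred over a stateless bandit.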
