Loop Selection for Multilevel Nested Loops Using a Genetic Algorithm

Loop selection for multilevel nested loops is a difficult problem that neither hardware-based loop selection techniques nor traditional software-based static compilation techniques solve effectively. This study proposes a genetic algorithm (GA)-based method to address it. First, the formal specification and mathematical model of the loop selection problem are presented. Next, the overall framework of the GA is designed on the basis of this model. Finally, the components of the GA are described: the chromosome representation and fitness function calculation, the initial population generation algorithm and chromosome improvement methods, the implementation of the genetic operators (crossover, mutation, and selection), the offspring population generation method, and the stopping criterion. Experiments with the SPEC2006 and NPB3.3.1 benchmark suites were performed on the Sunway TaihuLight supercomputer. The results indicate that the proposed method achieves a greater speedup than current mainstream methods, which confirms its effectiveness. Solving the loop selection problem for multilevel nested loops is of practical significance for exploiting the parallelism of general scientific computing programs and for making full use of multicore processors.
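
As a rough illustration of the GA components listed above, the following minimal Python sketch encodes a loop selection as a bit-string chromosome and evolves it with tournament selection, one-point crossover, bit-flip mutation, and a repair step. It is not the method of this paper. The loop nest (`PARENT`), the per-loop benefit estimates (`BENEFIT`), the additive fitness, the constraint that no selected loop may be nested inside another selected loop, and the fixed-generation stopping criterion are all assumptions made for the example.

```python
import random

# Hypothetical loop nest: PARENT[i] is the index of the loop enclosing loop i
# (None for an outermost loop). BENEFIT[i] is an assumed estimate of the gain
# from selecting loop i for parallel execution.
PARENT = [None, 0, 1, 0, None, 4]
BENEFIT = [5.0, 3.0, 4.0, 2.5, 1.0, 6.0]


def ancestors(i):
    """Yield the indices of all loops that enclose loop i."""
    while PARENT[i] is not None:
        i = PARENT[i]
        yield i


def is_valid(chrom):
    """Assumed constraint: no selected loop lies inside another selected loop."""
    return not any(
        chrom[i] and any(chrom[a] for a in ancestors(i))
        for i in range(len(chrom))
    )


def repair(chrom):
    """Deselect every loop that has a selected enclosing loop."""
    return [
        0 if g and any(chrom[a] for a in ancestors(i)) else g
        for i, g in enumerate(chrom)
    ]


def fitness(chrom):
    """Assumed additive fitness: total benefit of the selected loops."""
    return sum(b for g, b in zip(chrom, BENEFIT) if g)


def run_ga(pop_size=20, generations=40, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n = len(PARENT)
    # Initial population: random bit strings, repaired to satisfy the constraint.
    pop = [repair([rng.randint(0, 1) for _ in range(n)]) for _ in range(pop_size)]
    for _ in range(generations):  # stopping criterion: fixed generation count
        children = [max(pop, key=fitness)]  # elitism: keep the current best
        while len(children) < pop_size:
            # Selection: binary tournament for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: one cut point.
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(repair(child))
        pop = children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = run_ga()
    print(best, fitness(best))
```

The repair step plays the role of a simple chromosome improvement: it keeps every individual feasible, so the fitness function never has to penalize invalid selections.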
