Operator-Level GPU-Accelerated Branch and Bound Algorithms

Abstract: Branch-and-Bound (B&B) algorithms are well-known tree-based exploratory methods for solving NP-hard discrete optimization problems to optimality. The B&B tree is constructed and explored using four operators: branching, bounding, selection and pruning. Such algorithms are irregular, which makes their parallel design and implementation on GPU accelerators challenging. Among the few existing related works, we have recently revisited the bounding operator on GPU. The reported results show that speedups of up to ×100 can be obtained on recent GPU cards. In this paper, we address the GPU-based design and implementation of B&B algorithms, considering the branching and pruning operators as well as the bounding one. The proposed template transforms the unpredictable and irregular workload associated with the explored B&B tree into regular data-parallel kernels optimized for the SIMD-based execution model of GPUs. Thread divergence and uncoalesced memory accesses are taken into account in the optimization process. The proposed approach has been evaluated on the Flow-Shop scheduling problem and compared with another GPU-based strategy and with a cluster-of-workstations (COW) approach. The reported results demonstrate that the proposed approach is more efficient than both alternatives. Speedups of up to ×160 are obtained for large problem instances using an Nvidia Tesla C2050 hardware configuration.
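The four operators and the idea of turning tree exploration into batched, data-parallel work can be illustrated with a small CPU-only sketch in Python. This is not the implementation from the paper: the function names (`completion_times`, `lower_bound`, `branch_and_bound`), the `batch_size` parameter, and the simple machine-load lower bound are all assumptions chosen for illustration. The point is only that each step operates on a whole batch of nodes with one independent computation per node, which is the shape of workload that could map to one GPU thread per node.

```python
def completion_times(seq, p):
    """Return, for each machine, the time it becomes free after processing seq.

    p[j][k] is the processing time of job j on machine k (permutation flow shop).
    """
    m = len(p[0])
    done = [0] * m
    for j in seq:
        done[0] += p[j][0]
        for k in range(1, m):
            # Job j starts on machine k once both the machine and the job are free.
            done[k] = max(done[k], done[k - 1]) + p[j][k]
    return done


def lower_bound(prefix, remaining, p):
    """Illustrative bound (an assumption, not the paper's bound).

    Every remaining job must still run on machine k after the prefix has
    finished there, so done[k] + remaining work on k is a valid lower bound.
    For a complete schedule this equals the makespan.
    """
    done = completion_times(prefix, p)
    return max(done[k] + sum(p[j][k] for j in remaining) for k in range(len(done)))


def branch_and_bound(p, batch_size=4):
    """Batched B&B for the permutation flow shop; returns (makespan, sequence)."""
    n = len(p)
    best_cost, best_seq = float("inf"), None
    pool = [((), frozenset(range(n)))]  # each node: (scheduled prefix, unscheduled jobs)
    while pool:
        # Selection: take a batch of pending nodes from the end of the pool.
        batch, pool = pool[-batch_size:], pool[:-batch_size]
        # Branching: expand every node in the batch into its children.
        children = [(prefix + (j,), rem - {j})
                    for prefix, rem in batch for j in sorted(rem)]
        # Bounding: one independent evaluation per child (the data-parallel step).
        bounds = [lower_bound(prefix, rem, p) for prefix, rem in children]
        # Pruning: discard children that cannot improve on the best known solution.
        for (prefix, rem), lb in zip(children, bounds):
            if lb >= best_cost:
                continue
            if not rem:
                best_cost, best_seq = lb, prefix  # complete schedule: bound is its makespan
            else:
                pool.append((prefix, rem))
    return best_cost, best_seq
```

Calling `branch_and_bound(p)` with a list of per-job processing-time rows returns the optimal makespan and one job order that achieves it. In this sketch the branching and bounding steps are list comprehensions over the batch; in a GPU setting of the kind the abstract describes, those per-node computations would be the candidates for kernels, with the batch size controlling how much regular work is exposed at once.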
