Efficient exploitation of the Xeon Phi architecture for the Ant Colony Optimization (ACO) metaheuristic

In recent years, the use of compute-intensive coprocessors, in particular the Graphics Processing Unit (GPU), has been widely studied in the field of Parallel Computing as a way to accelerate sequential processes. Intel has recently released a GPU-type coprocessor, the Intel Xeon Phi. It is composed of up to 72 cores connected by a bidirectional ring network, each with a Vector Processing Unit (VPU) operating on large vector registers. In this work, we present novel parallel algorithms for the well-known Ant Colony Optimization (ACO) metaheuristic on the Intel Xeon Phi many-core coprocessor. ACO is a popular metaheuristic applied to a wide range of NP-hard problems. To show the efficiency of our approaches, we test our algorithms on the Traveling Salesman Problem. Our results confirm the potential of the proposed algorithms, which lead to distinct performance improvements over previous state-of-the-art GPU approaches. We implement and compare a set of algorithms for the different steps of ACO. The matrix calculations in the proposed algorithms efficiently exploit the VPU and the cache of the Xeon Phi. We also present a novel implementation of the roulette wheel selection algorithm, named UV-Roulette (unique random value roulette). We compare our Xeon Phi results to state-of-the-art GPU methods and achieve higher performance on large problem instances. We also expose the difficulties and the key hardware performance factors involved in running the ACO algorithm on a Xeon Phi coprocessor.
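For readers unfamiliar with the selection step mentioned above, the following is a minimal sketch of the textbook roulette wheel selection as it is commonly used in ACO tour construction. It is not the UV-Roulette variant, whose details are not given in this abstract. The function names, the parameter values `alpha=1.0` and `beta=2.0`, and the sample data are assumptions chosen for illustration only.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def choice_weights(pheromone_row, distance_row, visited, alpha=1.0, beta=2.0):
    """Standard Ant System weight tau^alpha * (1/d)^beta per candidate city.

    Visited cities (and zero-distance entries such as the current city)
    get weight 0 so they can never be selected. alpha and beta are
    illustrative defaults, not values taken from the paper.
    """
    weights = []
    for j, (tau, d) in enumerate(zip(pheromone_row, distance_row)):
        if visited[j] or d <= 0.0:
            weights.append(0.0)
        else:
            weights.append((tau ** alpha) * ((1.0 / d) ** beta))
    return weights


def roulette_select(weights, rng=random):
    """Pick an index with probability proportional to its weight."""
    cumulative = list(accumulate(weights))  # running prefix sums
    total = cumulative[-1]
    if total <= 0.0:
        raise ValueError("no selectable city")
    r = rng.random() * total  # one random point on the wheel
    # First index whose prefix sum exceeds r; zero-weight slots are skipped
    # because their prefix sum equals that of the previous slot.
    idx = bisect_right(cumulative, r)
    if idx >= len(weights):  # guard against floating-point rounding at the top end
        idx = max(i for i, w in enumerate(weights) if w > 0.0)
    return idx


if __name__ == "__main__":
    # Hypothetical 4-city example: the ant is at city 0 and has visited city 2.
    pheromone = [1.0, 1.0, 1.0, 1.0]
    distance = [0.0, 2.0, 1.0, 4.0]
    visited = [True, False, True, False]
    w = choice_weights(pheromone, distance, visited)
    print(roulette_select(w, random.Random(0)))
```

In a parallel implementation, the weight computation is the part that maps naturally onto vector units, while the prefix sum and the search are the steps that typically need restructuring; this sketch keeps everything sequential for clarity.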
