Performance analysis of parallel constraint-based local search

We present a parallel implementation of a constraint-based local search algorithm and investigate its performance on hard combinatorial optimization problems on two different platforms with up to several hundred cores. On a variety of classical CSP benchmarks, speedups are very good for a few tens of cores, and good up to a hundred cores. More challenging problems derived from real-life applications (Costas array) show even better speedups, nearly optimal up to 256 cores.