A speculative parallel simulated annealing algorithm based on Apache Spark

Simulated annealing (SA) is an effective method for solving unconstrained optimization problems and has been widely used in machine learning and neural networks. To optimize complex problems involving big data, the SA algorithm has been implemented on big data platforms, where it obtains a certain speedup. However, the efficiency of such implementations is still limited, because the conventional SA algorithm runs with low parallelism on these platforms and the computing resources cannot be fully utilized. To address these problems, this paper proposes a speculative parallel SA algorithm based on Apache Spark to expand the algorithm's parallelism and enhance its efficiency. First, the inner dependencies that prevent the conventional algorithm from running in parallel are analyzed. Then, based on this analysis, the Software Thread-Level Speculation technique is employed to help the conventional algorithm overcome these dependencies and run concurrently. Finally, a new parallel SA algorithm with a speculation mechanism is proposed and implemented on Apache Spark. The experiments show that, for big data problems, the proposed algorithm can achieve optimal parallelism when compared with the traditional algorithm without speculation on Apache Spark. Moreover, the execution efficiency of the simulated annealing process is markedly enhanced by the proposed algorithm.
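
The dependency in question is that each SA step starts from the state left by the previous accept/reject decision. The sketch below illustrates one common way to speculate past that dependency, and it is not the paper's implementation: it assumes every earlier move in a batch is rejected, evaluates a batch of candidates from the same state in parallel, then validates them in order and discards whatever follows the first accepted move. The objective `sphere`, the helper `propose`, the parameters `width`, `alpha`, and `rounds`, and the use of a local thread pool in place of a Spark cluster are all assumptions made for illustration. Cooling once per batch rather than once per proposal is a further simplification.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sphere(x):
    """Toy objective (an assumption for illustration): sum of squares."""
    return sum(v * v for v in x)


def propose(x, rng, step=0.5):
    """Return a copy of x with one randomly chosen coordinate perturbed."""
    i = rng.randrange(len(x))
    y = list(x)
    y[i] += rng.uniform(-step, step)
    return y


def speculative_sa(cost, x0, t0=1.0, alpha=0.95, rounds=200, width=4, seed=0):
    """Simulated annealing with a speculative batch of `width` moves per round."""
    rng = random.Random(seed)
    x, fx = list(x0), cost(x0)
    best, fbest = list(x), fx
    t = t0
    with ThreadPoolExecutor(max_workers=width) as pool:
        for _ in range(rounds):
            # Speculate: assume every earlier move in this batch is rejected,
            # so all candidates can be generated from the same state x.
            candidates = [propose(x, rng) for _ in range(width)]
            # Evaluate the candidate costs concurrently (stand-in for a cluster).
            costs = list(pool.map(cost, candidates))
            # Validate in program order with the Metropolis rule. The first
            # accepted move commits; later speculative results are discarded.
            for y, fy in zip(candidates, costs):
                delta = fy - fx
                if delta <= 0 or rng.random() < math.exp(-delta / t):
                    x, fx = y, fy
                    if fx < fbest:
                        best, fbest = list(x), fx
                    break
            t *= alpha  # cool once per batch (a simplification)
    return best, fbest
```

Calling `speculative_sa(sphere, [3.0, -2.0])` returns the best state found and its cost. When most moves are rejected, as is typical at low temperatures, few speculative evaluations are wasted, which is the regime where this style of speculation tends to pay off.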
