An efficient GPU-based parallel tabu search algorithm for hardware/software co-design

Hardware/software (HW/SW) partitioning is an essential step in hardware/software co-design. For large problem instances, it is difficult to achieve both high solution quality and short running time. This paper presents an efficient GPU-based parallel tabu search algorithm (GPTS) for HW/SW partitioning. A single GPU kernel for neighborhood compaction is proposed, which in theory reduces the number of GPU global memory accesses. A kernel fusion strategy is then proposed to further reduce the number of global memory accesses in GPTS. To minimize the transfer overhead between CPU and GPU, an optimized transfer strategy for GPU-based tabu evaluation is proposed, which accounts for candidates that do not satisfy the given constraint. Experiments show that GPTS outperforms state-of-the-art tabu search methods and is competitive with other methods for HW/SW partitioning. The proposed parallelization yields a significant benefit on an ordinary GPU platform.
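To make the search loop concrete, the following is a minimal sequential sketch in Python of tabu search for HW/SW partitioning with a compaction step. It is not the GPTS implementation and involves no GPU code. It assumes a simple model that the abstract does not specify: each task has a software time, a hardware time, and a hardware area, and the goal is to minimize total time under an area limit. The function name `tabu_search`, the parameters `iters` and `tenure`, and the example data are all hypothetical.

```python
def tabu_search(sw_time, hw_time, hw_area, area_limit, iters=50, tenure=3):
    """Sequential sketch: 1-flip tabu search under an assumed area constraint."""
    n = len(sw_time)
    x = [0] * n                    # 0 = software, 1 = hardware; all-software is feasible
    cost, area = sum(sw_time), 0
    best_x, best_cost = x[:], cost
    tabu_until = [0] * n           # flipping task i is tabu while tabu_until[i] >= iteration

    for it in range(1, iters + 1):
        # Evaluate every 1-flip neighbor. On a GPU this is the data-parallel part,
        # with one thread per candidate move.
        moves = []
        for i in range(n):
            if x[i] == 0:          # move task i from software to hardware
                moves.append((cost - sw_time[i] + hw_time[i], area + hw_area[i], i))
            else:                  # move task i from hardware back to software
                moves.append((cost - hw_time[i] + sw_time[i], area - hw_area[i], i))

        # Compaction: pack the admissible moves into a dense list. A move is kept
        # if it is feasible and either non-tabu or better than the best known cost
        # (a common aspiration rule, assumed here).
        candidates = [m for m in moves
                      if m[1] <= area_limit
                      and (tabu_until[m[2]] < it or m[0] < best_cost)]
        if not candidates:
            continue               # no admissible move; wait for tabu entries to expire

        new_cost, new_area, i = min(candidates)   # best admissible neighbor
        x[i] ^= 1
        cost, area = new_cost, new_area
        tabu_until[i] = it + tenure
        if cost < best_cost:
            best_x, best_cost = x[:], cost
    return best_x, best_cost


# Hypothetical 4-task instance, for illustration only.
sw = [10, 8, 6, 4]
hw = [2, 3, 1, 3]
ar = [5, 4, 3, 1]
best_x, best_cost = tabu_search(sw, hw, ar, area_limit=8)
print(best_x, best_cost)
```

In a GPU setting, the list comprehension would correspond to a stream compaction over the evaluated neighborhood, so that the selection step reads only admissible candidates. The empty-candidate branch corresponds to the situation in which no move passes the constraint check.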
