Efficient parallel solutions to the integral knapsack problem on current chip-multiprocessor systems

The emergence of chip-multiprocessor systems has dramatically increased the performance potential of computer systems. However, harnessing that potential depends largely on how effectively system software, such as compilers, exploits on-chip parallelism. Because the amount of parallelism a compiler can extract is directly influenced by the choice of algorithm, algorithmic choice also plays a critical role in achieving a high fraction of peak performance. Hence, in the era of multicore computing, it is imperative to re-evaluate and rethink algorithms for key problem domains. This paper investigates the impact of algorithmic choice on the performance of parallel implementations of the integral knapsack problem on multicore architectures. The study considers two classes of algorithms and several algorithmic variants, and evaluates each implementation on a variety of performance metrics, including data locality and sharing, granularity of parallelism, and scalability. The paper presents experimental results that show how each performance factor is affected by the selection of algorithm, changes in the input data set, and variations in architectural characteristics such as cache capacity and degree of cache sharing.
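For readers unfamiliar with the problem, the following is a minimal sequential sketch. It assumes that "integral knapsack" refers to the unbounded variant, in which each item may be taken any non-negative integer number of times. The function name `unbounded_knapsack` and its parameters are illustrative only, and this textbook dynamic program is not necessarily one of the algorithm classes the paper evaluates.

```python
def unbounded_knapsack(weights, values, capacity):
    """Return the best total value for a knapsack of the given capacity,
    where each item may be used any non-negative integer number of times.

    Illustrative textbook dynamic program; names are hypothetical.
    """
    # best[c] holds the maximum value achievable with capacity c.
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for w, v in zip(weights, values):
            if w <= c:
                # best[c] depends on the smaller subproblem best[c - w].
                best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


if __name__ == "__main__":
    print(unbounded_knapsack([3, 4, 5], [4, 5, 7], 10))
```

Each entry `best[c]` reads entries at smaller capacities, so the order in which a parallel implementation computes and shares these entries plausibly bears on the locality, sharing, and granularity metrics named above.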
