Linear optimization - A case study in performance analysis

This paper examines the performance of two parallel variants of the simplex algorithm on a message-passing system. The simplex algorithm is first reviewed, two possible parallelizations are discussed, and benchmark speedups for both are presented. Of the two schemes, row partitioning is found to be generally superior, while column partitioning is more efficient when the number of rows is small and the number of columns is much greater than the number of rows. Various performance analysis tools are then applied to examine the reasons for the relative performance differences; communication idle time caused by global minimization and load imbalance is identified as the main factor in execution slowdown.
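
To illustrate the kind of global minimization referred to above, the following is a minimal single-process sketch and not the implementation evaluated in the paper. It assumes that the minimization in question is a pivot-selection step in which each process scans only its own partition of a vector and the local results are then combined. The names `local_min` and `global_min`, the block-wise partitioning, and the sample data are all hypothetical; on a real message-passing system the final combination would be a collective reduction during which faster processes wait for slower ones.

```python
from typing import List, Tuple


def local_min(block: List[float], offset: int) -> Tuple[float, int]:
    """Return (value, global index) of the smallest entry in one partition."""
    best = min(range(len(block)), key=lambda j: block[j])
    return block[best], offset + best


def global_min(values: List[float], num_procs: int) -> Tuple[float, int]:
    """Simulate a partitioned minimum search over `values`.

    Each simulated process owns one contiguous block (an assumed layout).
    The last step stands in for the collective reduction in which every
    process must take part before the iteration can continue.
    """
    n = len(values)
    chunk = (n + num_procs - 1) // num_procs  # ceiling division
    candidates = []
    for p in range(num_procs):
        lo = p * chunk
        block = values[lo:lo + chunk]
        if block:  # a process may own no entries if n < num_procs * chunk
            candidates.append(local_min(block, lo))
    # Tuples compare by value first, then by index, so ties go to the
    # smallest global index.
    return min(candidates)


if __name__ == "__main__":
    reduced_costs = [3.0, -1.0, 4.0, -5.0, 2.0, -5.0]  # made-up data
    print(global_min(reduced_costs, num_procs=3))
```

Because the reduction cannot complete until the slowest process has delivered its local candidate, any imbalance in partition sizes shows up directly as idle time in the other processes.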