Analyzing the Efficiency of Block-Cyclic Checkerboard Partitioning in Neville Elimination

In this paper we analyze the performance of Neville elimination when a block-cyclic checkerboard partitioning is used. This partitioning can exploit more concurrency than striped partitioning because the matrix computation can be distributed among more processors than in the striped case. Concretely, it divides the matrix into blocks and maps them cyclically onto the processors. The performance of this parallel system is measured in terms of efficiency, which is close to one when the optimum block size is used and the algorithm is run on a parallel PC cluster.
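
To make the mapping concrete, the following is a minimal sketch of how a block-cyclic checkerboard partitioning could assign matrix entries to processors. It assumes square blocks of side `block_size` and a logical processor grid of `proc_rows` by `proc_cols`; these names, the helper functions, and the use of Python are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def block_cyclic_owner(i, j, block_size, proc_rows, proc_cols):
    """Return the (row, col) grid coordinates of the processor owning entry (i, j).

    Assumption: square blocks of side block_size, dealt out cyclically
    in both dimensions over a proc_rows x proc_cols processor grid.
    """
    block_row = i // block_size  # index of the block row containing row i
    block_col = j // block_size  # index of the block column containing column j
    return (block_row % proc_rows, block_col % proc_cols)


def entries_per_processor(n, block_size, proc_rows, proc_cols):
    """Count how many entries of an n x n matrix each processor owns."""
    counts = Counter()
    for i in range(n):
        for j in range(n):
            counts[block_cyclic_owner(i, j, block_size, proc_rows, proc_cols)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical example: 8 x 8 matrix, 2 x 2 blocks, 2 x 2 processor grid.
    for owner, count in sorted(entries_per_processor(8, 2, 2, 2).items()):
        print(owner, count)
```

Because blocks are dealt out cyclically in both dimensions, each processor holds blocks spread across the whole matrix rather than one contiguous region. In an elimination scheme that progressively zeroes parts of the matrix, this tends to keep the remaining work balanced among processors, while the block size trades load balance against communication overhead.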