Targeting multi-core architectures for linear algebra applications
We are on the verge of a paradigm shift in software for the new multicore architectures, and there is no free lunch for conventional software. Power consumption and heat dissipation issues are pushing the microprocessor industry towards multicore design patterns. With the number of cores on multicore chips expected to reach tens to perhaps hundreds in a few years, efficient implementations of numerical libraries using shared memory programming models are of high interest. The current message passing paradigm used in ScaLAPACK and elsewhere introduces unnecessary memory overhead and memory copy operations, which degrade performance, and makes it harder to schedule operations that could be done in parallel. Limiting the use of shared memory to fork-join parallelism (perhaps with OpenMP), or confining the parallelism to the BLAS, does not address all of these issues.