On the problem of optimizing data transfers for complex memory systems

Parallel supercomputers architectures with complex memory hierarchies or distributed memory systems have become very common. Unfortunately, the problems associated with restructuring software to take advantage of these memory systems are not easily solved. This paper presents an overview of some of the mathematical issues behind several of these problems and attempts to give a brief look at some of the potential solutions.