The dangers of heterogeneous network computing: heterogeneous networks considered harmful

This report addresses the issue of writing reliable numerical software for networks of heterogeneous computers. Much software has been written for distributed memory parallel computers, and in principle such software could readily be ported to networks of machines, such as a collection of workstations connected by Ethernet. If such a network is not homogeneous, however, there are special challenges that need to be addressed. The symptoms can range from erroneous results returned without warning to deadlock. Some of the problems are straightforward to solve, but for others the solutions are not so obvious. Indeed, in some cases, such as the method of bisection which we discuss in this report, we have not yet decided upon a satisfactory solution that does not incur an unacceptable overhead. Making software robust on heterogeneous systems often requires additional communication. In this report we describe and illustrate the problems and, where possible, suggest solutions, so that others may be aware of the potential pitfalls and either avoid them or, if that is not possible, ensure that their software is not used on heterogeneous networks.