Modeling Communication in Cluster Computing

We introduce a model for communication costs in parallel processing environments, called the "hyperbolic model," that generalizes two-parameter dedicated-link models in an analytically simple way. The communication system is modeled as a directed communication graph in which terminal nodes represent the application processes and internal nodes, called communication blocks (CBs), reflect the layered structure of the underlying communication architecture. Each CB is characterized by a two-parameter hyperbolic function of the message size that gives the service time needed to process a message. Rules are given for reducing a communication graph consisting of many CBs to an equivalent two-parameter form while maintaining a good approximation of the service time. In [4] we demonstrate that communication-cost estimates based on our model closely fit actual measurements of the communication and synchronization time between end processes. We also show the compatibility of our model (to within a factor of 3/4) with the recently proposed LogP model.
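As a point of reference, the two-parameter dedicated-link models mentioned above are conventionally written as a fixed startup latency plus a per-byte transfer cost. The sketch below illustrates only that conventional linear form; it is not the hyperbolic function itself, whose exact expression is not given here. The function name, parameter names, and numeric values are assumptions chosen for illustration.

```python
def linear_service_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Conventional two-parameter link model: startup latency plus transfer time.

    This is the standard latency/bandwidth form that dedicated-link models
    usually take; it is an illustrative assumption, not the hyperbolic model.
    """
    if message_bytes < 0:
        raise ValueError("message size must be non-negative")
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Hypothetical link: 100 microseconds of startup latency, 1 MB/s bandwidth.
small = linear_service_time(0, 1e-4, 1e6)          # latency-dominated
large = linear_service_time(1_000_000, 1e-4, 1e6)  # bandwidth-dominated
```

For small messages the latency term dominates, and for large messages the transfer term does; a two-parameter description of each CB has to capture both regimes.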