Handling errors in parallel programs based on happens-before relations

Intervals are a new model for parallel programming based on an explicit happens-before relation. Intervals permit fine-grained but high-level control of the program scheduler, and they dynamically detect and prevent deadlocking schedules. In this paper, we discuss the design decisions that led to the intervals model, focusing on error detection and handling. Our error propagation scheme uses the happens-before relation to detect and abort dependent tasks that fall between the point where a failure occurs and the point where that failure is handled.
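
To make the propagation idea concrete, the following is a minimal sketch and not the intervals API. It assumes a simplified setting in which tasks run sequentially in an order consistent with the happens-before edges; the function `run`, the `handlers` parameter, and all task names are hypothetical and chosen only for illustration. A task is aborted when any task that must happen before it has failed or been aborted, unless it is marked as a handler, in which case it runs and propagation stops there.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run(tasks, before, handlers=()):
    """Run tasks in an order consistent with the happens-before edges.

    tasks:    dict mapping a task name to a zero-argument callable
    before:   list of (a, b) pairs meaning "a happens before b"
    handlers: names of tasks that handle failures of their predecessors
              (hypothetical parameter, for illustration only)
    """
    # Map each task to the set of tasks that must happen before it.
    preds = {name: set() for name in tasks}
    for a, b in before:
        preds[b].add(a)

    status = {}
    # static_order() yields every task after all of its predecessors.
    for name in TopologicalSorter(preds).static_order():
        failed_preds = [p for p in preds[name] if status[p] != "ok"]
        if failed_preds and name not in handlers:
            # A predecessor failed or was aborted: abort without running.
            status[name] = "aborted"
            continue
        try:
            tasks[name]()
            status[name] = "ok"
        except Exception:
            status[name] = "failed"
    return status


def load():
    raise ValueError("input missing")


def noop():
    pass


result = run(
    tasks={"load": load, "parse": noop, "report": noop,
           "cleanup": noop, "independent": noop},
    before=[("load", "parse"), ("parse", "report"), ("report", "cleanup")],
    handlers={"report"},
)
```

In this sketch `parse` is aborted because it depends on the failed `load`, `report` acts as the handler and runs anyway, and `cleanup` and `independent` are unaffected.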