Skywriting on CIEL: Programming the Data Center

How do you write a program that runs across hundreds or thousands of computers? As data sources have proliferated, this question has spread beyond the domain of a few large search engines to become a concern for a variety of online services, large corporations, and academic researchers. This article introduces Skywriting and CIEL, which are, respectively, a programming language and a system designed to run a large class of algorithms on a commodity cluster.

Large-scale data processing requires parallelism, and useful parallelism requires coordination between processes. In a shared-memory system, coordination might involve updating a shared variable or signaling a condition variable; in a distributed system there is no shared memory, so processes communicate by sending messages. However, explicit message passing (using network sockets or a library such as MPI) is ill-suited to large commodity clusters, because it requires the programmer to specify the recipient host for every message. In these clusters, machines often go offline due to failure or planned maintenance, or they may be reassigned to another user. In general, membership of a commodity cluster is far more dynamic than that of the supercomputers for which explicit message-passing libraries were first developed, and maintaining cluster membership information manually is a challenging distributed consensus problem.

The challenges of programming in a distributed system have led to the rise of distributed execution engines. These systems also send messages internally, but they virtualize the cluster resources beneath a high-level programming model. In 2004, Google announced MapReduce, which requires the implementation of just two functions, map() and reduce(), and frees the developer from having to implement parallel algorithms, distributed synchronization, task scheduling, or fault tolerance.
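
To make the two-function model concrete, here is a minimal single-process sketch of a word count written in the map()/reduce() style. It is an illustration under stated assumptions, not the API of MapReduce or Hadoop: the names map_fn, reduce_fn, and run_mapreduce are hypothetical, and the driver runs sequentially on one machine, whereas a real engine would distribute the map and reduce calls across a cluster and handle scheduling and failures.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# All names here are hypothetical. This is a sequential, in-memory
# illustration of the programming model only.


def map_fn(_key: str, line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Combine all the counts that were emitted for a single word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Toy driver: run the map phase, group by key, then run the reduce phase."""
    # Map phase: apply the mapper to every record and group its output by key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: apply the reducer once per distinct key.
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


lines = [
    ("doc1", "the quick brown fox"),
    ("doc2", "the lazy dog and the fox"),
]
word_counts = run_mapreduce(lines, map_fn, reduce_fn)
print(word_counts)
```

The developer writes only map_fn and reduce_fn; everything inside run_mapreduce stands in for the work that the execution engine takes over.
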
Hadoop (an open-source implementation of MapReduce) was released soon afterwards, and has become widely used in many organizations, including Amazon, eBay, Facebook, and Twitter. In 2007, Microsoft published Dryad, which is a generalization of MapReduce that supports a broader class of algorithms, including relational-style queries with joins and multiple stages.

Most distributed execution engines divide computations into tasks, which are atomic and deterministic fragments of code that run on a single host. The power of an execution engine comes from its ability to track dependencies between tasks, and hence coordinate their execution and data flow. In increasing order of power, these dependency structures include:

leads the CIEL project, which forms the basis of his thesis research into expressive programming models for distributed computation. At other points, his research interests have included OS virtualization, …