Hancock: a language for processing very large-scale data

A signature is an evolving customer profile computed from call records. AT&T uses signatures to detect fraud and to target marketing. Code to compute signatures can be difficult to write and maintain because of the volume of data. We have designed and implemented Hancock, a C-based domain- specific programming language for describing signatures. Hancock provides data abstraction mechanisms to manage the volume of data and control abstractions to facilitate looping over records. This paper describes the design and implementation of Hancock, discusses early experiences with the language, and describes our design process.