Data Cyberinfrastructure for End-to-End Science

Large-scale scientific facilities provide a broad community of researchers and educators with open access to instrumentation and to data products generated by geographically distributed instruments and sensors. This paper discusses key architectural design, deployment, and operational aspects of a production cyberinfrastructure for acquiring, processing, and delivering data from large scientific facilities, drawing on experience from the National Science Foundation's Ocean Observatories Initiative. It also outlines new models for data delivery and opportunities for insight across a wide range of scientific and engineering domains as the volume and variety of facility data grow.