A framework for reading and unifying heliophysics time series data

We describe a framework designed to simplify the acquisition and integration of data from multiple, diversely formatted, geographically distributed science data sets. Our domain is Heliophysics, where measurements of magnetic fields, plasmas, and charged particles are often made in situ and the data are made available in relatively low-volume data sets consisting of time series tables. Data format diversity has proven to be a significant barrier to the kind of integrated, multi-mission analysis that is now very important in Heliophysics. We have therefore developed a Java framework capable of reading, interpreting, and providing uniform access to the science content of any distributed time series data set. The framework exposes data only through fully abstract interfaces that represent data content while hiding all access details, such as file format, data file granularity, and access protocols. Specialized interfaces also represent measurement-specific details, so that the framework enables data sets to be recast into scientifically interoperable representations. The context of our efforts is an increasingly distributed Heliophysics data environment that employs a collection of discipline-specific Virtual Observatories (VOs), each providing data search and retrieval services for one Heliophysics sub-discipline. Our framework is bundled in a library that will ultimately serve as a universal reader for Heliophysics data, solving the formats problem and serving as key infrastructure for advanced, science-sensitive data manipulation services.
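
To illustrate what exposing data only through abstract interfaces can look like, the following is a minimal Java sketch and not the framework's actual API. Every name in it (TimeSeries, IsoCsvSeries, EpochTableSeries, TimeSeriesSketch) and both record layouts are hypothetical, chosen only to show two differently formatted sources being read through one interface.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Demo entry point; client code here sees only the TimeSeries interface. */
final class TimeSeriesSketch {
    /** Mean of all values; works for any source format behind the interface. */
    static double mean(TimeSeries ts) {
        double sum = 0.0;
        for (int i = 0; i < ts.size(); i++) {
            sum += ts.value(i);
        }
        return ts.size() == 0 ? Double.NaN : sum / ts.size();
    }

    public static void main(String[] args) {
        // The same two samples, stored in two different (hypothetical) formats.
        TimeSeries a = new IsoCsvSeries(
                List.of("2008-01-01T00:00:00Z,4.0", "2008-01-01T00:01:00Z,6.0"));
        TimeSeries b = new EpochTableSeries(
                List.of("1199145600 4.0", "1199145660 6.0"));
        System.out.println("mean(a) = " + mean(a));
        System.out.println("mean(b) = " + mean(b));
        System.out.println("same start time: " + a.time(0).equals(b.time(0)));
    }
}

/** Hypothetical abstract view of one scalar time series (an assumption, not the real API). */
interface TimeSeries {
    int size();
    Instant time(int i);
    double value(int i);
}

/** Hypothetical reader for comma-separated "ISO-8601 timestamp,value" records. */
final class IsoCsvSeries implements TimeSeries {
    private final List<Instant> times = new ArrayList<>();
    private final List<Double> values = new ArrayList<>();

    IsoCsvSeries(List<String> lines) {
        for (String line : lines) {
            String[] parts = line.split(",");
            times.add(Instant.parse(parts[0].trim()));
            values.add(Double.parseDouble(parts[1].trim()));
        }
    }

    public int size() { return times.size(); }
    public Instant time(int i) { return times.get(i); }
    public double value(int i) { return values.get(i); }
}

/** Hypothetical reader for whitespace-separated "epochSeconds value" records. */
final class EpochTableSeries implements TimeSeries {
    private final long[] seconds;
    private final double[] values;

    EpochTableSeries(List<String> lines) {
        seconds = new long[lines.size()];
        values = new double[lines.size()];
        for (int i = 0; i < lines.size(); i++) {
            String[] parts = lines.get(i).trim().split("\\s+");
            seconds[i] = Long.parseLong(parts[0]);
            values[i] = Double.parseDouble(parts[1]);
        }
    }

    public int size() { return seconds.length; }
    public Instant time(int i) { return Instant.ofEpochSecond(seconds[i]); }
    public double value(int i) { return values[i]; }
}
```

In this sketch the mean method never learns which format backs the series it receives. A measurement-specific interface, for example one that adds units or a coordinate frame for magnetic field data, could extend TimeSeries in the same way.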