Ensuring Long-Term Access to Remotely Sensed Data With Layout Maps

The Hierarchical Data Format (HDF) has been a data format standard in the National Aeronautics and Space Administration's (NASA's) Earth Observing System Data and Information System since the 1990s. Its rich structure, platform independence, full-featured application programming interface (API), and internal compression make it well suited to archiving science data and to working with those data through a rich set of software tools. However, a key drawback for long-term archiving is the complex internal byte layout of HDF files, which requires one to use the API to access HDF data. As a result, the long-term readability of HDF data written with a given version depends on a long-term allocation of resources to support that version. Much of the data from NASA's Earth Observing System has been archived in HDF Version 4 (HDF4) format. To address the long-term archival issues for these data, a collaborative study between The HDF Group and NASA's Earth Science Data Centers (ESDCs) is underway. One of the first activities was an assessment of the range of HDF4-formatted data held by NASA, both to determine which capabilities of the HDF format are used in practice and to estimate the effort required for full implementation across NASA's ESDCs. Based on the results of this assessment, methods for producing a map of the layout of the HDF4 files held by NASA were prototyped using a markup-language-based HDF tool. The resulting maps allow a separate program to read the file without recourse to the HDF API. To verify this, two independent tools based solely on the map files were developed and tested.
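
The idea of reading a file from its layout map alone can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the XML element and attribute names (`map`, `dataset`, `chunk`, `offset`, `nbytes`, `dtype`, `byteorder`), the helper `read_dataset`, and the restriction to uncompressed numeric data are hypothetical and do not reproduce the map schema or the tools developed in the study. The sketch only shows that, once byte offsets, lengths, and data types are recorded in a map, plain file I/O is enough to recover the values.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical layout map. The element and attribute names are
# illustrative assumptions, not the schema produced in the study.
EXAMPLE_MAP = """
<map>
  <dataset name="temperature" dtype="int16" byteorder="big">
    <chunk offset="8" nbytes="6"/>
  </dataset>
</map>
"""

# struct format codes for the few data types this sketch supports.
_FORMATS = {"int16": "h", "int32": "i", "float32": "f", "float64": "d"}


def read_dataset(data_path, map_xml, name):
    """Return the values of dataset `name` using only the map and raw bytes.

    Assumes uncompressed data stored in one or more contiguous byte ranges.
    """
    root = ET.fromstring(map_xml)
    for dataset in root.iter("dataset"):
        if dataset.get("name") != name:
            continue
        code = _FORMATS[dataset.get("dtype")]
        prefix = ">" if dataset.get("byteorder") == "big" else "<"
        item_size = struct.calcsize(prefix + code)
        values = []
        with open(data_path, "rb") as f:
            for chunk in dataset.iter("chunk"):
                # Jump straight to the byte range recorded in the map.
                f.seek(int(chunk.get("offset")))
                raw = f.read(int(chunk.get("nbytes")))
                count = len(raw) // item_size
                values.extend(
                    struct.unpack(f"{prefix}{count}{code}", raw[: count * item_size])
                )
        return values
    raise KeyError(f"dataset {name!r} not found in map")
```

Given a file whose first 8 bytes are header material followed by three big-endian 16-bit integers, `read_dataset(path, EXAMPLE_MAP, "temperature")` returns those three integers as a list. A real reader would also have to handle whatever compression and chunking the map describes, which this sketch deliberately leaves out.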