A Science Data System Architecture for Information Retrieval

Scientific research generates an enormous amount of data that is held in geographically distributed data repositories. These data are often captured and managed without reference to any standard principles of information architecture. Interoperability and efficient search and retrieval of data products across disparate data systems are difficult because users are often required to connect to each individual data system and deal with dissimilar and often unfamiliar interfaces and semantics. Developing software systems that work across organizational and disciplinary boundaries is challenging when the organizing principles underlying the information architecture are not explicitly defined. Likewise, clustering data results across multiple information systems is challenging without a system architecture that provides both the data architecture and the distributed systems architecture, along with the associated standards.