Enhancing the impact of science data toward data discovery and reuse

The amount of data produced in support of scientific research continues to grow rapidly. Despite the accumulation of, and demand for, scientific data, relatively little of it is actually made available to the broader scientific community. We surmise that one root of this problem is the perceived difficulty of electronically publishing scientific data and associated metadata in a way that makes them discoverable. We propose exploiting Semantic Web technologies and best practices to make metadata both discoverable and easy to publish. We share experiences in curating metadata to illustrate the cumbersome nature of data reuse in the current research environment. We also make recommendations, with a real-world example, on how data publishers can provide their metadata by adding limited additional markup to HTML pages on the Web. With little additional effort from data publishers, the difficulty of data discovery, access, and sharing can be greatly reduced and the impact of research data greatly enhanced.
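To give a sense of what "limited additional markup" might look like, the following is a minimal sketch rather than the example used in the paper. It assumes RDFa Lite attributes (`vocab`, `typeof`, `property`) together with the schema.org `Dataset` type; the helper name `dataset_rdfa`, the dataset values, and the `example.org` URL are hypothetical and serve only as illustration.

```python
from html import escape
from typing import Sequence


def dataset_rdfa(name: str, description: str, url: str,
                 keywords: Sequence[str]) -> str:
    """Render dataset metadata as an HTML fragment with RDFa Lite attributes.

    Hypothetical helper: the property names (name, description, url,
    keywords) are schema.org terms; everything else is illustrative.
    """
    joined_keywords = ", ".join(keywords)
    return (
        # vocab sets the default vocabulary; typeof declares the item type.
        '<div vocab="https://schema.org/" typeof="Dataset">\n'
        f'  <h2 property="name">{escape(name)}</h2>\n'
        f'  <p property="description">{escape(description)}</p>\n'
        f'  <a property="url" href="{escape(url, quote=True)}">Download</a>\n'
        # meta carries machine-readable metadata with no visible text.
        f'  <meta property="keywords" content="{escape(joined_keywords, quote=True)}">\n'
        '</div>'
    )


# Hypothetical dataset, for illustration only.
fragment = dataset_rdfa(
    name="Example groundwater samples",
    description="Tracer concentrations from a hypothetical field site.",
    url="https://example.org/data/groundwater.csv",
    keywords=["groundwater", "tracer"],
)
print(fragment)
```

The page a human reader sees is unchanged apart from the added attributes, while an RDFa-aware crawler could extract the name, description, download URL, and keywords of the dataset from the same HTML.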
