DataUp: Further Development and Community Building

SUMMARY OF PROPOSED WORK

Project Summary: Scientific data are increasingly born digital or are digitized early in the research process. Despite rapid growth in digital data, researchers rarely receive instruction in good data management practices. As a result, they often patch together idiosyncratic systems for data organization and documentation, which can harm the long-term usability of their data. Demand for formalized data stewardship practices is rising because funders and journals now encourage or mandate data management plans, data sharing, or both.

Recognizing that most Earth, environmental, and ecological scientists use spreadsheets at some point in the life cycles of their data, the California Digital Library (CDL) and its partners set out to create a tool for Microsoft Excel that would encourage and enable good data stewardship practices. The result, DataUp, facilitates documenting, managing, and archiving tabular scientific data. Since its launch in fall 2012, response to DataUp has been enthusiastic. The CDL has received inquiries about DataUp from many repositories, organizations, and publishers interested in configuring the tool for their needs. These groups are most interested in customizing DataUp for their user communities, which requires well-documented code, a familiar development platform, and an open API with enough documentation for developers to use it. The CDL would like to see DataUp development move forward in order to capitalize on the potential opportunities for DataONE and beyond.

We propose a 12-month project to further develop DataUp. The tool is useful as-is, but it has not reached its full potential for facilitating data management, sharing, and archiving for researchers across disciplines.
DataUp has the potential to become a key tool in research data sharing and archiving as envisioned by the NSF DataNet program. To that end, our major project goals for DataUp are to (1) enhance the tool’s user experience and add features, and (2) build the open-source community around DataUp. We plan to improve DataUp through an iterative development process with community feedback and input. This community will include the existing DataUp user community, as well as researchers and information professionals from the University of California and DataONE.

Intellectual merit: DataUp will have a transformative effect on protecting the global scholarly community’s investment in the long tail of research data. Much of this data is recorded in spreadsheets and produced in disciplines that have no organized approach to sharing and archiving and have limited resources to do so (including citizen science groups). DataUp is the first tool that demonstrates such promise, and as the group that envisioned and built the original tool, the California Digital Library is uniquely qualified to fulfill that promise. In addition, working with DataONE provides the best path toward a positive impact in the DataNet community.

Broader impacts: DataUp’s repository- and discipline-agnostic design fosters an impact far beyond the Earth, environmental, and ecological sciences. Advancing data management and archiving practices in all disciplines will result in a more open scientific process, with readily available datasets that facilitate the progress of research. This has immeasurable benefits for society at large.