As data collections become established in key disciplines, some of the longstanding barriers to data sharing begin to dissolve, yet others remain. While metadata and ontologies help overcome the problems of finding and interpreting data, the lack of clarity over licensing remains a real impediment to data reuse. Freedom from legal restriction and uncertainty is essential for effectively sharing, combining and deriving data from these distributed collections. Reuse and recombination of data will be greatly facilitated by expanding the definition of the semantic web to include the semantics of data licensing. We aim to express licensing terms in a computable manner, within the context of research practice, enabling us to infer the rights, obligations and conditions inherited by datasets that are derived or recombined from sources under a mix of licenses. Building on this, we aim to simulate the effects of varying licensing practices within communities, and we propose a measure of the health of the scholarly record based on the compatibility and restrictiveness of the licenses it contains.
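To make the idea of computable licensing terms concrete, the following is a minimal sketch and not the method of this work. It assumes a simplified model in which a license is three sets of terms, and an assumed inheritance rule: a derived dataset keeps only the permissions granted by every source, and accumulates the obligations and prohibitions of all of them. The `License` class, the `combine` and `restrictiveness` functions, the term names and the three example licenses are all hypothetical. The sketch also ignores harder cases such as mutually incompatible share-alike clauses.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class License:
    """Hypothetical, simplified model of a data license."""
    name: str
    permissions: frozenset   # actions the licensee may perform
    obligations: frozenset   # duties attached to reuse
    prohibitions: frozenset  # actions the licensee may not perform


def combine(a: License, b: License) -> License:
    """Infer the terms inherited by a dataset derived from sources under a and b.

    Assumed rule: permissions intersect, obligations and prohibitions union,
    and anything prohibited by either source is removed from the permissions.
    """
    prohibitions = a.prohibitions | b.prohibitions
    return License(
        name=f"{a.name} + {b.name}",
        permissions=(a.permissions & b.permissions) - prohibitions,
        obligations=a.obligations | b.obligations,
        prohibitions=prohibitions,
    )


def restrictiveness(lic: License) -> int:
    """Toy score (an assumption): number of obligations plus prohibitions."""
    return len(lic.obligations) + len(lic.prohibitions)


# Illustrative licenses; the term names are placeholders, not a real vocabulary.
ALL_USES = frozenset({"reproduce", "distribute", "derive", "commercial_use"})
PUBLIC_DOMAIN = License("PD", ALL_USES, frozenset(), frozenset())
ATTRIBUTION = License("BY", ALL_USES, frozenset({"attribute"}), frozenset())
NONCOMMERCIAL = License(
    "BY-NC",
    frozenset({"reproduce", "distribute", "derive"}),
    frozenset({"attribute"}),
    frozenset({"commercial_use"}),
)

# A dataset recombined from three sources inherits the folded terms.
derived = reduce(combine, [PUBLIC_DOMAIN, ATTRIBUTION, NONCOMMERCIAL])
```

Under these assumptions, `derived` may be reproduced, distributed and used for further derivation, must carry attribution, and may not be used commercially. Comparing `restrictiveness(derived)` with the scores of its sources shows how restrictions accumulate as data are recombined, which is the kind of quantity a community-level measure could aggregate.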