The wider concept of data sharing: view from the BMJ.

The BMJ is now asking research authors to include a data sharing statement to explain which additional data from their study, if any, are available, to whom, and how. These data may range from supplementary material to the complete data set, and they may be made available only on request, accessible online with a password, or openly accessible to all on the web with a link on bmj.com, perhaps after a clearly stated period of personal use (Groves, 2009). The response has been slow, with only a handful of authors so far providing greater access to the data underpinning BMJ papers. For instance, the meta-analysis of Law and others (2009) of 147 randomized trials of antihypertensive drugs to prevent cardiovascular disease ends with this statement: “Data sharing: An audit trail of the forest plots and related data is available at www.wolfson.qmul.ac.uk/bptrial,” and for the case control study of Garcia-Garcia and others (2009) on partial protection with seasonal trivalent inactivated vaccine against influenza A/H1N1 in Mexico City, the statement says “Data sharing: The technical appendix, statistical code, and dataset are available from jvaldespinog@birmex.gob.mx.” It is a start, and even having the negative statement “no additional data available” at the end of a paper sends a clear signal that the journal wants authors to consider and embrace the concept. We will continue to debate and campaign on this. The BMJ is emulating the reproducible research statements of Annals of Internal Medicine (Laine and others, 2007), which were in turn inspired by the policy of the American Journal of Epidemiology (Peng and others, 2006). But we are not interested in data sharing only as a means to evaluate scientific claims or guard against misconduct (including failure to disclose financial interests; Allison, 2009).
Data sharing after publication of the top-line results is a natural and important extension of the BMJ’s philosophy of open access and transparency (scientific, ethical, and financial) in reporting medical research, and we believe that it has great potential to increase the understanding and use of information that is largely funded with public money. There are challenges, of course. Sharing raises important questions about data ownership (Delamothe, 1996) and permission for data release; technical issues of data storage, management, compatibility, archiving, access, and mining; and concerns about who should have access and when, and what limits may be needed to prevent misuse and mishandling of data, a caveat that Keiding (2010) rightly points out in this issue. Moreover, researchers lack incentives to analyze all the data that they generate, to manage data after funded projects have ended, and to share data other than informally with certain collaborators. And for medical researchers, sharing of clinical research data is a particularly tricky concept because the combination of clinical and personal data and the geographical location of a study can be enough to reveal a research participant’s identity. Hence, clinical research data need to be anonymized carefully before sharing and, in future, patients should be asked up front for consent to data sharing as well as consent to take part in the research (Hrynaszkiewicz and Altman, 2009).