Enhancing reproducibility and collaboration via management of R package cohorts
Science depends on collaboration, result reproduction, and the development of supporting software tools. Each of these requires careful management of software versions. We present a unified model for installing, managing, and publishing software contexts in R. It introduces the package manifest as a central data structure for representing version-specific, decentralized package cohorts. The manifest points to package sources on arbitrary hosts and in various forms, including tarballs and directories under version control. We provide a high-level interface for creating and switching between side-by-side package libraries derived from manifests. Finally, we extend package installation to support the retrieval of exact package versions as indicated by manifests, and to maintain provenance for installed packages. The provenance information enables the user to publish libraries or sessions as manifests, thereby completing the loop between publication and deployment. We have implemented this model across two software packages, switchr and GRANbase, and have released the source code under the Artistic 2.0 license.
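The loop between publication and deployment described above can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the switchr or GRANbase API: the names `ManifestEntry`, `install_from_manifest`, and `library_to_manifest`, the package names, and the example.org URLs are all hypothetical, and installation is only simulated by recording provenance in a dictionary.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass(frozen=True)
class ManifestEntry:
    """One row of a hypothetical package manifest."""
    name: str         # package name
    version: str      # exact version to retrieve
    source_type: str  # form of the source, e.g. "tarball" or "git"
    url: str          # where the source lives (arbitrary host)


def install_from_manifest(manifest: List[ManifestEntry],
                          library_path: str) -> Dict[str, dict]:
    """Simulate installation into a side-by-side library.

    Nothing is downloaded; each entry is stored as a provenance
    record keyed by package name.
    """
    library = {}
    for entry in manifest:
        record = asdict(entry)
        record["library"] = library_path  # remember which library holds it
        library[entry.name] = record
    return library


def library_to_manifest(library: Dict[str, dict]) -> List[ManifestEntry]:
    """Publish a library as a manifest using its provenance records."""
    return [
        ManifestEntry(rec["name"], rec["version"],
                      rec["source_type"], rec["url"])
        for _, rec in sorted(library.items())
    ]


# Hypothetical cohort: one tarball and one version-controlled directory.
manifest = [
    ManifestEntry("pkgA", "1.2.0", "tarball",
                  "https://example.org/src/pkgA_1.2.0.tar.gz"),
    ManifestEntry("pkgB", "0.9.1", "git",
                  "https://example.org/git/pkgB.git"),
]

library = install_from_manifest(manifest, "/tmp/libs/analysis-2015")
republished = library_to_manifest(library)
```

Because every installed package keeps a record of where it came from and at which version, the republished manifest carries the same entries as the one used for installation, which is the property that lets a collaborator rebuild the same cohort.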