Summary: The move of computational genomics workflows to cloud computing platforms brings a new level of integration and interoperability that challenges existing data representation formats. The Variant Call Format (VCF) is in a particularly sensitive position in that regard, with both clinical and consumer‐facing analysis tools relying on this self‐contained description of genomic variation in Next Generation Sequencing (NGS) results. In this report we identify an isomorphic map between VCF and the Resource Description Framework (RDF), the reference model advanced by the World Wide Web Consortium (W3C) to enable representations of linked data that are both distributed and discoverable. The resulting ability to decompose VCF reports of genomic variation without loss of context addresses the need to modularize and govern NGS pipelines for Precision Medicine. Specifically, it provides the flexibility (i.e. the indexing) needed to support the wide variety of clinical scenarios and patient‐facing governance where only part of the VCF data is relevant.

Availability and Implementation: Software libraries that claim to be both domain‐facing and consumer‐facing have to pass the test of portability across the variety of devices that those consumers actually adopt. Ideally, the implementation should itself take place within the space defined by web technologies. Consequently, the isomorphic mapping function was implemented in JavaScript and tested in a variety of environments and devices, client and server side alike. These range from web browsers on mobile phones to the most popular microservice platform, Node.js. The code is publicly available at https://github.com/ibl/VCFr, with a live deployment at http://ibl.github.io/VCFr/.

Contact: jonas.almeida@stonybrookmedicine.edu
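
As an illustration of what decomposing a VCF record "without loss of context" could look like, the following is a minimal JavaScript sketch that turns one tab-delimited data line into subject-predicate-object triples and then rebuilds the line from them. It is not the VCFr implementation: the function names `vcfLineToTriples` and `triplesToVcfLine`, the `http://example.org/vcf/` namespace, and the one-triple-per-column layout are all assumptions made for this example, and the sketch assumes that the number of fields in the line matches the header.

```javascript
// Hypothetical sketch, not the VCFr API: names and namespace are assumed.
const BASE = 'http://example.org/vcf/'; // placeholder namespace for illustration

// Decompose one VCF data line into [subject, predicate, object] triples.
// Each column becomes one triple about the same row subject, so every
// value keeps its context (which row, which field) when stored separately.
function vcfLineToTriples(header, line, rowIndex) {
  const fields = line.split('\t');
  const subject = `<${BASE}row/${rowIndex}>`;
  return header.map((name, i) => [
    subject,
    `<${BASE}field/${name}>`,
    JSON.stringify(fields[i]), // quoted string literal
  ]);
}

// Inverse direction: rebuild the original line from its triples.
// Triple order does not matter, because fields are looked up by predicate.
function triplesToVcfLine(header, triples) {
  const byPredicate = new Map(triples.map(([, p, o]) => [p, JSON.parse(o)]));
  return header
    .map((name) => byPredicate.get(`<${BASE}field/${name}>`))
    .join('\t');
}

// Illustrative record with the eight fixed VCF columns.
const header = ['CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO'];
const line = '20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14';

const triples = vcfLineToTriples(header, line, 0);
const roundTrip = triplesToVcfLine(header, triples);

console.log(triples.map((t) => t.join(' ') + ' .').join('\n'));
console.log('round trip preserved the line:', roundTrip === line);
```

Because each triple names both its row and its field, any subset of the triples (for example, only the `POS` and `ALT` values) can be shared or indexed on its own, which is the kind of partial use of VCF data that the summary describes.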