Computer Methods and Programs in Biomedicine

BACKGROUND AND OBJECTIVE High-throughput measurement technologies have triggered a rise in large-scale cancer studies containing multiple levels of molecular data. While there are a number of efficient methods to analyze individual data types, far fewer enhance data interpretation after analysis. We present the R package Director, a dynamic visualization approach to linking and interrogating multiple levels of molecular data after analysis for clinically meaningful, actionable insights.

METHODS Sankey diagrams are traditionally used to represent quantitative flows through multiple, distinct events. Regulation can be interpreted as a flow of biological information through a series of molecular interactions. Functions in Director introduce novel drawing capabilities to make Sankey diagrams robust to a wide range of quantitative measures and to depict molecular interactions as regulatory cascades. The package streamlines diagram creation: it takes quantitative measurements as input, with nodes representing molecules of interest and paths representing the interaction strength between two molecules.

RESULTS Director's utility is demonstrated with quantitative measurements of candidate microRNA-gene networks identified in an ovarian cancer dataset. A recent study reported eight miRNAs as master regulators of signature genes in epithelial-mesenchymal transition (EMT). The Sankey diagrams generated with data from this study further the interpretation of the miRNAs' roles by revealing potential co-regulatory behavior in the extracellular matrix (ECM). An additional analysis identified 32 genes differentially expressed between good- and poor-prognosis patients in four significant pathways (FDR ≤ 0.1), three of which support a complementary role of the ECM in ovarian cancer. The resulting diagram created with Director suggests that elevated levels of COL11A1, INHBA, and THBS2 (a signature feature of metastasis [1]) and decreased levels of their targeting miRNAs define poor prognosis.

CONCLUSION We have demonstrated a visualization approach suitable for implementation in an analysis workflow, linking multiple levels of molecular data to gain a novel perspective on candidate biomarkers in a complex disease. The diagrams are dynamic, easily replicable, and rendered locally as HTML files to facilitate sharing. The R package Director is simple to use and available for all operating systems through Bioconductor (http://bioconductor.org/packages/Director) and GitHub (http://kzouchka.github.io/Director).
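
The node-and-path input described under METHODS can be illustrated with a small data structure. The following is a minimal Python sketch under stated assumptions, not Director's R API: the function `build_sankey`, the `Link` class, the `max_width` parameter, and all molecule names and numbers are hypothetical and chosen only for illustration. It maps signed interaction strengths to path widths proportional to their absolute value, so that negative measures (for example, an inverse miRNA-gene relationship) can still be drawn as flows.

```python
from dataclasses import dataclass


@dataclass
class Link:
    source: int    # index of the regulating molecule in the node list
    target: int    # index of the regulated molecule in the node list
    value: float   # signed interaction strength (sign kept, e.g. for coloring)
    width: float   # drawn path width, proportional to |value|


def build_sankey(interactions, node_values, max_width=10.0):
    """Turn (source, target, strength) triples into Sankey nodes and links.

    All names here are hypothetical; this is not the Director package API.
    """
    names, index = [], {}
    # Assign each molecule a node index in order of first appearance.
    for src, tgt, _ in interactions:
        for name in (src, tgt):
            if name not in index:
                index[name] = len(names)
                names.append(name)

    # Scale widths by the largest absolute strength so signs do not matter.
    largest = max((abs(v) for _, _, v in interactions), default=0.0)
    links = []
    for src, tgt, v in interactions:
        width = max_width * abs(v) / largest if largest else 0.0
        links.append(Link(index[src], index[tgt], v, width))

    # Each node carries its own quantitative measure (0.0 if none is given).
    nodes = [{"name": n, "value": node_values.get(n, 0.0)} for n in names]
    return nodes, links


# Made-up example: two miRNAs targeting two genes.
interactions = [
    ("miRNA_a", "gene_x", -0.8),
    ("miRNA_a", "gene_y", -0.4),
    ("miRNA_b", "gene_x", -0.2),
]
node_values = {"miRNA_a": -1.5, "miRNA_b": -0.7, "gene_x": 2.0, "gene_y": 1.1}

nodes, links = build_sankey(interactions, node_values)
for link in links:
    print(nodes[link.source]["name"], "->", nodes[link.target]["name"],
          "strength", link.value, "width", link.width)
```

Keeping the signed value alongside the scaled width is one possible design choice: width conveys magnitude, while the sign remains available for a color scale that distinguishes repression from activation.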

[1] Kwan-Liu Ma, et al. A novel tool for visualizing chronic kidney disease associated polymorbidity: a 13-year cohort study in Taiwan, 2015, J. Am. Medical Informatics Assoc.

[2] Scott Chamberlain, et al. Create Interactive Web Graphics via Plotly's JavaScript Graphing Library, 2015.

[3] Christopher Gandrud, et al. D3 JavaScript Network Graphs from R, 2015.

[4] L. Lim, et al. MicroRNA targeting specificity in mammals: determinants beyond seed pairing, 2007, Molecular cell.

[5] B. Karlan, et al. A Collagen-Remodeling Gene Signature Regulated by TGF-β Signaling Is Associated with Metastasis and Poor Survival in Serous Ovarian Cancer, 2013, Clinical Cancer Research.

[6] Benjamin J. Raphael, et al. Integrated Genomic Analyses of Ovarian Carcinoma, 2011, Nature.

[7] Aleix Prat Aparicio. Comprehensive molecular portraits of human breast tumours, 2012.

[8] Mihaela Zavolan, et al. Quantifying the strength of miRNA-target interactions, 2015, Methods.

[9] S. Knox. From 'omics' to complex disease: a systems biology approach to gene-environment interactions in cancer, 2010, Cancer Cell International.

[10] Il-man Kim, et al. Regulation of Metastasis by microRNAs in Ovarian Cancer, 2014, Front. Oncol.

[11] J. Watkinson, et al. Multi-cancer computational analysis reveals invasion-associated variant of desmoplastic reaction involving INHBA, THBS2 and COL11A1, 2010, BMC Medical Genomics.

[12] Benjamin Haibe-Kains, et al. Quantitative assessment and validation of network inference methods in bioinformatics, 2014, Front. Genet.

[13] Jeffrey Heer, et al. [title garbled in source], 2011.

[14] Daniel A. Keim, et al. Information Visualization and Visual Data Mining, 2002, IEEE Trans. Vis. Comput. Graph.

[15] J. McPherson, et al. Coming of age: ten years of next-generation sequencing technologies, 2016, Nature Reviews Genetics.

[16] J. Uhm. Comprehensive genomic characterization defines human glioblastoma genes and core pathways, 2009.

[17] Robert S Mannel, et al. Phase II evaluation of pemetrexed in the treatment of recurrent or persistent platinum-resistant ovarian or primary peritoneal carcinoma: a study of the Gynecologic Oncology Group, 2009, Journal of clinical oncology : official journal of the American Society of Clinical Oncology.

[18] Xia Li, et al. Mir-509-5p joins the Mdm2/p53 feedback loop and regulates cancer cell growth, 2014, Cell Death and Disease.

[19] Raphael Gottardo, et al. Orchestrating high-throughput genomic analysis with Bioconductor, 2015, Nature Methods.

[20] Jeffrey Heer, et al. D³ Data-Driven Documents, 2011, IEEE Transactions on Visualization and Computer Graphics.

[21] Teresa M. Przytycka, et al. Chapter 5: Network Biology Approach to Complex Diseases, 2012, PLoS Comput. Biol.

[22] Bernhard O. Palsson, et al. Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways, 2015, PLoS Comput. Biol.

[23] Mario Schmidt, et al. The Sankey Diagram in Energy and Material Flow Management, 2008.

[24] R Core Team, et al. R: A language and environment for statistical computing, 2014.

[25] Burton B. Yang, et al. miRNAs regulate expression and function of extracellular matrix molecules, 2013, Matrix biology : journal of the International Society for Matrix Biology.

[26] Steven J. M. Jones, et al. Comprehensive molecular portraits of human breast tumors, 2012, Nature.

[27] Héctor Corrada Bravo, et al. Epiviz: interactive visual analytics for functional genomics data, 2014, Nature Methods.

[28] Kwan-Liu Ma, et al. A richly interactive exploratory data analysis and visualization tool using electronic medical records, 2015, BMC Medical Informatics and Decision Making.

[29] Sheila M. Reynolds, et al. Integrated analyses identify a master microRNA regulatory network for the mesenchymal subtype in serous ovarian cancer, 2013, Cancer cell.

[30] V. Marx. Visualizing epigenomic data, 2015, Nature Methods.