Construction and Visualization of Dynamic Biological Networks: Benchmarking the Neo4J Graph Database

Genome analysis is a major precondition for future advances in the life sciences. The complex organization of genome data and the interactions between genomic components can often be modeled and visualized as graph structures. In this paper, we propose integrating several data sets into a graph database. We assess the suitability of the database system for the analysis and visualization of a genome regulatory network (GRN) by running a benchmark on it. Major advantages of using a database system are the modifiability of the data set, the immediate visualization of query results, and built-in indexing and caching features.
