Bipartite Graphs for Visualization Analysis of Microbiome Data

Visualization analysis plays an important role in metagenomics research. Proper and clear visualization helps researchers gain their first insights into the data and, by selecting different features, reveal and highlight hidden relationships and draw conclusions. To prevent the resulting presentations from becoming chaotic, visualization techniques have to cope with the high dimensionality of microbiome data. Although a number of methods based on dimensionality reduction, correlations, Venn diagrams, and network representations have already been published, there is still room for improvement, especially in techniques that allow visual comparison of several environments or of developmental stages within one environment. In this article, we represent microbiome data by bipartite graphs, where one partition stands for taxa and the other stands for samples. We demonstrate that community detection is independent of taxonomical level. Moreover, focusing on higher taxonomical levels and merging samples appropriately greatly improves graph organization and makes our presentations clearer than other graph and network visualizations. Capturing labels in the vertices also makes it possible to compare two or more microbial communities clearly by showing their common and unique parts.
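The bipartite representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abundance table, the sample and taxon names, the `min_count` threshold, and both helper functions are hypothetical and chosen only for illustration. Each sample vertex is connected to each taxon vertex detected in it, and the common and unique parts of two communities follow from comparing the neighbour sets of the two sample vertices.

```python
from collections import defaultdict

# Hypothetical toy abundance table: sample -> {taxon: read count}.
counts = {
    "sample_A": {"Firmicutes": 120, "Bacteroidetes": 80, "Proteobacteria": 5},
    "sample_B": {"Firmicutes": 60, "Actinobacteria": 30, "Proteobacteria": 0},
}


def build_bipartite(counts, min_count=1):
    """Build a bipartite graph with samples on one side and taxa on the other.

    Returns a list of weighted edges (sample, taxon, count) and a mapping
    from each sample to the set of taxa it is connected to. The min_count
    threshold is an assumption of this sketch, not a value from the article.
    """
    edges = []
    neighbours = defaultdict(set)
    for sample, taxa in counts.items():
        for taxon, n in taxa.items():
            if n >= min_count:  # no edge when the taxon is absent
                edges.append((sample, taxon, n))
                neighbours[sample].add(taxon)
    return edges, dict(neighbours)


def common_and_unique(neighbours, a, b):
    """Split the taxa of samples a and b into shared, only-a and only-b sets."""
    shared = neighbours[a] & neighbours[b]
    return shared, neighbours[a] - shared, neighbours[b] - shared


edges, neighbours = build_bipartite(counts)
shared, only_a, only_b = common_and_unique(neighbours, "sample_A", "sample_B")
print("shared:", sorted(shared))
print("only sample_A:", sorted(only_a))
print("only sample_B:", sorted(only_b))
```

The edge list produced this way could be exported to a graph tool for layout and community detection; the sketch itself covers only the construction of the two partitions and the comparison of two samples.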
