Visualizing ’omic feature rankings and log-ratios using Qurro

Many tools for dealing with compositional “’omics” data produce feature-wise values that can be ranked in order to describe features’ associations with some sort of variation. These values include differentials (which describe features’ associations with specified covariates) and feature loadings (which describe features’ associations with variation along a given axis in a biplot). Although prior work has discussed the use of these “rankings” as a starting point for exploring the log-ratios of particularly high- or low-ranked features, such exploratory analyses have previously relied on custom code to visualize the feature rankings and the log-ratios of interest. This approach is laborious and error-prone, and it raises questions about reproducibility. To address these problems, we introduce Qurro, a tool that interactively visualizes a plot of feature rankings (a “rank plot”) alongside a plot of selected features’ log-ratios within samples (a “sample plot”). Qurro’s interface includes various controls that allow users to select features from along the rank plot to compute a log-ratio; this action updates both the rank plot (by highlighting the selected features) and the sample plot (by displaying the current log-ratio for each sample). Here we demonstrate how this unique interface helps users explore feature rankings and log-ratios simply and effectively.
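
To make the idea of a log-ratio of selected features concrete, the following is a minimal sketch in plain Python. It is not Qurro's code or API. The function name `log_ratio_per_sample`, the toy data, and the choice to sum the counts of the selected features before taking the natural logarithm are all assumptions made for illustration, as is the decision to leave the log-ratio undefined (`None`) for samples in which either side of the ratio sums to zero.

```python
import math


def log_ratio_per_sample(counts, ranks, k=1):
    """Hypothetical helper: log-ratio of the k highest-ranked features
    over the k lowest-ranked features, computed within each sample.

    counts: dict mapping feature ID -> dict of sample ID -> count
    ranks:  dict mapping feature ID -> ranking value (e.g. a differential)
    """
    if 2 * k > len(ranks):
        raise ValueError("k is too large: numerator and denominator would overlap")

    # Order features from lowest-ranked to highest-ranked.
    ordered = sorted(ranks, key=ranks.get)
    denominator = ordered[:k]   # k lowest-ranked features
    numerator = ordered[-k:]    # k highest-ranked features

    samples = list(next(iter(counts.values())))
    result = {}
    for sample in samples:
        top = sum(counts[f][sample] for f in numerator)
        bottom = sum(counts[f][sample] for f in denominator)
        # Assumption: the log-ratio is undefined when either sum is zero.
        if top > 0 and bottom > 0:
            result[sample] = math.log(top / bottom)
        else:
            result[sample] = None
    return result


# Toy data (made up for this sketch): four features, three samples.
ranks = {"F1": -2.0, "F2": -0.5, "F3": 0.3, "F4": 1.8}
counts = {
    "F1": {"S1": 10, "S2": 0, "S3": 5},
    "F2": {"S1": 3, "S2": 8, "S3": 1},
    "F3": {"S1": 6, "S2": 2, "S3": 9},
    "F4": {"S1": 40, "S2": 7, "S3": 5},
}

ratios = log_ratio_per_sample(counts, ranks, k=1)
for sample, value in ratios.items():
    print(sample, value)
```

In this sketch, the numerator and denominator play the role of the features highlighted on a rank plot, and the returned per-sample values are what a sample plot would display.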
