StructuRly: A novel shiny app to produce comprehensive, detailed and interactive plots for population genetic analysis

Population genetics focuses on the analysis of genetic differences within and between groups of individuals and on the inference of population structure. These analyses are usually carried out using Bayesian clustering or maximum likelihood estimation algorithms that assign individuals to a given population depending on specific genetic patterns. Although several tools have been developed to perform population genetic analysis, their standard graphical outputs lack interactivity and complete information, and may therefore not be sufficiently informative for users. StructuRly aims to resolve this problem by offering a complete environment for population analysis. In particular, StructuRly combines the statistical power of the R language with the user-friendly interfaces of the shiny libraries to provide a novel tool for performing population clustering, evaluating several genetic indices, and comparing results. Moreover, its graphical representations are interactive and can be easily customized. StructuRly is available both as an R package on GitHub, with detailed information on its installation and use, and as a shinyapps.io server for users who are not familiar with R and the RStudio IDE. The application has been tested on Linux, macOS and Windows operating systems and can be launched as a shiny app in any web browser.
