Mayday - integrative analytics for expression data

BackgroundDNA Microarrays have become the standard method for large scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data makes visual data exploration ever more important. Fast deployment of new methods as well as a combination of predefined, easy to apply methods with programmer's access to the data are important requirements for any analysis framework. Mayday is an open source platform with emphasis on visual data exploration and analysis. Many built-in methods for clustering, machine learning and classification are provided for dissecting complex datasets. Plugins can easily be written to extend Mayday's functionality in a large number of ways. As Java program, Mayday is platform-independent and can be used as Java WebStart application without any installation. Mayday can import data from several file formats, database connectivity is included for efficient data organization. Numerous interactive visualization tools, including box plots, profile plots, principal component plots and a heatmap are available, can be enhanced with metadata and exported as publication quality vector files.ResultsWe have rewritten large parts of Mayday's core to make it more efficient and ready for future developments. Among the large number of new plugins are an automated processing framework, dynamic filtering, new and efficient clustering methods, a machine learning module and database connectivity. Extensive manual data analysis can be done using an inbuilt R terminal and an integrated SQL querying interface. Our visualization framework has become more powerful, new plot types have been added and existing plots improved.ConclusionsWe present a major extension of Mayday, a very versatile open-source framework for efficient micro array data analysis designed for biologists and bioinformaticians. Most everyday tasks are already covered. The large number of available plugins as well as the extension possibilities using compiled plugins and ad-hoc scripting allow for the rapid adaption of Mayday also to very specialized data exploration. Mayday is available at http://microarray-analysis.org.

[1]  Jean YH Yang,et al.  Bioconductor: open software development for computational biology and bioinformatics , 2004, Genome Biology.

[2]  Rainer Breitling,et al.  Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments , 2004, FEBS letters.

[3]  Koji Kadota,et al.  A weighted average difference method for detecting differentially expressed genes from microarray data , 2008, Algorithms for Molecular Biology.

[4]  Yoshihiro Yamanishi,et al.  KEGG for linking genomes to life and the environment , 2007, Nucleic Acids Res..

[5]  Laurie J. Heyer,et al.  Exploring expression data: identification and analysis of coexpressed genes. , 1999, Genome research.

[6]  Lennart Martens,et al.  The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases , 2007, BMC Bioinformatics.

[7]  Benjamin M. Bolstad,et al.  affy - analysis of Affymetrix GeneChip data at the probe level , 2004, Bioinform..

[8]  Terence P. Speed,et al.  A comparison of normalization methods for high density oligonucleotide array data based on variance and bias , 2003, Bioinform..

[9]  M. Gerstein,et al.  RNA-Seq: a revolutionary tool for transcriptomics , 2009, Nature Reviews Genetics.

[10]  Teuvo Kohonen,et al.  Self-Organizing Maps , 2010 .

[11]  T. D. Schneider,et al.  Sequence logos: a new way to display consensus sequences. , 1990, Nucleic acids research.

[12]  Kay Nieselt,et al.  A Framework for Visualization of Microarray Data and Integrated Meta Information , 2005, Inf. Vis..

[13]  W. Liang,et al.  9) TM4 Microarray Software Suite , 2006 .

[14]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[15]  W. Liang,et al.  TM4 microarray software suite. , 2006, Methods in enzymology.

[16]  Kay Nieselt,et al.  The dynamic architecture of the metabolic switch in Streptomyces coelicolor , 2010, BMC Genomics.

[17]  S. P. Lloyd,et al.  Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.

[18]  R. Tibshirani,et al.  Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[19]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[20]  Andreas Tauch,et al.  EMMA 2 – A MAGE-compliant system for the collaborative analysis and integration of microarray data , 2009, BMC Bioinformatics.

[21]  Thomas Mailund,et al.  Rapid Neighbour-Joining , 2008, WABI.

[22]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[23]  B. Barrell,et al.  Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2) , 2002, Nature.

[24]  J. Shendure,et al.  Advanced sequencing technologies: methods and goals , 2004, Nature Reviews Genetics.

[25]  Kay Nieselt,et al.  Mayday-a microarray data analysis workbench , 2006, Bioinform..

[26]  T. Mikkelsen,et al.  Genome-wide maps of chromatin state in pluripotent and lineage-committed cells , 2007, Nature.

[27]  D. Allison,et al.  Microarray data analysis: from disarray to consolidation and consensus , 2006, Nature Reviews Genetics.

[28]  Yingdong Zhao,et al.  Analysis of Gene Expression Data Using BRB-Array Tools , 2007, Cancer informatics.

[29]  Joaquín Dopazo,et al.  GEPAS, a web-based tool for microarray data analysis and interpretation , 2008, Nucleic Acids Res..