Bioinformatics-Driven Big Data Analytics in Microbial Research

With the advent of sophisticated, high-end molecular biology technologies, microbial research has seen a tremendous boom and has become one of the most prominent sources of “big data.” This growth is driven by the large volumes of data coming from experimental platforms such as whole-genome sequencing projects, microarray technologies, single nucleotide polymorphism (SNP) mapping, proteomics, metabolomics, and phenomics programs. Bioinformatics has emerged as a major platform for the analysis, interpretation, comparison, storage, archival, and utilization of this wealth of information, and for solving the data management problems of microbial research. In this chapter, the authors give an account of the “big data” resources spread across the microbial research domain, the efforts being made to generate “big data,” the computational resources that facilitate analysis and interpretation, and the future needs for storing, interpreting, and managing huge volumes of biological data.
