Databases and Bioinformatics Tools for Data Mining

Data, information, and knowledge play an important role in human life. Recent technological developments have generated huge repositories of data, which demand novel tools and techniques that can retrieve the most relevant information. Data mining is a knowledge discovery technique that extracts useful information from heterogeneous biological data by employing machine learning, artificial intelligence, and decision-making techniques. In this chapter, the authors examine how data mining approaches have revolutionized biological research. Data mining is discussed in brief, including its applications in bioinformatics. The chapter also illustrates some of the emerging problems and opportunities for data mining in bioinformatics.
