BioWare: A framework for bioinformatics data retrieval, annotation and publishing

In-depth analysis of a specific subject in molecular biology, particularly one concerning the structural and functional properties of a particular group of sequences, typically requires access to an extensive knowledge base. The knowledge base may take the form of a specialist database or subject-specific data warehouse (SSDW), which facilitates the organisation of specialized data and the extraction of new knowledge. SSDWs are particularly useful for data mining and knowledge discovery processes that require relevant information from multiple data sources. Constructing a specialist database is a multistep process that typically involves enrichment of annotations (by domain experts), development and integration of analytical tools (by computer programmers), and construction of the system (by database experts). SSDWs contain focused subsets of data compiled from multiple data sources and enriched with user annotations. In this article we present the BioWare system, which enables its users to collect, annotate, publish, and update specialized molecular data in personalized WWW-accessible databases. BioWare contains four data-warehouse-enabling components: (i) BioWare-Retrieve searches and extracts data from selected sources and integrates them into a standardized format, (ii) BioWare-Prep provides a semi-automated mechanism for user-driven cleaning, preliminary analysis and annotation of the data, (iii) TEMPLAR enables users to rapidly create searchable WWW-accessible SSDWs, and (iv) BioWare-Update enables incremental updating of the SSDWs with new data from the sources. We have used the BioWare system to create and maintain several bioinformatics databases.
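The retrieve, prepare and update stages described above can be illustrated with a minimal sketch. This is not BioWare's code: the `Record` type, the function names, the input field names (`id`, `seq`) and the source label are all hypothetical, chosen only to show how entries might be standardized, cleaned and annotated, and then merged incrementally. The publishing step (TEMPLAR) is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical standardized entry shared by all stages."""
    accession: str
    sequence: str
    source: str
    annotations: dict = field(default_factory=dict)


def retrieve(raw_entries, source):
    """Map source-specific entries (assumed keys 'id' and 'seq') to Records."""
    return [
        Record(
            accession=e["id"].strip().upper(),
            sequence=e["seq"].replace(" ", "").upper(),
            source=source,
        )
        for e in raw_entries
    ]


def prep(records, annotate):
    """Drop records with duplicate sequences, then apply user annotations."""
    seen = set()
    cleaned = []
    for r in records:
        if r.sequence in seen:
            continue
        seen.add(r.sequence)
        r.annotations.update(annotate(r))
        cleaned.append(r)
    return cleaned


def update(warehouse, new_records):
    """Add only records whose accession is not yet stored; return the count."""
    added = 0
    for r in new_records:
        if r.accession not in warehouse:
            warehouse[r.accession] = r
            added += 1
    return added


# Illustrative data only; accessions and sequences are made up.
warehouse = {}
first = retrieve(
    [{"id": "p001 ", "seq": "acd efg"}, {"id": "P002", "seq": "ACDEFG"}],
    source="source_a",
)
first = prep(first, annotate=lambda r: {"length": len(r.sequence)})
added_first = update(warehouse, first)

later = retrieve(
    [{"id": "P001", "seq": "ACDEFG"}, {"id": "P003", "seq": "MKTLL"}],
    source="source_a",
)
added_later = update(warehouse, later)
```

Keying the warehouse by accession is one simple way to make updates incremental: a later batch only adds entries that are not already stored, so existing user annotations are left untouched.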
