Paper information - The Personal Sequence Database: a suite of tools to create and maintain web-accessible sequence databases

Background: Large molecular sequence databases are fundamental resources for modern bioscientists. Whether for project-specific purposes or sharing data with colleagues, it is often advantageous to maintain smaller sequence databases. However, this is usually not an easy task for the average bench scientist.

Results: We present the Personal Sequence Database (PSD), a suite of tools to create and maintain small- to medium-sized web-accessible sequence databases. All interactions with PSD tools occur via the internet with a web browser. Users may define sequence groups within their database that can be maintained privately or published to the web for public use. A sequence group can be downloaded, browsed, searched by keyword or searched for sequence similarities using BLAST. Publishing a sequence group extends these capabilities to colleagues and collaborators. In addition to being able to manage their own sequence databases, users can enroll sequences in BLASTAgent, a BLAST hit tracking system, to monitor NCBI databases for new entries displaying a specified level of nucleotide or amino acid similarity.

Conclusion: The PSD offers a valuable set of resources unavailable elsewhere. In addition to managing sequence data and BLAST search results, it facilitates data sharing with colleagues, collaborators and public users. The PSD is hosted by the authors and is available at http://bioinfo.cgrb.oregonstate.edu/psd/.

Christopher M. Sullivan | Scott A. Givan | James C. Carrington
