BioParser: a tool for processing of sequence similarity analysis reports.

The widely used similarity search programs BLAST (in this article, 'BLAST' includes both the National Center for Biotechnology Information [NCBI] BLAST and the Washington University version WU BLAST) and FASTA usually produce copious output when run against nucleotide and protein databases. When large query sets are used, human inspection of this output rapidly becomes impractical. BioParser is a Perl program for parsing BLAST and FASTA reports. Making extensive use of the BioPerl toolkit, the program filters, stores and returns components of these reports in either ASCII or HTML format. BioParser can also automatically load the parsed information into a local MySQL database, allowing subsequent filtering of hits and/or alignments with specific attributes. BioParser is therefore a valuable tool for large-scale similarity analyses: it improves access to the information present in BLAST or FASTA reports, facilitates extraction of useful information from large sets of sequence alignments, and allows easy handling and processing of the data.

AVAILABILITY: BioParser is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.0 license (http://creativecommons.org/licenses/by-nc-nd/2.0/) and is available upon request. Additional information can be found at the BioParser website (http://www.dbbm.fiocruz.br/BioParser.html).
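
The parse, store and filter workflow described above can be illustrated with a minimal sketch. It is not BioParser's implementation: BioParser is written in Perl, reads full BLAST and FASTA reports through BioPerl, and targets MySQL. The sketch below instead assumes the simpler 12-column tabular BLAST output, uses Python's built-in SQLite module as a stand-in for MySQL, and all names (`parse_tabular`, `load_hits`, `filter_hits`, the `hits` table, the sample rows) are hypothetical.

```python
import sqlite3

# Made-up example rows in the 12-column tabular BLAST layout:
# query, subject, % identity, alignment length, mismatches, gap opens,
# query start, query end, subject start, subject end, E-value, bit score.
SAMPLE = [
    "# illustrative data only, not real search results",
    "q1\tsA\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-100\t370.0",
    "q1\tsB\t71.2\t150\t40\t3\t5\t154\t30\t179\t2e-05\t48.1",
    "q2\tsC\t88.0\t120\t14\t0\t1\t120\t1\t120\t3e-40\t160.0",
]


def parse_tabular(lines):
    """Yield one typed tuple per hit, skipping blank and comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        p = line.split("\t")
        if len(p) != 12:
            raise ValueError(f"expected 12 columns, got {len(p)}: {line!r}")
        yield (p[0], p[1], float(p[2]), *(int(x) for x in p[3:10]),
               float(p[10]), float(p[11]))


def load_hits(conn, lines):
    """Create the (hypothetical) hits table and insert all parsed hits."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "query_id TEXT, subject_id TEXT, identity REAL, aln_length INTEGER, "
        "mismatches INTEGER, gap_opens INTEGER, q_start INTEGER, q_end INTEGER, "
        "s_start INTEGER, s_end INTEGER, evalue REAL, bit_score REAL)"
    )
    conn.executemany(
        "INSERT INTO hits VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", parse_tabular(lines)
    )
    conn.commit()


def filter_hits(conn, max_evalue, min_identity):
    """Return hits that pass both thresholds, best E-value first per query."""
    cur = conn.execute(
        "SELECT query_id, subject_id, identity, evalue FROM hits "
        "WHERE evalue <= ? AND identity >= ? ORDER BY query_id, evalue",
        (max_evalue, min_identity),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_hits(conn, SAMPLE)
    for row in filter_hits(conn, max_evalue=1e-10, min_identity=80.0):
        print(row)
```

Once the hits are in a relational table, each new filtering criterion becomes a query rather than another pass over the raw reports, which is the practical benefit of the database step for large query sets.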
