RAId_DbS: mass-spectrometry based peptide identification web server with knowledge integration

Background
The existing scientific literature is a rich source of biological information such as disease markers. Integrating this information with data analysis may help researchers identify possible controversies and form useful hypotheses for further validation. In proteomics, an era of individualized proteomics may be approached by considering amino acid substitutions and modifications together with information from disease studies. Integrating such information with peptide searches enables fast, dynamic information retrieval that may significantly benefit clinical laboratory studies.

Description
We have integrated annotated single amino acid polymorphisms, post-translational modifications, and their documented disease associations (where they exist) from various sources into one enhanced database per organism. We have also augmented our peptide identification software, RAId_DbS, to take this information into account when analyzing a tandem mass spectrum. In principle, one may choose to respect or to ignore the correlation of amino acid polymorphisms/modifications within each protein. Respecting the correlation leads to targeted searches and avoids scoring unnecessary polymorphism/modification combinations; ignoring it explores possible polymorphisms in a controlled fashion. To facilitate new discoveries, RAId_DbS also allows users to conduct searches that permit novel polymorphisms and to search a knowledge database created by the users themselves.

Conclusion
We have constructed enhanced databases for 17 organisms. RAId_DbS and the enhanced databases are available at http://www.ncbi.nlm.nih.gov/CBBResearch/qmbp/RAId_DbS/index.html. The relevant databases and the RAId_DbS binaries for Linux, Windows, and Mac OS X can be downloaded from the same web page.
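The choice described above, between respecting and ignoring the correlation of annotated polymorphisms within a protein, can be illustrated with a minimal Python sketch. It is not the RAId_DbS implementation: the annotation format (tuples of `(position, residue)` pairs that are documented to co-occur), the function names, and the `max_subs` cap used to stand in for "a controlled fashion" are all assumptions made for illustration only.

```python
from itertools import combinations


def apply_subs(peptide, subs):
    """Return the peptide with each (0-based position, residue) substitution applied."""
    residues = list(peptide)
    for pos, aa in subs:
        residues[pos] = aa
    return "".join(residues)


def correlated_variants(peptide, annotated_sets):
    """Respect correlation: score only the documented combinations (hypothetical format)."""
    return {peptide} | {apply_subs(peptide, s) for s in annotated_sets}


def uncorrelated_variants(peptide, annotated_sets, max_subs=2):
    """Ignore correlation: combine annotated sites freely, capped at max_subs changes.

    The cap is an assumed stand-in for a "controlled" exploration.
    """
    sites = sorted({sub for s in annotated_sets for sub in s})
    variants = {peptide}
    for k in range(1, max_subs + 1):
        for combo in combinations(sites, k):
            # Allow at most one substitution per position.
            if len({pos for pos, _ in combo}) == k:
                variants.add(apply_subs(peptide, combo))
    return variants


# Made-up example: one single-site annotation and one documented pair of sites.
peptide = "ACDEFGK"
annotated = [((1, "S"),), ((3, "Q"), (5, "A"))]
targeted = correlated_variants(peptide, annotated)
exploratory = uncorrelated_variants(peptide, annotated)
print(len(targeted), len(exploratory))
```

In this toy case the targeted search scores only the documented combinations, while the exploratory search also scores combinations that no annotation records together, which is why its candidate set is larger.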
