Integration of Biomolecular Interaction Data in a Genomic and Proteomic Data Warehouse to Support Biomedical Knowledge Discovery

The growing amount of available genomic and proteomic information opens opportunities for novel research approaches and biomedical discoveries, provided that it is supported by effective data management and analysis. Integration and comprehensive evaluation of available controlled data can highlight information patterns that lead to new biomedical knowledge. For this purpose, Politecnico di Milano is developing a software framework to create and maintain a Genomic and Proteomic Data Warehouse (GPDW) that integrates information from many data sources on the basis of a conceptual data model that relates molecular entities and biomedical features.
