IBDsite: a Galaxy-interacting, integrative database for supporting inflammatory bowel disease high throughput data analysis

Background
Inflammatory bowel diseases (IBD) are a group of inflammatory conditions affecting the colon and small intestine, which cause socially uncomfortable symptoms and are often associated with an increased risk of colon cancer. IBD are complex disorders that depend on genetic susceptibility, environmental factors, deregulation of the immune system, and the host relationship with commensal microbiota. The complexity of these pathologies makes it difficult to understand clearly the mechanisms of their onset. The study of IBD therefore requires an integrated, multilevel approach, ranging from genes, transcripts and proteins to the pathways altered in affected tissues, and carefully considering the regulatory mechanisms that may intervene in the pathology onset. It is also crucial to have a knowledge base about the symbiotic bacteria hosted in the human gut. To date, much data exist regarding IBD and human commensal bacteria, but this information is scattered across the literature, and no free resource provides a homogeneously and rationally integrated view of the biomolecular data related to these pathologies.

Methods
Human genes altered in IBD have been collected from the literature, with particular attention to the immune system alterations prompted by the interaction with the gut microbiome. This process has been performed manually to ensure the reliability of the collected data. Heterogeneous metadata from different sources have been automatically formatted and integrated in order to enrich the information about these altered genes. A user-friendly web interface has been created for easy access to the structured data. Tools such as gene clustering coefficients, all-pairs shortest paths and pathway length calculations have been developed to provide data analysis support. Moreover, the implemented resource is compliant with the Galaxy framework, allowing the collected data to be exploited in the context of high-throughput bioinformatics analysis.

Results
To address the lack of a reference resource for 'omics' science analysis in the context of IBD, we developed the IBDsite (available at http://www.itb.cnr.it/ibd), a disease-oriented platform that collects data related to the biomolecular mechanisms involved in the IBD onset. The resource provides a section devoted to human genes identified as altered in IBD, which can be queried at different biomolecular levels and visualised in gene-centred report pages. Furthermore, the system presents information related to the gut microbiota of IBD-affected patients. The IBDsite is compliant with all Galaxy installations (in particular, it can be accessed from our custom version of Galaxy, http://www.itb.cnr.it/galaxy), in order to facilitate high-throughput data integration and to enable evaluations of the genomic basis of these diseases, complementing the tools embedded in the IBDsite.

Conclusions
A large amount of scattered data exists concerning IBD studies, but no online resource collects and integrates them homogeneously and rationally. The IBDsite is an attempt to group the available information regarding human genes and microbial aspects related to IBD by means of a multilevel mining tool. Moreover, it constitutes a knowledge base to filter, annotate and understand new experimental data in order to formulate new scientific hypotheses, thanks to the possibility of integrating genomic aspects through the Galaxy framework. The discussed use cases demonstrate that the developed system is useful for inferring non-trivial knowledge from the existing scattered data or from novel experiments.
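
The Methods name gene clustering coefficients and all-pairs shortest paths among the analysis tools. The abstract does not describe how the IBDsite computes them, so the following is only a minimal sketch of the two standard graph measures, assuming an undirected, unweighted gene interaction network. The function names and the toy edges (`geneA` to `geneE`) are hypothetical and are not taken from the IBDsite data.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def clustering_coefficient(graph, node):
    """Local clustering coefficient: fraction of neighbour pairs that are linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; 0.0 by convention
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


def all_pairs_shortest_paths(graph):
    """Hop-count distances between all reachable pairs, one BFS per source."""
    dist = {}
    for source in graph:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in graph[current]:
                if nxt not in seen:
                    seen[nxt] = seen[current] + 1
                    queue.append(nxt)
        dist[source] = seen
    return dist


# Hypothetical toy network, for illustration only (not IBDsite content).
TOY_EDGES = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneB", "geneC"),
    ("geneC", "geneD"),
    ("geneD", "geneE"),
]

graph = build_graph(TOY_EDGES)
distances = all_pairs_shortest_paths(graph)
```

In this toy network, `clustering_coefficient(graph, "geneA")` is 1.0 because its two neighbours are themselves linked, and `distances["geneA"]["geneE"]` is 3 hops. A breadth-first search per node is sufficient here only because the edges are unweighted; a weighted network would need a different algorithm.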
