OAHG: an integrated resource for annotating human genes with multi-level ontologies

OAHG is an integrated resource that aims to provide comprehensive functional annotation of human protein-coding genes (PCGs), miRNAs, and lncRNAs using multi-level ontologies: Gene Ontology (GO), Disease Ontology (DO), and Human Phenotype Ontology (HPO). Many previous studies have focused on inferring putative properties and biological functions of PCGs and non-coding RNA genes from different perspectives. Over the past several decades, several databases have been designed to annotate the functions of PCGs, miRNAs, and lncRNAs separately. Some of the functional descriptions in these databases have been mapped to standardized terminologies, such as GO, which facilitates further analysis. Despite these developments, there is no comprehensive resource recording the functions of all three of these important gene types. The current version of OAHG, release 1.0 (Jun 2016), integrates three ontologies (GO, DO, and HPO), six gene functional databases, and two interaction databases. Currently, OAHG contains 1,434,694 entries involving 16,929 PCGs, 637 miRNAs, 193 lncRNAs, and 24,894 ontology terms. In performance evaluation, OAHG is consistent with existing gene interactions and with the structure of the ontologies. For example, terms with more similar structure tend to have more associated genes in common (Pearson correlation γ² = 0.2428, p < 2.2e-16).
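The consistency check mentioned above can be illustrated with a minimal sketch: for each pair of ontology terms, count the genes annotated to both and correlate that count with the terms' structural similarity. The term names, gene names, similarity scores, and helper functions below are hypothetical and chosen only for illustration. They are not taken from OAHG, and the abstract does not specify which similarity measure was used.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

def shared_gene_count(annotations, term_a, term_b):
    """Number of genes annotated to both terms."""
    return len(annotations[term_a] & annotations[term_b])

# Hypothetical toy annotations: ontology term -> set of annotated genes.
annotations = {
    "T1": {"G1", "G2", "G3"},
    "T2": {"G2", "G3", "G4"},
    "T3": {"G3", "G5"},
    "T4": {"G6"},
}

# Hypothetical structural similarity scores for some term pairs (assumed values).
term_similarity = {
    ("T1", "T2"): 0.9,
    ("T1", "T3"): 0.5,
    ("T1", "T4"): 0.1,
    ("T2", "T3"): 0.4,
}

pairs = sorted(term_similarity)
sims = [term_similarity[p] for p in pairs]
shared = [shared_gene_count(annotations, a, b) for a, b in pairs]
r = pearson_r(sims, shared)
print(f"Pearson r = {r:.4f}")
```

A positive coefficient on real data would indicate that structurally similar terms tend to share annotated genes, which is the kind of consistency the evaluation reports.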
