Parallel Learning of Weighted Association Rules in Human Phenotype Ontology

The Human Phenotype Ontology (HPO) is a standardized vocabulary of terms related to diseases. The importance and specificity of HPO terms are estimated using their Information Content (IC). The analysis of data annotated with these terms is a critical challenge for bioinformatics, and several approaches exist to support ontology curators in maintaining and analysing such data. Among these, Association Rules (AR) can help improve the quality of annotations. In this paper, we present an algorithm for the parallel extraction of Weighted Association Rules (WAR) from HPO terms and annotations that is able to handle high-dimensional data. Experiments performed on real and synthetic datasets show good speed-up and scalability.
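To make the role of IC as a term weight concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes an annotation-frequency definition of IC, IC(t) = -log p(t), and one common form of weighted support (plain support scaled by the mean weight of the itemset). The function names, the term identifiers T1 to T3, and the disease identifiers D1 to D4 are hypothetical, and the ontology hierarchy is ignored for brevity.

```python
import math
from collections import Counter


def information_content(annotations):
    """Annotation-frequency IC: IC(t) = -log(p(t)).

    p(t) is the fraction of annotated entities that carry term t.
    Assumption: no propagation of annotations to ancestor terms.
    """
    n = len(annotations)
    counts = Counter(t for terms in annotations.values() for t in set(terms))
    return {t: -math.log(c / n) for t, c in counts.items()}


def weighted_support(itemset, annotations, weights):
    """Plain support of the itemset scaled by the mean weight of its terms.

    Assumption: this is one possible weighting scheme, chosen for illustration.
    """
    itemset = set(itemset)
    hits = sum(1 for terms in annotations.values() if itemset <= set(terms))
    mean_weight = sum(weights[t] for t in itemset) / len(itemset)
    return mean_weight * hits / len(annotations)


# Hypothetical toy data: four diseases annotated with three placeholder terms.
annotations = {
    "D1": ["T1", "T2"],
    "D2": ["T1", "T2", "T3"],
    "D3": ["T1"],
    "D4": ["T1", "T3"],
}

ic = information_content(annotations)
ws_specific = weighted_support({"T2", "T3"}, annotations, ic)
ws_generic = weighted_support({"T1"}, annotations, ic)
```

In this toy dataset T1 annotates every disease, so its IC is zero and any rule built only on T1 receives zero weighted support, whereas the rarer terms T2 and T3 contribute positive weight. This is the intuition behind weighting rules by IC: frequent but uninformative terms do not dominate the mined rules.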
