A measure of semantic similarity between gene ontology terms based on semantic pathway covering

Abstract Semantic similarity between Gene Ontology (GO) terms is critical for resolving semantic heterogeneity when integrating heterogeneous biological databases. Traditionally, distance-based and information-content-based measures have been the two major approaches. In this paper, a new method based on semantic pathways is proposed, and an algorithm, the COMBINE algorithm, is presented that considers the information content of the two given nodes and of all nodes on the two nodes' pathways. Experiments show that the COMBINE algorithm obtains the highest correlation index among the distance-based and information-content-based algorithms compared.

* Supported by the National Hi-Tech Research and Development Program of China (Grant No. 2002AA231011) and the major Project of Shanghai Commission of Science & Technology (Grant No. 02DJ14013)
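
As a rough illustration of the two traditional families of measures mentioned in the abstract, the following Python sketch computes an edge-count distance and a Resnik-style information content similarity on a toy ontology. It is not the COMBINE algorithm, whose details are not given here. The graph `PARENTS`, the annotation counts `DIRECT_COUNTS`, and all function names are hypothetical and chosen only for illustration; the information content is assumed to be IC(t) = -log p(t), with p(t) estimated from annotations of a term and its descendants.

```python
import math
from collections import deque

# Hypothetical toy ontology: each term maps to its parent terms (not real GO data).
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "a1": ["a"],
    "a2": ["a"],
}

# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "a": 2, "b": 4, "a1": 1, "a2": 1}


def up_distances(term):
    """Map the term and each of its ancestors to the fewest edges needed to reach it."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in PARENTS[current]:
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def edge_distance(t1, t2):
    """Distance-based measure: shortest path between t1 and t2 through a common ancestor."""
    d1, d2 = up_distances(t1), up_distances(t2)
    return min(d1[t] + d2[t] for t in d1.keys() & d2.keys())


def information_content(term):
    """IC(t) = -log p(t); p(t) counts annotations of t and of all its descendants."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(
        count for t, count in DIRECT_COUNTS.items() if term in up_distances(t)
    )
    return -math.log(covered / total)


def resnik_similarity(t1, t2):
    """Information-content-based measure: IC of the most informative common ancestor."""
    common = up_distances(t1).keys() & up_distances(t2).keys()
    return max(information_content(t) for t in common)


if __name__ == "__main__":
    print(edge_distance("a1", "a2"), resnik_similarity("a1", "a2"))
    print(edge_distance("a1", "b"), resnik_similarity("a1", "b"))
```

In this toy graph the sibling terms `a1` and `a2` are closer by edge count and share a more informative ancestor than `a1` and `b`, which meet only at the root. A pathway-based measure such as the one proposed would additionally draw on the information content of the intermediate nodes along each path.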