Ontology Extraction Considering Content Concordance from Tagging to Web Pages in Similar SBM Users

To realize web search engines that take into account what query phrases mean to each user, we have studied a method for extracting hierarchical and synonymous relationships among tagged phrases on a social bookmarking (SBM) service for an individual SBM user. The method detects these relationships from clusters of web pages that share the same tagged phrases, derived from the bookmarks of the target user and of SBM users similar to the target user. However, noisy tagging that deviates from the user's personal phrase meaning degrades its detection accuracy. This paper proposes a method that addresses this drawback. The proposed method classifies web pages based on the concordance of their content as well as on the sameness of their tagged phrases. Analyzing which content-based and tag-based clusters each web page belongs to allows the relationships to be detected more accurately. We compared the detection accuracy of the proposed and conventional methods in an experiment. For hierarchical relationships, the F-measure improves by 7.41% and the precision improves by 20.94% while maintaining a recall of more than 20%. For synonymous relationships, the F-measure improves by 4.17% and the precision improves by 21.80% while maintaining a recall of more than 10%.
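
The idea of combining tag-based and content-based clusters can be illustrated with a minimal sketch. It is not the method evaluated in this paper: the majority-cluster filter, the overlap test, the 0.9 threshold, and all names (`drop_noisy_pages`, `relate`, the page IDs, and the tags) are assumptions chosen only for illustration. The sketch removes pages whose content cluster disagrees with the majority of their tag cluster, then infers a relationship between two tags from how their remaining page sets overlap.

```python
from collections import Counter


def drop_noisy_pages(tag_pages, content_cluster_of):
    """Keep only pages whose content cluster is the most common one
    among the pages carrying this tag (assumed noise filter)."""
    if not tag_pages:
        return set()
    counts = Counter(content_cluster_of[p] for p in tag_pages)
    majority, _ = counts.most_common(1)[0]
    return {p for p in tag_pages if content_cluster_of[p] == majority}


def overlap(a, b):
    """Fraction of the pages in `a` that also appear in `b`."""
    return len(a & b) / len(a) if a else 0.0


def relate(pages_a, pages_b, threshold=0.9):
    """Guess the relationship between two tags from page-set overlap.
    The threshold is an arbitrary illustrative value."""
    a_in_b = overlap(pages_a, pages_b)
    b_in_a = overlap(pages_b, pages_a)
    if a_in_b >= threshold and b_in_a >= threshold:
        return "synonym"
    if b_in_a >= threshold:
        return "a_broader_than_b"
    if a_in_b >= threshold:
        return "b_broader_than_a"
    return "unrelated"


# Hypothetical bookmarks pooled from a target user and similar users.
tag_clusters = {
    "programming": {"p1", "p2", "p3", "p4", "p9"},
    "coding": {"p1", "p2", "p3", "p4"},
    "python": {"p1", "p2", "p9"},
}
# Hypothetical content clusters; p9 is an off-topic page tagged by mistake.
content_cluster_of = {"p1": 0, "p2": 0, "p3": 0, "p4": 0, "p9": 1}

cleaned = {
    tag: drop_noisy_pages(pages, content_cluster_of)
    for tag, pages in tag_clusters.items()
}

raw_result = relate(tag_clusters["programming"], tag_clusters["coding"])
clean_result = relate(cleaned["programming"], cleaned["coding"])
hier_result = relate(cleaned["programming"], cleaned["python"])
```

In this toy data the single mis-tagged page p9 is enough to make "programming" look broader than "coding" when only tags are used, whereas the content-filtered clusters are identical and the two tags are judged synonymous.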