Ontology-based Integration and Refinement of Evaluation-Committee Data from Heterogeneous Data Sources

The Korean National Science and Technology Information Service (NTIS) provides a service for searching national R&D projects and information about the researchers who participate in them. It also provides a service for recommending and selecting evaluation committees for R&D projects. The underlying R&D data are collected from 17 Korean government ministries and agencies and integrated into NTIS. As a result, duplicate R&D data and researcher information can enter the system, because the title of the same R&D accomplishment may be recorded differently by different organizations. Furthermore, the names of researchers and of related objects such as organizations and journals can also be entered in various forms, since such names generally have several aliases. In this research, we present an ontology-based data integration and refinement system that integrates researcher information with the corresponding R&D accomplishment data, which would be useful for the recommendation and selection services. We use the Jaro-Winkler distance algorithm to find and eliminate duplicated accomplishment data. In addition, incorrectly entered data are corrected during the duplicate-elimination process using information obtained from authoritative science libraries.
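To illustrate the duplicate-detection step, the following is a minimal sketch of Jaro-Winkler matching applied to accomplishment titles. It is not the system's actual implementation: the function names, the title normalization, the similarity threshold of 0.9, and the sample titles are all assumptions made for this example. The prefix scaling factor of 0.1 and the prefix cap of 4 characters are the conventional Jaro-Winkler defaults.

```python
import re


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def normalize(title: str) -> str:
    """Assumed normalization: lowercase, punctuation and extra spaces removed."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(titles: list[str], threshold: float = 0.9) -> list[str]:
    """Keep the first title of each group of near-duplicates (assumed threshold)."""
    kept: list[str] = []
    for title in titles:
        norm = normalize(title)
        if all(jaro_winkler(norm, normalize(k)) < threshold for k in kept):
            kept.append(title)
    return kept


# Hypothetical titles, as two organizations might record the same work.
titles = [
    "Ontology-based Integration of R&D Data",
    "Ontology based integraton of R&D data.",
    "Deep Learning for Protein Folding",
]
unique_titles = deduplicate(titles)
```

Here the second title differs from the first only in punctuation, letter case, and one missing letter, so it scores above the threshold and is dropped, while the unrelated third title is kept. In practice the threshold would need to be tuned against labeled data, since a value that is too low merges distinct accomplishments and one that is too high misses variant spellings.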