Practice of Constructing Name Authority Database Based on Multi-Source Data Integration

Name authority is a common issue in digital library. This paper mainly summarizes the practice of constructing name authority database based on multi-source data in NSTL. Firstly, we load, integrate different source data and convert them into unified structure. Then, we extract scientific entities and relationships from these data, according to metadata model. For different entities, we use different disambiguation rules and algorithms. As a result, we have constructed author name authority database, institution authority database, journal authority database, and fund authority database. Compared with Incites, taking six institutions name authority data as a sample, the result shows that the average accuracy can reach 86.8%