The Integration System for International Procurement Information Processing

The lack of specialties of the existing commercial web search systems stems from the fact that they have no capabilities to extract and gather the meaningful information from each information domain they cover. We are sure, however, that the necessity for the information integration system, not just search system, will be likely to become larger in the future. In this paper, we propose a design and implementation of an information integration system called TIC(target information collector). TIC is able to extract meaningful information from a specific information area in the internet and integrate them for the commercial service. We also show the evaluation results of our implementation. For the experiments we applied our TIC to the international procurement information area. The international procurement information is publicly and freely announced by each government to the world. To automatically extract common properties from the related source sites, we adopt information pointing technique using inter-HTML tag pattern parsing. And through the information integration framework design, we can easily implement a site-specific information integration engine. By running our TIC for about 8 months, we find out it can remove considerable amount of the duplicated information, and as a result, we can obtain high quality international procurement information. The main contribution of this paper is to present a framework design and it's implementation for extracting the information of a specific area and then integrating them into a meaningful one.