Extending a geocoding database by Web information extraction

Local Search has recently attracted much attention. Its prevailing architecture is map-and-hyperlinks, which links geo-referenced Web content to a map interface. This architecture implies that good Local Search depends not only on search engine techniques but also on a high-quality geocoding database. Building and updating a geocoding database is laborious and time-consuming, so it is usually difficult to keep the database in step with changes in the real world. The Web, however, is a rich source of location-related information and can serve as a supplementary information source for geocoding. This paper therefore describes how to extract geographic information from Web documents to extend a geocoding database. Our approach involves two major steps. First, geographic named entities are identified and extracted from Web content. Second, the extracted entities are geocoded and stored. In this way, a geocoding database can be extended to provide better local Web search services.
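
The two steps can be illustrated with a minimal sketch. It is not the method of this paper. It assumes that entities appear in text as "Name, number Street", that extraction can be approximated by a regular expression, and that geocoding is a street-level lookup in an existing reference table. The names ENTITY_RE, extract_entities, geocode, and extend_database, as well as the sample text and coordinates, are hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)

# Assumed surface pattern for lines such as "Blue Door Cafe, 12 Main Street".
# Real Web content would need a far more robust named-entity recognizer.
ENTITY_RE = re.compile(
    r"(?P<name>[A-Z][\w' ]+?), (?P<number>\d+) "
    r"(?P<street>[A-Z]\w*(?: [A-Z]\w*)* (?:Street|Road|Avenue))"
)


def extract_entities(text: str) -> List[Tuple[str, str, str]]:
    """Step 1: return (name, house number, street) triples found in text."""
    return [(m["name"], m["number"], m["street"]) for m in ENTITY_RE.finditer(text)]


def geocode(street: str, streets: Dict[str, Coord]) -> Optional[Coord]:
    """Step 2a: street-level lookup in an existing reference table."""
    return streets.get(street.lower())


def extend_database(db: dict, text: str, streets: Dict[str, Coord]) -> dict:
    """Step 2b: store every entity that could be geocoded; skip the rest."""
    for name, number, street in extract_entities(text):
        coord = geocode(street, streets)
        if coord is not None:
            db[name] = {"address": f"{number} {street}", "coord": coord}
    return db


# Hypothetical page text and reference table (coordinates are made up).
page_text = (
    "Blue Door Cafe, 12 Main Street\n"
    "Harbor Books, 7 Dock Road\n"
    "Old Mill Inn, 3 River Avenue"
)
streets = {"main street": (40.0, -75.0), "dock road": (40.01, -75.02)}
db = extend_database({}, page_text, streets)
print(sorted(db))
```

In this sketch an entity whose street is missing from the reference table is skipped rather than stored without coordinates, which keeps every stored record usable on a map interface.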