An ontology service for geographical content

Geographic place names are widely used but are often semantically highly ambiguous. For example, there are 491 places in Finland sharing the name “Isosaari” (great island), and they are instances of several geographical classes, such as Island, Forest, Peninsula, and Inhabited area. Referring unambiguously to a particular “Isosaari”, either when annotating content or during information retrieval, can be quite problematic and requires advanced search methods and maps for semantic disambiguation. This paper presents an ontology server, ONKI-Paikka, for solving the place finding and place name disambiguation problem. In ONKI-Paikka, places can be found with a faceted search engine combined with semantic autocompletion and a map service for constraining the search and visualizing the results. The service can be connected to legacy applications cost-effectively by using Ajax technology, in the same spirit as Google Maps, which ONKI-Paikka itself uses as a subservice.

1 SUO Place Ontology

ONKI-Paikka is based on the Finnish Place Ontology SUO [2], which is being developed as part of the national semantic web infrastructure in Finland [1]. SUO has been populated with 1) place information from the Geographic Names Register (GNR) provided by the National Land Survey of Finland and 2) place information from the GEOnet Names Server (GNS) maintained by the National Geospatial-Intelligence Agency (NGA) and the U.S. Board on Geographic Names (US BGN). GNR contains about 800,000 names of natural and man-made features in Finland, including data such as the place type or feature type and the coordinates of each place. The GNS register contains similar information on about 4,100,000 places around the world, excluding places in the United States.

2 ONKI-Paikka Ontology Service

ONKI-Paikka publishes SUO with online ontology services for humans and machines to use.
The user interface uses Ajax techniques to communicate with the ONKI-Paikka database server that contains the place instances. The place finder contains a simple faceted search engine for narrowing the search along the following dimensions:

1) The place name facet filters matching place instances by string autocompletion on their labels.
2) The place type facet (City, Island, Cemetery, etc.) focuses the search on places of the desired types.
3) The language facet limits the search to place names in given languages (Finnish, Swedish, English, and three dialects of Sami).
4) The time facet focuses the search on historical place names.
5) The map facet allows the user to specify a polygon area on the map in which to search for the place [2]. The polygon functions as a narrowing criterion for the search in the same way as category selections in the other facets.
6) The area facet makes it possible to focus the search on continents, countries, and their smaller regions.

With ONKI-Paikka one can, for example, easily find on the map, out of millions of resources, all places 1) of a given type, 2) whose names begin with given letters, and 3) that lie inside a given polygon. ONKI-Paikka uses the Google Maps API for visualization and can be connected to and used in legacy systems by means of Ajax and mash-up techniques. In a museum cataloging system, for example, a place can be found with the ONKI-Paikka user interface, and when the “Select” button is pushed, the corresponding URI or coordinates are transferred from the centralized national service into the legacy application. ONKI-Paikka is an instance of the ONKI ontology service framework being built in Finland [1].

1 Operational at http://demo.seco.tkk.fi/onkipaikka/
2 http://www.maanmittauslaitos.fi/
3 http://earth-info.nga.mil/gns/html/
4 http://ajax.org/

A technical challenge with the ONKI-Paikka service is the large size of the SUO.
The response times of the faceted search were several minutes with a straightforward Jena implementation. To solve the efficiency problem, only the SUO ontology classes, without the instances, are stored in a Jena model, and a separate indexing layer was created for the millions of instances. The instance index is stored in a relational database (MySQL), which makes complex combined searches possible and relatively fast.
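
To illustrate how a relational instance index can serve combined facet selections, the following is a minimal sketch and not the actual ONKI-Paikka implementation. It assumes a single hypothetical table `place` with name, type, language, and coordinate columns, uses Python's built-in `sqlite3` module as a stand-in for MySQL, and covers only the name, type, language, and map facets. The sample rows and the polygon are made-up illustrative values, not data from GNR or GNS. Each selected facet adds one condition to the query; the polygon is applied as a bounding-box condition in SQL followed by an exact ray-casting test.

```python
import sqlite3

# Illustrative rows only (name, type, lang, lat, lon); not real register data.
SAMPLE_ROWS = [
    ("Isosaari", "Island", "fi", 60.10, 25.05),
    ("Isosaari", "Forest", "fi", 62.50, 27.00),
    ("Isosaari", "Peninsula", "fi", 61.20, 28.90),
    ("Isojärvi", "Lake", "fi", 61.70, 25.00),
    ("Helsinki", "City", "fi", 60.17, 24.94),
    ("Helsingfors", "City", "sv", 60.17, 24.94),
]

# Hypothetical user-drawn polygon as (lat, lon) vertices.
HELSINKI_AREA = [(59.9, 24.5), (59.9, 25.5), (60.4, 25.5), (60.4, 24.5)]


def build_index(rows):
    """Create an in-memory instance index (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT, type TEXT,"
        " lang TEXT, lat REAL, lon REAL)"
    )
    conn.execute("CREATE INDEX idx_place_name ON place (name)")
    conn.execute("CREATE INDEX idx_place_type ON place (type)")
    conn.executemany(
        "INSERT INTO place (name, type, lang, lat, lon) VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def point_in_polygon(lat, lon, polygon):
    """Ray-casting test; polygon is a list of (lat, lon) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the latitude of the point.
        if (lat1 > lat) != (lat2 > lat):
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


def search(conn, prefix=None, place_type=None, lang=None, polygon=None):
    """Combine the selected facets into one parameterized query."""
    clauses, params = [], []
    if prefix:
        # Escape LIKE wildcards so the prefix is matched literally.
        escaped = (
            prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        )
        clauses.append("name LIKE ? ESCAPE '\\'")
        params.append(escaped + "%")
    if place_type:
        clauses.append("type = ?")
        params.append(place_type)
    if lang:
        clauses.append("lang = ?")
        params.append(lang)
    if polygon:
        # Cheap bounding-box condition; the exact test follows below.
        lats = [p[0] for p in polygon]
        lons = [p[1] for p in polygon]
        clauses.append("lat BETWEEN ? AND ?")
        clauses.append("lon BETWEEN ? AND ?")
        params += [min(lats), max(lats), min(lons), max(lons)]
    sql = "SELECT name, type, lang, lat, lon FROM place"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    rows = conn.execute(sql + " ORDER BY id", params).fetchall()
    if polygon:
        rows = [r for r in rows if point_in_polygon(r[3], r[4], polygon)]
    return rows


conn = build_index(SAMPLE_ROWS)
for row in search(conn, prefix="Iso", polygon=HELSINKI_AREA):
    print(row)
```

In this sketch every facet is optional, so the same function serves name autocompletion alone as well as a combined query such as "names beginning with Iso inside the drawn polygon". The time and area facets are left out; they would presumably add further conditions of the same kind.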