Extraction of Geographical Attribute-Values in Natural Language Text

Natural language text contains rich geographical knowledge and is an important source of spatial information for GIS. Automatically extracting geographical attribute-values from unstructured text can enrich the information sources available to GIS and enhance its expressiveness and intelligibility. This paper proposes a machine learning method, driven by attribute keywords and a rule database, for extracting geographical attributes from text. First, a geographical attribute dictionary is built with a bootstrapping method, and syntactic rules for geographical attributes are constructed by manual induction. Regular expression matching is then used to extract not only geographical attribute names and attribute values, but also the associations between geographical entities and their attribute-values. Finally, an experiment illustrates the commonly used geographical attribute names and the attribute-value extraction results. The experimental results show more than 85% precision and recall for extracting geographical attribute names and values, and more than 75% precision and recall for extracting the entity-attribute associations.
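The keyword-and-rule matching step described above can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the dictionary entries in `ATTRIBUTE_KEYWORDS`, the single English rule in `RULE_TEMPLATE`, the `extract` function, and the sample sentences are all hypothetical, whereas the paper derives its dictionary by bootstrapping and its rules by manual induction.

```python
import re

# Hypothetical attribute dictionary: attribute name -> trigger keywords.
# (In the paper this dictionary is built by bootstrapping; these entries are made up.)
ATTRIBUTE_KEYWORDS = {
    "area": ["area"],
    "population": ["population"],
}

# Hypothetical syntactic rule: "<Entity> has a/an <keyword> of <value> [<unit>]".
# An entity is assumed to be a run of capitalized words.
RULE_TEMPLATE = (
    r"(?P<entity>[A-Z]\w*(?: [A-Z]\w*)*) has an? {keyword} of "
    r"(?P<value>\d[\d,]*(?:\.\d+)?)(?: (?P<unit>[A-Za-z]\w*))?"
)


def extract(text):
    """Return (entity, attribute name, value, unit) tuples found in text."""
    results = []
    for name, keywords in ATTRIBUTE_KEYWORDS.items():
        for keyword in keywords:
            pattern = re.compile(RULE_TEMPLATE.format(keyword=re.escape(keyword)))
            for match in pattern.finditer(text):
                # Each match links the entity to the attribute name and value.
                results.append(
                    (match.group("entity"), name, match.group("value"), match.group("unit"))
                )
    return results


if __name__ == "__main__":
    sample = "Lake Foo has an area of 120.5 km2. Bar City has a population of 8,000 people."
    for row in extract(sample):
        print(row)
```

Each match yields the attribute name, the attribute value, and the entity it belongs to in one pass, which mirrors the three outputs the abstract evaluates. A realistic rule database would hold many such patterns per attribute.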
