A simplified linear feature matching method using decision tree analysis, weighted linear directional mean, and topological relationships

ABSTRACT Linear feature matching is a crucial component of data conflation, which is useful for updating existing data by integrating newer data and for evaluating data accuracy. This article presents a simplified linear feature matching method for conflating historical and current road data. Similarity is measured with the shorter line median Hausdorff distance (SMHD), the absolute value of the cosine similarity (aCS) of the weighted linear directional mean values, and topological relationships. Decision tree analysis is used to derive thresholds for the SMHD and the aCS. To demonstrate the usefulness of the simplified method, four models with incremental configurations are designed and tested: (1) Model 1: one-to-one matching based on the SMHD; (2) Model 2: matching with only the SMHD threshold; (3) Model 3: matching with the SMHD and the aCS thresholds; and (4) Model 4: matching with the SMHD, the aCS, and topological relationships. These experiments suggest that Model 2, which considers only distance, does not provide stable results, while Model 3, which adds direction, and Model 4, which adds both direction and topological relationships, produce stable results with accuracy of around 90% and 95%, respectively. The results suggest that the proposed method is simple yet robust for linear feature matching.
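
The two numeric similarity measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so the definitions here are assumptions: the SMHD is taken as the median, over the vertices of the shorter polyline, of each vertex's distance to the longer polyline, and the aCS is taken as the absolute cosine of the angle between the two lines' length-weighted linear directional means. All function names are hypothetical, and the thresholds in `is_candidate_match` are placeholders, not the values the article derives by decision tree analysis.

```python
import math
from statistics import median

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the parameter to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def polyline_length(line):
    """Total length of a polyline given as a list of (x, y) vertices."""
    return sum(math.hypot(bx - ax, by - ay)
               for (ax, ay), (bx, by) in zip(line, line[1:]))

def smhd(line_a, line_b):
    """Assumed SMHD: median distance from the shorter line's vertices
    to the longer line."""
    shorter, longer = sorted((line_a, line_b), key=polyline_length)
    segments = list(zip(longer, longer[1:]))
    return median(
        min(point_segment_distance(p, a, b) for a, b in segments)
        for p in shorter
    )

def weighted_directional_mean(line):
    """Assumed weighted linear directional mean: segment directions
    averaged with segment lengths as weights, in radians."""
    sin_sum = cos_sum = 0.0
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        length = math.hypot(bx - ax, by - ay)
        theta = math.atan2(by - ay, bx - ax)
        sin_sum += length * math.sin(theta)
        cos_sum += length * math.cos(theta)
    return math.atan2(sin_sum, cos_sum)

def acs(line_a, line_b):
    """Assumed aCS: |cos| of the angle between the two directional means,
    so lines digitized in opposite directions still score near 1."""
    diff = weighted_directional_mean(line_a) - weighted_directional_mean(line_b)
    return abs(math.cos(diff))

def is_candidate_match(line_a, line_b, max_smhd=5.0, min_acs=0.9):
    """Threshold test in the spirit of Model 3; thresholds are placeholders."""
    return smhd(line_a, line_b) <= max_smhd and acs(line_a, line_b) >= min_acs

historical = [(0.0, 0.0), (10.0, 0.0)]
current = [(0.0, 1.0), (5.0, 1.0), (12.0, 1.0)]
crossing = [(5.0, -5.0), (5.0, 5.0)]
print(smhd(historical, current), acs(historical, current))
print(is_candidate_match(historical, current))
print(is_candidate_match(historical, crossing))
```

In this toy case the parallel pair passes both tests, while the crossing line lies close to the historical one but is rejected on direction, which is the kind of false match a distance-only threshold cannot exclude.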
