An Algorithm of Street-level Landmark Obtaining Based on Yellow Pages

Street-level landmarks are an important foundation for high-precision geolocation of target IPs. Yellow pages contain a large number of Web and Email domain names corresponding to institutions, their content is stable, and their format is fixed. Based on these properties, this paper proposes a street-level landmark obtaining algorithm based on yellow pages. The domain names of institutions in the yellow pages are extracted using regular expressions, and the corresponding IPs are resolved. Candidate landmarks are screened according to whether the attribution of an IP is consistent with the cities in which all of its possible corresponding institutions are located. Using the SLG geolocation algorithm, landmarks whose geolocation error falls within the evaluation threshold are rated as reliable landmarks. Experimental results show that the proposed algorithm can effectively correct the mis-deletion and mis-evaluation of some landmarks by an existing typical landmark obtaining algorithm. Based on 10 Chinese yellow pages (about 2 million institutions in 5 cities) and 3 American yellow pages (about 1 million institutions in 3 cities), a total of 55,960 reliable street-level landmarks for Web and Email servers are obtained. Among the 346,753 Web server IP evaluations, 48,361 landmarks are revised and 40,753 additional reliable street-level landmarks are obtained compared with the Web-based landmark obtaining algorithm.
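The pipeline summarized in the abstract (regular-expression extraction, city-consistency screening, threshold-based evaluation) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the two patterns, the function names, the `resolve` and `ip_city` callables (stand-ins for a DNS lookup and an IP location database), and the reading of "consistent" as "the database city equals the city of every institution mapped to that IP". The SLG estimate itself is not implemented; `is_reliable` only shows the final threshold comparison.

```python
import math
import re
from collections import defaultdict

# Hypothetical patterns; real yellow-page formats may need stricter rules.
WEB_RE = re.compile(r"https?://([A-Za-z0-9.-]+\.[A-Za-z]{2,})")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def extract_domains(entry_text):
    """Return the set of Web and Email domain names found in one entry."""
    found = WEB_RE.findall(entry_text) + EMAIL_RE.findall(entry_text)
    return {domain.lower() for domain in found}


def screen_candidates(records, resolve, ip_city):
    """Keep IPs whose database city matches every institution mapped to them.

    records: iterable of (institution_city, domain) pairs.
    resolve: callable domain -> IP string or None (assumed DNS lookup).
    ip_city: callable IP -> city name (assumed IP location database).
    """
    cities_by_ip = defaultdict(set)
    for city, domain in records:
        ip = resolve(domain)
        if ip is not None:  # skip domains that do not resolve
            cities_by_ip[ip].add(city)
    return sorted(
        ip for ip, cities in cities_by_ip.items() if cities == {ip_city(ip)}
    )


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def is_reliable(estimated, claimed, threshold_km):
    """True if the geolocation error is within the evaluation threshold.

    estimated: (lat, lon) produced by a geolocation algorithm such as SLG.
    claimed: (lat, lon) of the institution's address from the yellow pages.
    """
    return haversine_km(estimated, claimed) <= threshold_km
```

In practice `resolve` could wrap a call such as `socket.gethostbyname`, and `ip_city` a lookup in whichever IP location database is available; passing them in as callables keeps the screening logic testable without network access.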
