Inferring relative popularity of internet applications by actively querying DNS caches

In this work, we propose a novel methodology for assessing the relative popularity of any Internet application based on the data servers it uses. The basic idea is to infer the popularity of data servers by periodically "poking" at local domain name servers (LDNSs), which service Domain Name System (DNS) requests from a set of users running Internet applications, and determining whether each LDNS has cached resource records for those data servers. The percentage of pokes that result in a cache hit serves as a coarse measure of the relative popularity of a particular data server among the users of a given LDNS. In addition, the time-to-live (TTL) of cached DNS resource records can be used to measure the gaps in time during which a resource record for a data server is not cached. These cache gaps can be used to infer request interarrival times for the more popular data servers. The methodology can be applied to any Internet application that uses distinguished server names and performs DNS lookups on these names as part of application use, and it can collect usage information from any LDNS that accepts DNS queries. As example applications of the methodology, we evaluate the relative popularity of selected Web sites and the relative popularity of different Web servers serving content at a given Web site. We also apply the methodology to servers providing multimedia content, data servers for grid computing, and network game servers. We use data gathered from the LDNSs of commercial and educational sites as well as Internet Service Providers serving both commercial and home customers.
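The two measurements described above, the cache hit ratio and the TTL-derived cache gaps, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each poke has already been reduced to a timestamp and the remaining TTL reported by the LDNS (or no answer on a miss), and that the record's full authoritative TTL is known. The names `Poke`, `hit_ratio`, `cache_gaps`, and `auth_ttl` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Poke:
    """One probe of an LDNS cache (hypothetical record layout)."""
    time: float                     # seconds since the measurement started
    remaining_ttl: Optional[float]  # TTL in the cached answer; None on a miss


def hit_ratio(pokes: List[Poke]) -> float:
    """Fraction of pokes that found the record in the cache."""
    if not pokes:
        return 0.0
    hits = sum(1 for p in pokes if p.remaining_ttl is not None)
    return hits / len(pokes)


def cache_gaps(pokes: List[Poke], auth_ttl: float) -> List[float]:
    """Estimate the uncached intervals between successive cache entries.

    Assumption: the LDNS stores the record with the full authoritative TTL
    and counts it down, so a hit at time t with remaining TTL r implies the
    entry expires at t + r and was created at t + r - auth_ttl.
    """
    gaps: List[float] = []
    last_expiry: Optional[float] = None
    for p in sorted(pokes, key=lambda p: p.time):
        if p.remaining_ttl is None:
            continue  # a miss carries no TTL information
        expiry = p.time + p.remaining_ttl
        cached_at = expiry - auth_ttl
        # A creation time after the previous expiry marks a new cache entry;
        # the difference is the time the record spent uncached.
        if last_expiry is not None and cached_at > last_expiry:
            gaps.append(cached_at - last_expiry)
        last_expiry = expiry
    return gaps


# Made-up observations for a record with a 300-second authoritative TTL.
pokes = [
    Poke(0, None),
    Poke(60, 250),   # entry created at t=10, expires at t=310
    Poke(120, 190),  # same entry
    Poke(360, None),
    Poke(420, 280),  # new entry created at t=400
]
print(hit_ratio(pokes))
print(cache_gaps(pokes, auth_ttl=300))
```

In this sketch, several hits on the same cache entry produce the same expiry time and therefore contribute no gap. Only the first hit on a fresh entry yields an estimate of how long the record went unrequested after the previous entry expired.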
