The emergence of peer-to-peer networks over the last decade has changed users' perspective on the information available on the Web. However, with thousands of nodes joining and leaving a peer-to-peer network within a short span of time, it has become practically impossible for a node (or peer) to keep track of the complete network. Often, though, a node needs at least an estimate of the number of nodes in such a network. For example, to determine the time-to-live for a search query packet, a node must have a good estimate of the network size. Previous deterministic approaches require a complete walk of the network, since such networks usually lack a central authority. Such approaches therefore do not scale well to large networks. A few approaches that collect partial information about the network have been proposed as an alternative to address the scalability issues. This paper presents a novel approach for estimating the size of a peer-to-peer network. The basic idea is to sample nodes in the network and then use this partial information to estimate the network size with the capture-recapture method. The capture-recapture method is a statistical method that has been widely used to estimate the size of a closed population in oceanography and epidemiology. For a better estimate, the capture-recapture method requires two or more random (independent) samplings (sets of detected nodes) of the network. In our case, we obtain independent samples using random walks on the peer-to-peer network, since a random walk can achieve the same statistical properties as independent sampling for a peer-to-peer network (see Gkantsidis et al. [1]). Experimental results show that the proposed random-walk-based capture-recapture approach gives a good estimate of network size.
In addition, results of applying the proposed method and three other size estimation methods to scale-free and random networks show that the proposed algorithm gives a better estimate (lower error) at the cost of a slight computational overhead. This research motivates further study of estimation techniques for open networks (i.e., networks whose size changes during the estimation process).
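The sample-then-estimate idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the topology generator `make_graph`, all helper names, and the parameter values are hypothetical. The walk uses a Metropolis-Hastings acceptance step, an assumption made here so that sampled nodes are approximately uniform over the network, and the estimate uses the Chapman bias-corrected form of the two-sample Lincoln-Petersen estimator (N ≈ n1·n2/m, where m is the number of nodes seen in both samples).

```python
import random


def make_graph(n, extra_edges, rng):
    """Hypothetical test topology: a ring (guarantees connectivity) plus random chords."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        w = (v + 1) % n
        adj[v].add(w)
        adj[w].add(v)
    for _ in range(extra_edges):
        a, b = rng.sample(range(n), 2)
        adj[a].add(b)
        adj[b].add(a)
    # Store neighbours as lists so rng.choice can index them.
    return {v: sorted(nbrs) for v, nbrs in adj.items()}


def walk_sample(adj, start, steps, rng):
    """Return the node reached after `steps` Metropolis-Hastings walk steps."""
    v = start
    for _ in range(steps):
        w = rng.choice(adj[v])
        # Accept with probability min(1, deg(v)/deg(w)); this removes the
        # bias of a plain random walk towards high-degree nodes.
        if rng.random() < len(adj[v]) / len(adj[w]):
            v = w
    return v


def capture(adj, start, sample_size, steps, rng):
    """One 'capture': the set of distinct nodes reached by independent walks."""
    return {walk_sample(adj, start, steps, rng) for _ in range(sample_size)}


def lincoln_petersen(first, second):
    """Chapman-corrected two-sample estimate of population size."""
    recaptured = len(first & second)
    return (len(first) + 1) * (len(second) + 1) / (recaptured + 1) - 1


if __name__ == "__main__":
    rng = random.Random(1)
    adj = make_graph(500, 1000, rng)          # true size: 500 nodes
    first = capture(adj, 0, 150, 50, rng)     # capture phase
    second = capture(adj, 0, 150, 50, rng)    # recapture phase
    print("estimated size:", round(lincoln_petersen(first, second)))
```

The walk length (`steps`) and the sample size trade accuracy against message overhead: walks that are too short yield correlated samples near the start node, and samples that are too small yield few recaptures and hence a noisy estimate.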
[1] J. Darroch. The Multiple-Recapture Census I. Estimation of a Closed Population. 1958.
[2] G. Seber. The estimation of animal abundance and related parameters. 1974.
[3] J. Wittes, et al. The estimation of false negatives in medical screening. Biometrics, 1978.
[4] Bernard M. Waxman, et al. Routing of multipoint connections. IEEE J. Sel. Areas Commun., 1988.
[5] Arthur L. Liestman, et al. A survey of gossiping and broadcasting in communication networks. Networks, 1988.
[6] Albert, et al. Emergence of scaling in random networks. Science, 1999.
[7] Ibrahim Matta, et al. On the origin of power laws in Internet topologies. CCRV, 2000.
[8] Richard M. Karp, et al. Randomized rumor spreading. Proceedings 41st Annual Symposium on Foundations of Computer Science, 2000.
[9] Lada A. Adamic, et al. Search in Power-Law Networks. Physical review. E, Statistical, nonlinear, and soft matter physics, 2001.
[10] László Lovász, et al. Random Walks on Graphs: A Survey. 1993.
[11] Anukool Lakhina, et al. BRITE: Universal Topology Generation from a User's Perspective. 2001.
[12] Walter Willinger, et al. Network topology generators: degree-based vs. structural. SIGCOMM '02, 2002.
[13] Ian T. Foster, et al. Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design. ArXiv, 2002.
[14] Dahlia Malkhi, et al. Estimating network size from local information. Information Processing Letters, 2003.
[15] Mark E. J. Newman, et al. The Structure and Function of Complex Networks. SIAM Rev., 2003.
[16] M -. Estimating Aggregates on a Peer-to-Peer Network. 2003.
[17] Indranil Gupta. Practical Algorithms for Size Estimation in Large and Dynamic Groups. 2004.
[18] Christos Gkantsidis, et al. Random walks in peer-to-peer networks. IEEE INFOCOM 2004, 2004.
[19] L. Asz. Random Walks on Graphs: a Survey. 2022.