The role of network structure has grown in significance over the past ten years in the field of information retrieval, stimulated to a great extent by the importance of link analysis in the development of Web search techniques [4]. This body of work has focused primarily on the network that is most clearly visible on the Web: the network of hyperlinks connecting documents to documents. But the Web has always contained a second network, less explicit but equally important, and this is the social network of its users, with latent person-to-person links encoding a variety of relationships including friendship, information exchange, and influence. Developments over the past few years --- including the emergence of social networking systems and rich social media, as well as the availability of large-scale e-mail and instant messaging datasets --- have highlighted the crucial role played by on-line social networks, and at the same time have made them much easier to uncover and analyze. There is now a considerable opportunity to exploit the information content inherent in these networks, and this prospect raises a number of interesting research challenges.
Within this context, we focus on some recent efforts to formalize the problem of searching a social network. The goal is to capture the issues underlying a variety of related scenarios: a member of a social networking system such as MySpace seeks a piece of information that may be held by a friend of a friend [27, 28]; an employee in a large company searches his or her network of colleagues for expertise in a particular subject [9]; a node in a decentralized peer-to-peer file-sharing system queries for a file that is likely to be a small number of hops away [2, 6, 16, 17]; or a user in a distributed IR or federated search setting traverses a network of distributed resources connected by links that may not just be informational but also economic or contractual [3, 5, 7, 8, 13, 18, 21].
In their most basic forms, these scenarios have some essential features in common: a node in a network, without global knowledge, must find a short path to a desired "target" node (or to one of several possible target nodes).
To frame the underlying problem, we go back to one of the best-known pieces of empirical social network analysis --- Stanley Milgram's research into the small-world phenomenon, also known as the "six degrees of separation" [19, 24, 25]. The form of Milgram's experiments, in which randomly chosen starters had to forward a letter to a designated target individual, established not just that short chains connecting far-flung pairs of people are abundant in large social networks, but also that the individuals in these networks, operating with purely local information about their own friends and acquaintances, are actually able to find these chains [10]. The Milgram experiments thus constituted perhaps the earliest indication that large-scale social networks are structured to support this type of decentralized search. Within a family of random-graph models proposed by Watts and Strogatz [26], we have shown that the ability of a network to support this type of decentralized search depends in subtle ways on how its "long-range" connections are correlated with the underlying spatial or organizational structure in which it is embedded [10, 11].
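To make the notion of decentralized search concrete, the following is a minimal sketch rather than the model analyzed in the cited work. It assumes a ring of `n` nodes in which each node knows its two adjacent nodes plus one "long-range" contact, and it assumes that the long-range contact is drawn with probability proportional to distance raised to the power `-r`. The ring layout, the link rule, the parameter `r`, and all function names are illustrative assumptions. Each node forwards the message to whichever of its own contacts lies closest to the target, using no global knowledge.

```python
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def build_network(n, r, rng):
    """Ring of n nodes; every node also gets one long-range contact.

    Assumption for illustration: the contact at ring distance d is chosen
    with probability proportional to d ** -r.
    """
    offsets = list(range(1, n))
    weights = [ring_distance(0, k, n) ** -r for k in offsets]
    contacts = {}
    for u in range(n):
        k = rng.choices(offsets, weights=weights)[0]
        contacts[u] = [(u - 1) % n, (u + 1) % n, (u + k) % n]
    return contacts


def greedy_search(contacts, source, target, n):
    """Forward to the known contact closest to the target; return hop count.

    Only the current node's own contact list is consulted at each step.
    A ring neighbor always reduces the distance by one, so the loop ends.
    """
    current, hops = source, 0
    while current != target:
        current = min(contacts[current],
                      key=lambda v: ring_distance(v, target, n))
        hops += 1
    return hops


# Compare average greedy path length for a few values of the exponent r.
rng = random.Random(0)
n = 1000
for r in (0.0, 1.0, 2.0):
    net = build_network(n, r, rng)
    trials = [greedy_search(net, rng.randrange(n), rng.randrange(n), n)
              for _ in range(200)]
    print(r, sum(trials) / len(trials))
```

Varying `r` changes how strongly the long-range links follow the underlying ring geometry, which is one simple way to probe how search performance depends on that correlation.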
Recent studies using data on communication within organizations [1] and the friendships within large on-line communities [15] have established the striking fact that real social networks closely match some of the structural features predicted by these mathematical models.
If one looks further at the on-line settings that provide the initial motivation for these issues, there is clearly interest from many directions in their long-term economic implications --- essentially, the consequences that follow from viewing distributed information retrieval applications, peer-to-peer systems, or social-networking sites as providing marketplaces for information and services. How does the problem of decentralized search in a network change when the participants are not simply agents following a fixed algorithm, but strategic actors who make decisions in their own self-interest, and may demand compensation for taking part in a protocol? Such considerations bring us into the realm of algorithmic game theory, an active area of current research that uses game-theoretic notions to quantify the performance of systems in which the participants follow their own self-interest [20, 23]. In a simple model for decentralized search in the presence of incentives, we find that performance depends crucially on both the rarity of the information and the richness of the network topology [12] --- if the network is too structurally impoverished, an enormous investment may be required to produce a path from a query to an answer.
[1] Nick Craswell, et al. Methods for Distributed Information Retrieval. 2000.
[2] Bin Yu, et al. An incentive mechanism for message relaying in unstructured peer-to-peer systems. AAMAS '07, 2007.
[3] Edward A. Fox, et al. Harvesting: Broadening the Field of Distributed Information Retrieval. Distributed Multimedia Information Retrieval, 2003.
[4] Lada A. Adamic, et al. How to search a social network. Soc. Networks, 2005.
[5] Bart Selman, et al. Referral Web: combining social networks and collaborative filtering. CACM, 1997.
[6] Hector Garcia-Molina, et al. Routing indices for peer-to-peer systems. Proceedings 22nd International Conference on Distributed Computing Systems, 2002.
[7] Éva Tardos, et al. Network games. STOC '04, 2004.
[8] Jon Crowcroft, et al. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys & Tutorials, 2005.
[9] Christos H. Papadimitriou, et al. Algorithms, games, and the internet. STOC '01, 2001.
[10] Chaomei Chen, et al. Mining the Web: Discovering knowledge from hypertext data. J. Assoc. Inf. Sci. Technol., 2004.
[11] Jamie Callan, et al. Distributed Information Retrieval. 2002.
[12] Luo Si, et al. Modeling search engine effectiveness for federated search. SIGIR '05, 2005.
[13] K. Sycara, et al. An incentive mechanism for message relaying in peer-to-peer discovery. 2004.
[14] Sharon L. Milgram, et al. The Small World Problem. 1967.
[15] P. Gács, et al. Algorithms. 1992.
[16] David D. Jensen, et al. Decentralized Search in Networks Using Homophily and Degree Disparity. IJCAI, 2005.
[17] Duncan J. Watts, et al. Six Degrees: The Science of a Connected Age. 2003.
[18] Herbert Van de Sompel, et al. The open archives initiative: building a low-barrier interoperability framework. JCDL '01, 2001.
[19] James Aspnes, et al. Distributed Data Structures for Peer-to-Peer Systems. Handbook on Theoretical and Algorithmic Aspects of Sensor, Ad Hoc Wireless, and Peer-to-Peer Networks, 2005.
[20] Jun Zhang, et al. SWIM: fostering social network based information search. CHI EA '04, 2004.
[21] Prabhakar Raghavan. Query Incentive Networks. ASIAN, 2005.
[22] Stanley Milgram, et al. An Experimental Study of the Small World Problem. 1969.
[23] King-Lup Liu, et al. Building efficient and effective metasearch engines. CSUR, 2002.
[24] Luis Gravano, et al. QProber: A system for automatic classification of hidden-Web databases. TOIS, 2003.
[25] Munindar P. Singh, et al. Searching social networks. AAMAS '03, 2003.
[26] Duncan J. Watts, et al. Collective dynamics of ‘small-world’ networks. Nature, 1998.
[27] Jasmine Novak, et al. Geographic routing in social networks. Proc. Natl. Acad. Sci. USA, 2005.
[28] Jie Lu, et al. Federated Search of Text-Based Digital Libraries in Hierarchical Peer-to-Peer Networks. Workshop on Peer-to-Peer Information Retrieval, 2005.
[29] Jon M. Kleinberg, et al. The small-world phenomenon: an algorithmic perspective. STOC '00, 2000.