DOCTORAL THESIS: EXPERIMENTAL ANALYSIS OF THE SOCIO-ECONOMIC PHENOMENA IN THE BITTORRENT ECOSYSTEM

BitTorrent is the most successful Peer-to-Peer (P2P) application and is responsible for a major portion of Internet traffic. It has been studied extensively using simulations, models and real measurements. Although simulations and models are easier to carry out, they typically simplify the problems under analysis, and in the case of BitTorrent they are likely to miss some of the effects that occur in real swarms. In this thesis we therefore rely on real measurements. In the first part of the thesis we summarise the measurement techniques used so far and use this summary as a basis for designing our own tools, which allow us to perform different types of analysis at different levels of resolution. Using these tools, we collect several large-scale datasets to study different aspects of BitTorrent, with a special focus on socio-economic aspects.

Using our datasets, we first investigate the topology of real BitTorrent swarms and how traffic is actually exchanged among peers. Our analysis shows that the resilience of BitTorrent swarms is lower than that of the corresponding random graphs. We also observe that ISP policies, locality-aware clients and network events (e.g., network congestion) lead to a locality-biased composition of neighbourhoods in the swarms. This means that a peer has more neighbours from its local provider than would be expected from a purely random neighbour selection process. These results are of interest to companies that use BitTorrent for daily operations as well as to ISPs that carry BitTorrent traffic.

In the next part of the thesis we look at BitTorrent from the perspective of the content and the content publishers in major BitTorrent portals. We focus on the factors that seem to drive the popularity of BitTorrent and, as a result, could affect its associated traffic in the Internet. We show that a small fraction of publishers (around 100 users) is responsible for more than two-thirds of the published content.
These publishers can be divided into two groups: (i) profit-driven publishers and (ii) fake publishers. The former group publishes copyrighted, typically very popular content on BitTorrent portals to attract content consumers to its own web sites for financial gain. Removing this group may have a significant impact on the popularity of BitTorrent portals and, as a result, may affect a large portion of the Inter-
