Revisiting the coupon collector’s problem to unveil users’ online sessions in networked systems

Accurate comprehension of users’ behavior is paramount for understanding the dynamics of several systems, such as e-commerce platforms, social networks, and mobile computing. To this end, several strategies have been proposed to obtain data sets based on the capture of usage information, which can then support user analytics. A popular strategy consists of taking periodic snapshots of online users, a practical instance of the coupon collector’s problem tailored to user monitoring in networked systems. Due to system-specific limitations, however, users may fail to appear in some snapshots even though they are online. To bridge this gap, we present a methodology to correct ill-collected snapshots and build more accurate data sets. In summary, we formally model user snapshotting as an instance of the coupon collector’s problem, estimate the probability that some users are missing from a given snapshot by treating snapshot captures as a Bernoulli process, and correct a snapshot whenever that probability exceeds a given threshold.
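The correction step summarized above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes, purely for illustration, that every online user is captured in each snapshot independently with a fixed probability `p_observe` (one Bernoulli trial per snapshot), so an online user is missed in `gap` consecutive snapshots with probability `(1 - p_observe) ** gap`. The function name `correct_snapshots` and the parameters `p_observe` and `threshold` are hypothetical.

```python
def correct_snapshots(snapshots, p_observe, threshold):
    """Fill likely capture failures in a sequence of user snapshots.

    snapshots: list of sets of user IDs, one set per periodic snapshot.
    p_observe: ASSUMED probability that an online user is captured in a
               single snapshot (independent Bernoulli trial per snapshot).
    threshold: minimum probability of a pure capture failure required
               before a gap is treated as "online but missed".
    Returns new sets; the input snapshots are left unmodified.
    """
    corrected = [set(s) for s in snapshots]
    last_seen = {}  # user ID -> index of the last snapshot that captured it

    for t, snap in enumerate(snapshots):
        for user in snap:
            prev = last_seen.get(user)
            if prev is not None:
                gap = t - prev - 1  # snapshots in which the user is absent
                # Probability that an online user is missed `gap` times in a row.
                p_missed = (1.0 - p_observe) ** gap
                if gap > 0 and p_missed > threshold:
                    # Plausibly one continuous session: add the user back.
                    for k in range(prev + 1, t):
                        corrected[k].add(user)
            last_seen[user] = t
    return corrected


# Hypothetical data: user "b" vanishes once for 1 snapshot and once for 3.
snapshots = [{"a", "b"}, {"a"}, {"a", "b"}, {"a"}, {"a"}, {"a"}, {"a", "b"}]
corrected = correct_snapshots(snapshots, p_observe=0.7, threshold=0.05)
print([sorted(s) for s in corrected])
```

With these illustrative values, the one-snapshot gap is filled (a miss probability of about 0.3 exceeds 0.05), while the three-snapshot gap is left alone (about 0.027 falls below 0.05) and is read as the user having gone offline. A realistic deployment would have to estimate the capture probability from the data rather than fix it by hand.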
