CBL: Exploiting Community Based Locality for Efficient Content Search Service in Online Social Networks

Retrieving relevant data for users in online social network (OSN) systems is a challenging problem. Cassandra, a storage system used by popular OSN systems such as Facebook and Twitter, relies on a DHT-based scheme to randomly partition users' personal data among servers across multiple data centers. Although a DHT scales well to a large number of users and their personal data, the dense interconnection and interaction among OSN users lead to costly inter-server communication across data centers. In this paper, we explore how to retrieve OSN content in a cost-effective way while retaining the simple and robust nature of OSNs. Our approach exploits a simple yet powerful principle called Community-Based Locality (CBL), which posits that if a user has a one-hop neighbor within a particular community, the user is very likely to have other one-hop neighbors inside the same community. We demonstrate the existence of community-based locality in diverse traces of popular OSN systems such as Facebook, Orkut, Flickr, YouTube, and LiveJournal. Based on this observation, we design a CBL-based algorithm to build the content index in OSN systems. By partitioning and indexing the relevant data of users within a community on the same server in the data center, the CBL-based index avoids a significant amount of inter-server communication during search, making the retrieval of relevant data for a user in large-scale OSNs efficient. In addition, the CBL-based scheme provides much faster search responses and balanced load. We conduct comprehensive trace-driven simulations to evaluate the performance of the proposed scheme. Results show that our scheme reduces network traffic by 73 percent and query latency by 35 percent compared with existing schemes.
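To make the CBL principle and its effect on data placement concrete, the following is a minimal sketch on a hypothetical toy graph, not the algorithm from the paper. It assumes two four-user communities that are fully connected internally and joined by a single cross-community edge, two servers, and a user-id modulus as a deterministic stand-in for DHT-style random placement. All function names (`build_friends`, `cbl_ratio`, `remote_servers`) are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def build_friends(edges):
    """Build an undirected adjacency map {user: set of one-hop neighbors}."""
    friends = {}
    for a, b in edges:
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)
    return friends

def cbl_ratio(friends, community):
    """Among (user, community) pairs where the user has at least one
    neighbor in that community, return the fraction with two or more."""
    pairs = with_more = 0
    for user, nbrs in friends.items():
        counts = Counter(community[n] for n in nbrs)
        pairs += len(counts)
        with_more += sum(1 for c in counts.values() if c >= 2)
    return with_more / pairs

def remote_servers(friends, placement):
    """Sum, over all users, of the distinct non-home servers that must be
    contacted to fetch the data of the user's one-hop neighbors."""
    total = 0
    for user, nbrs in friends.items():
        servers = {placement[n] for n in nbrs}
        servers.discard(placement[user])
        total += len(servers)
    return total

# Hypothetical toy graph: users 0-3 form community 0, users 4-7 form
# community 1, and the edge (3, 4) is the only cross-community link.
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2)) + [(3, 4)]
friends = build_friends(edges)
community = {u: 0 if u < 4 else 1 for u in friends}
NUM_SERVERS = 2  # assumed value for illustration

# Stand-in for DHT placement: spread users by id, ignoring communities.
hash_placement = {u: u % NUM_SERVERS for u in friends}
# Community-based placement: co-locate every member of a community.
cbl_placement = {u: community[u] % NUM_SERVERS for u in friends}

locality = cbl_ratio(friends, community)
hash_cost = remote_servers(friends, hash_placement)
cbl_cost = remote_servers(friends, cbl_placement)
print(locality, hash_cost, cbl_cost)
```

On this toy graph, only the two users on the cross-community edge need a remote server under community-based placement, whereas every user does under the id-based placement. Real traces are far less clean, so the sketch illustrates the direction of the effect rather than the magnitudes reported above.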
