T-PriDO: A Tree-based Privacy-Preserving and Contextual Collaborative Online Big Data Processing System

The emergence of the big data era has spurred the development of online big data processing systems. However, existing work seldom takes privacy into account, and the impact of the network structure among service providers is usually ignored. In this paper, we propose a cloud-based big data processing framework in which service providers are modeled as distributed cooperative learners that predict users' preferences for items based on the users' contexts and adapt their decision-making strategy based on the users' rewards. We build an item-cluster tree from the top down to handle big data analysis. Service providers share information over a social network to carry out collaborative learning. Taking the structure of the social network among service providers into account, we also propose an adaptive algorithm to reduce the performance loss. Theoretical analysis shows that our proposal achieves sublinear regret and differential privacy for both network service providers and users. Experimental results validate that the proposed algorithms scale to growing datasets while striking a balance between the level of privacy preservation and prediction accuracy.
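To make the two main ingredients more concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes rewards lie in [0, 1], uses the standard Laplace mechanism to privatize a mean reward before a provider shares it, and uses a UCB1-style index to pick among item clusters at one level of a tree. All function names (`laplace_noise`, `privatize_mean`, `ucb_index`, `select_cluster`) are hypothetical, and the number of rewards is treated as public.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def privatize_mean(rewards, epsilon, rng):
    """Release the mean of rewards in [0, 1] under epsilon-differential privacy.

    Assumption: replacing one reward changes the mean by at most 1/n,
    so the Laplace scale is (1/n) / epsilon.
    """
    n = len(rewards)
    sensitivity = 1.0 / n
    return sum(rewards) / n + laplace_noise(sensitivity / epsilon, rng)


def ucb_index(mean, pulls, t):
    """UCB1-style index: estimated mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def select_cluster(stats, t):
    """Pick a cluster from stats = {name: (mean, pulls)} at round t >= 1.

    Unexplored clusters are tried first; otherwise the highest index wins.
    """
    for name, (_, pulls) in stats.items():
        if pulls == 0:
            return name
    return max(stats, key=lambda name: ucb_index(stats[name][0], stats[name][1], t))


if __name__ == "__main__":
    rng = random.Random(0)
    # A provider privatizes its local mean reward for a cluster before sharing it.
    shared = privatize_mean([1.0, 0.0, 1.0, 1.0], epsilon=0.5, rng=rng)
    stats = {"cluster_a": (shared, 4), "cluster_b": (0.2, 4)}
    print(select_cluster(stats, t=8))
```

A smaller `epsilon` adds more noise to the shared statistic, which strengthens privacy but makes the cluster choice less reliable; this is the kind of privacy versus accuracy trade-off the abstract refers to.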
