Service Integrity Assurance for Distributed Computation Outsourcing

We propose a method to ensure service integrity for distributed computation outsourcing in contexts such as volunteer computing, grid computing, MapReduce computing, and crowdsourcing. The method replicates each computation among multiple workers and manipulates the inputs of the replicas, either by mixing them or by adding noise. Manipulating the inputs of replicated computations, an idea introduced here for the first time, prevents attacks by cheating or malicious workers, even when these workers collude. Mixing inputs prevents colluding cheaters from easily detecting replication, so they do not return random or partial results. Adding noise to the inputs prevents both cheating and malicious workers from detecting replication, even under collusion, so they do not return incorrect output. We evaluate the proposed method with a game-theoretic approach and provide a simulation environment to measure its overhead. We demonstrate that the method can guarantee high accuracy with low overhead, even when malicious workers constitute the majority of the workers. For instance, it can ensure high accuracy with 61.27 percent overhead when 60 percent of the workers are malicious.
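As a rough illustration of replication with input mixing, the following Python sketch batches each input chunk differently in every replication round, so two workers rarely receive identical tasks, and then cross-checks the per-chunk results. It is a minimal sketch under stated assumptions, not the algorithm evaluated in the paper: the function names (compute, make_mixed_tasks, cross_check, run_demo), the sum-of-squares workload, the batch size, and the "all replicas must agree" acceptance rule are all hypothetical choices made for this example. Noise addition is not shown.

```python
import random


def compute(chunk):
    """Hypothetical per-chunk computation (sum of squares), assumed for this example."""
    return sum(x * x for x in chunk)


def make_mixed_tasks(num_chunks, replicas, batch_size, rng):
    """Build tasks so every chunk appears once per replication round,
    batched with different neighbours each round (the 'mixing' step)."""
    tasks = []
    for _ in range(replicas):
        ids = list(range(num_chunks))
        rng.shuffle(ids)  # a different mix of chunks in every round
        for start in range(0, num_chunks, batch_size):
            tasks.append(ids[start:start + batch_size])
    return tasks


def honest_worker(task, chunks):
    """Computes every chunk in the task correctly."""
    return {cid: compute(chunks[cid]) for cid in task}


def lazy_worker(task, chunks):
    """Cheater model (assumed): returns a fabricated value without computing."""
    return {cid: -1 for cid in task}


def cross_check(task_results):
    """Accept a chunk only if all of its replicas agree; flag it otherwise."""
    seen = {}
    for result in task_results:
        for cid, value in result.items():
            seen.setdefault(cid, set()).add(value)
    accepted = {cid: next(iter(v)) for cid, v in seen.items() if len(v) == 1}
    flagged = sorted(cid for cid, v in seen.items() if len(v) > 1)
    return accepted, flagged


def run_demo(cheating_task_index=None, seed=0):
    """Run two replication rounds over six chunks; optionally one task is
    handled by the lazy worker."""
    rng = random.Random(seed)
    chunks = [[i, i + 1, i + 2] for i in range(6)]
    tasks = make_mixed_tasks(len(chunks), replicas=2, batch_size=3, rng=rng)
    results = []
    for index, task in enumerate(tasks):
        worker = lazy_worker if index == cheating_task_index else honest_worker
        results.append(worker(task, chunks))
    return cross_check(results)


if __name__ == "__main__":
    print(run_demo())                       # all workers honest
    print(run_demo(cheating_task_index=0))  # one task returned by a cheater
```

In this toy setup, a cheater who fabricates the results of one task is exposed because each of its chunks is also computed in another round inside a differently composed batch. The agreement rule used here only detects inconsistencies; it does not by itself decide which replica is correct.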
