Secure Tera-scale Data Crunching with a Small TCB

Outsourcing services to third-party providers carries a high security cost: the client must fully trust the provider. Trusted hardware can help, but current trusted execution environments do not adequately support services that process very large-scale datasets. We present LAST-GT, a system that bridges this gap by supporting the execution of self-contained services over a large state with a small, generic trusted computing base (TCB). LAST-GT uses widely deployed trusted hardware to guarantee the integrity and verifiability of execution on a remote platform, and it securely supplies data to the service through simple techniques based on virtual memory. As a result, LAST-GT is general and applicable to many scenarios, such as computational genomics and databases, as we show in an experimental evaluation based on an implementation of LAST-GT on a secure hypervisor. We also describe a possible implementation on Intel SGX.
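
The abstract does not spell out how data is supplied to the service, so the following is only a minimal sketch of one plausible reading: state is split into pages that are loaded on demand, and each page is checked against a trusted digest before the service sees it. All names here (`PAGE_SIZE`, `UntrustedStore`, `VerifiedState`, `page_digest`) are hypothetical and are not taken from LAST-GT; a real system would do this at the page-fault level inside the TCB rather than in application code.

```python
import hashlib

PAGE_SIZE = 4096  # hypothetical page size, chosen for illustration


def page_digest(index: int, data: bytes) -> bytes:
    """Digest bound to the page index, so pages cannot be swapped around."""
    return hashlib.sha256(index.to_bytes(8, "big") + data).digest()


class UntrustedStore:
    """Stands in for storage controlled by the untrusted provider."""

    def __init__(self, blob: bytes):
        self.pages = [blob[i:i + PAGE_SIZE] for i in range(0, len(blob), PAGE_SIZE)]

    def read_page(self, index: int) -> bytes:
        return self.pages[index]


class VerifiedState:
    """Loads pages lazily and verifies each one before use (assumed design)."""

    def __init__(self, store: UntrustedStore, trusted_digests: list):
        self.store = store
        self.digests = trusted_digests  # assumed to be obtained via a trusted path
        self.resident = {}              # pages already verified and mapped in

    def _fault_in(self, index: int) -> bytes:
        # Emulates a page fault: fetch from untrusted storage, then verify.
        if index not in self.resident:
            data = self.store.read_page(index)
            if page_digest(index, data) != self.digests[index]:
                raise ValueError(f"integrity check failed for page {index}")
            self.resident[index] = data
        return self.resident[index]

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, PAGE_SIZE)
            page = self._fault_in(index)
            take = min(end - offset, PAGE_SIZE - start)
            out += page[start:start + take]
            offset += take
        return bytes(out)
```

In this sketch only the pages a read actually touches are fetched and hashed, which is the property that lets the trusted side stay small even when the total state is very large.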
