VoiceGuard: Secure and Private Speech Processing

With the advent of smart-home devices providing voice-based interfaces, such as Amazon Alexa or Apple Siri, voice data is constantly transferred to cloud services for automated speech recognition or speaker verification. While this development enables intriguing new applications, it also poses significant risks: Voice data is highly sensitive since it contains biometric information of the speaker as well as the spoken words. This data may be abused if not protected properly, thus the security and privacy of billions of end-users is at stake. We tackle this challenge by proposing an architecture, dubbed VoiceGuard, that efficiently protects the speech processing task inside a trusted execution environment (TEE). Our solution preserves the privacy of users while at the same time it does not require the service provider to reveal model parameters. Our architecture can be extended to enable user-specific models, such as feature transformations (including fMLLR), i-vectors, or model transformations (e.g., custom output layers). It also generalizes to secure on-premise solutions, allowing vendors to securely ship their models to customers. We provide a proof-of-concept implementation and evaluate it on the Resource Management and WSJ speech recognition tasks isolated with Intel SGX, a widely available TEE implementation, demonstrating even real time processing capabilities.

[1]  Eric Rescorla,et al.  The Transport Layer Security (TLS) Protocol Version 1.2 , 2008, RFC.

[2]  Sebastian Nowozin,et al.  Oblivious Multi-Party Machine Learning on Trusted Processors , 2016, USENIX Security Symposium.

[3]  Srdjan Capkun,et al.  DR.SGX: Hardening SGX Enclaves against Cache Attacks with Data Location Randomization , 2017, ArXiv.

[4]  Ahmad-Reza Sadeghi,et al.  HardIDX: Practical and Secure Index with SGX , 2017, DBSec.

[5]  Yao Lu,et al.  Oblivious Neural Network Predictions via MiniONN Transformations , 2017, IACR Cryptol. ePrint Arch..

[6]  Paul C. Kocher,et al.  Differential Power Analysis , 1999, CRYPTO.

[7]  Daniel Povey,et al.  The Kaldi Speech Recognition Toolkit , 2011 .

[8]  Payman Mohassel,et al.  SecureML: A System for Scalable Privacy-Preserving Machine Learning , 2017, 2017 IEEE Symposium on Security and Privacy (SP).

[9]  Ahmad-Reza Sadeghi,et al.  Secure Multiparty Computation from SGX , 2017, Financial Cryptography.

[10]  Michael Naehrig,et al.  CryptoNets: applying neural networks to encrypted data with high throughput and accuracy , 2016, ICML 2016.

[11]  Juan del Cuvillo,et al.  Using innovative instructions to create trustworthy software solutions , 2013, HASP '13.

[12]  Michael K. Reiter,et al.  Detecting Privileged Side-Channel Attacks in Shielded Execution with Déjà Vu , 2017, AsiaCCS.

[13]  Ittai Anati,et al.  Innovative Technology for CPU Based Attestation and Sealing , 2013 .

[14]  Vitaly Shmatikov,et al.  Chiron: Privacy-preserving Machine Learning as a Service , 2018, ArXiv.

[15]  Marcus Peinado,et al.  T-SGX: Eradicating Controlled-Channel Attacks Against Enclave Programs , 2017, NDSS.

[16]  Farinaz Koushanfar,et al.  Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications , 2018, IACR Cryptol. ePrint Arch..

[17]  Bhavani M. Thuraisingham,et al.  Securing Data Analytics on SGX with Randomization , 2017, ESORICS.

[18]  Aseem Rastogi,et al.  EzPC: Programmable, Efficient, and Scalable Secure Two-Party Computation , 2018, IACR Cryptol. ePrint Arch..

[19]  Muttukrishnan Rajarajan,et al.  Privacy preserving encrypted phonetic search of speech data , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[20]  Anantha Chandrakasan,et al.  Gazelle: A Low Latency Framework for Secure Neural Network Inference , 2018, IACR Cryptol. ePrint Arch..

[21]  Donald E. Porter,et al.  Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX , 2017, USENIX Annual Technical Conference.

[22]  Adi Shamir,et al.  A method for obtaining digital signatures and public-key cryptosystems , 1978, CACM.

[23]  Hassan Takabi,et al.  CryptoDL: Deep Neural Networks over Encrypted Data , 2017, ArXiv.

[24]  Bhiksha Raj,et al.  Privacy-preserving speech processing: cryptographic and string-matching frameworks show promise , 2013, IEEE Signal Processing Magazine.

[25]  Maria Zhdanova,et al.  Time to Rethink: Trust Brokerage Using Trusted Execution Environments , 2015, TRUST.

[26]  Andrew C. Simpson,et al.  Exploring the use of Intel SGX for Secure Many-Party Applications , 2016, SysTEX@Middleware.

[27]  Carlos V. Rozas,et al.  Innovative instructions and software model for isolated execution , 2013, HASP '13.

[28]  Farinaz Koushanfar,et al.  DeepSecure: Scalable Provably-Secure Deep Learning , 2017, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).