SoK: Modular and Efficient Private Decision Tree Evaluation

Abstract: Decision trees and random forests are widely used classifiers in machine learning. Service providers often host classification models in a cloud service and provide an interface for clients to use the model remotely. The model is sensitive information of the server, while the input query and the prediction result are sensitive information of the client. This motivates the need for private decision tree evaluation, where the service provider does not learn the client's input, and the client learns nothing about the model except its size and the result. In this work, we identify the three phases of private decision tree evaluation protocols: feature selection, comparison, and path evaluation. We systematize constant-round protocols for each of these phases to identify the best available instantiations using the two main paradigms for secure computation: garbling techniques and homomorphic encryption. There is a natural tradeoff between runtime and communication for these two paradigms: garbling techniques use fast symmetric-key operations but require a large amount of communication, while homomorphic encryption is computationally heavy but requires little communication. Our contributions are as follows. First, we systematically review and analyse state-of-the-art protocols for the three phases of private decision tree evaluation. Our methodology allows us to identify novel combinations of these protocols that provide better tradeoffs than existing protocols. Second, we empirically evaluate all combinations of these protocols by measuring their communication and runtime, and we provide recommendations based on the concrete tradeoffs identified.
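The three phases named in the abstract can be illustrated with a plaintext sketch. The code below is not private and is not one of the systematized protocols; it only shows how evaluating a tree splits into feature selection, comparison, and path evaluation. The tree layout (a complete binary tree in heap order), the "less than threshold goes left" convention, and all function names are assumptions made for this illustration. In a private protocol, each function would be replaced by a garbling-based or homomorphic-encryption-based instantiation.

```python
from typing import List, Sequence, Tuple

# Hypothetical model layout (an assumption for this sketch): a complete binary
# tree whose decision nodes are stored in heap order, so the children of
# node i are 2*i + 1 (left) and 2*i + 2 (right).
DecisionNode = Tuple[int, float]  # (feature index, threshold)


def select_features(nodes: Sequence[DecisionNode], x: Sequence[float]) -> List[float]:
    """Phase 1: pick, for every decision node, the input feature it tests."""
    return [x[feature_index] for feature_index, _ in nodes]


def compare(nodes: Sequence[DecisionNode], selected: Sequence[float]) -> List[int]:
    """Phase 2: compare each selected feature with its node's threshold.

    Bit 1 means "go left" (value < threshold), bit 0 means "go right".
    """
    return [int(value < threshold) for (_, threshold), value in zip(nodes, selected)]


def evaluate_path(bits: Sequence[int], leaves: Sequence[str]) -> str:
    """Phase 3: follow the comparison bits from the root to a leaf."""
    n = len(bits)  # number of decision nodes; leaves occupy heap slots n..2n
    i = 0
    while i < n:
        i = 2 * i + 1 if bits[i] else 2 * i + 2
    return leaves[i - n]


def classify(nodes: Sequence[DecisionNode], leaves: Sequence[str], x: Sequence[float]) -> str:
    """Run the three phases in sequence on a plaintext input."""
    selected = select_features(nodes, x)
    bits = compare(nodes, selected)
    return evaluate_path(bits, leaves)


# Depth-2 example tree: 3 decision nodes, 4 leaves.
nodes = [(0, 5.0), (1, 2.0), (1, 7.0)]
leaves = ["A", "B", "C", "D"]
print(classify(nodes, leaves, [3.0, 9.0]))  # B
print(classify(nodes, leaves, [6.0, 1.0]))  # C
```

Note that the sketch evaluates the comparison at every decision node, not only at those on the taken path. This choice mirrors the requirement stated above that the client learns nothing about the model beyond its size and the result, although the plaintext code itself hides nothing.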
