PPCA: Privacy-preserving Principal Component Analysis Using Secure Multiparty Computation (MPC)

Privacy-preserving data mining has become an important topic. Although several multi-party computation (MPC)-based frameworks provide theoretically guaranteed privacy, the poor performance of real-world algorithms on these frameworks has always been a challenge. Using Principal Component Analysis (PCA) as an example, we show that by considering the unique performance characteristics of the MPC platform, we can design highly effective algorithm-level optimizations, such as replacing expensive operators and batching operations. We achieve an approximately 200× performance boost over existing privacy-preserving PCA algorithms with the same level of privacy guarantee. Using real-world datasets, we also show that combining data from multiple parties yields better training results.
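For orientation, the computation that a privacy-preserving PCA has to reproduce can be sketched in plaintext. The snippet below is a minimal NumPy illustration, not the authors' MPC protocol: it provides no privacy at all, and the function name `pca_plain`, the two synthetic "party" arrays, and the choice of a horizontal (row-wise) split are all assumptions made for the example.

```python
import numpy as np

def pca_plain(X, k):
    """Plaintext PCA: top-k eigenvalues and principal components of X's rows."""
    Xc = X - X.mean(axis=0)              # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)   # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)     # symmetric solver, ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]   # indices of the k largest eigenvalues
    return vals[order], vecs[:, order]

rng = np.random.default_rng(0)
# Hypothetical data: two parties hold different rows over the same 4 features.
party_a = rng.normal(size=(50, 4))
party_b = rng.normal(size=(70, 4))
combined = np.vstack([party_a, party_b])  # in an MPC setting this join would stay secret

vals, comps = pca_plain(combined, k=2)
projected = (combined - combined.mean(axis=0)) @ comps  # reduced representation
```

In an MPC setting, each step above (centering, the covariance product, and the eigendecomposition) would be carried out on secret-shared values, which is where operator cost and batching become the dominant concerns.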
