A compressive multi-kernel method for privacy-preserving machine learning

As analytic tools become more powerful and more data are generated every day, data privacy becomes a growing concern, which motivates the design of privacy-preserving machine learning algorithms. Given two objectives, utility maximization and privacy-loss minimization, this work builds on two previously non-intersecting regimes: Compressive Privacy and the multi-kernel method. Compressive Privacy is a privacy framework that employs a utility-preserving lossy-encoding scheme to protect the privacy of the data, while the multi-kernel method is a kernel-based machine learning regime that explores the idea of using multiple kernels to build better predictors. In relation to neural-network architectures, the multi-kernel method can be described as a two-hidden-layer network whose width is proportional to the number of kernels. The proposed compressive multi-kernel method consists of two stages: the compression stage and the multi-kernel stage. The compression stage follows the Compressive Privacy paradigm to provide the desired privacy protection: each kernel matrix is compressed with a lossy projection matrix derived from Discriminant Component Analysis (DCA). The multi-kernel stage uses the signal-to-noise ratio (SNR) score of each kernel to combine the compressive kernels non-uniformly. The proposed method is evaluated on two mobile-sensing datasets, MHEALTH and HAR, where activity recognition is defined as the utility task and person identification as the privacy task. The results show that the compression regime succeeds in preserving privacy, as the privacy classification accuracies are almost at the random-guess level in all experiments. In addition, the novel SNR-based multi-kernel method improves utility classification accuracy over the state of the art on both datasets. These results indicate a promising direction for research in privacy-preserving machine learning.
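The two stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the kernels are RBF kernels with hypothetical bandwidths `gammas`, the rows of each kernel matrix are treated as empirical features, the DCA projection is obtained from a ridge-regularized generalized eigenproblem on the between-class and within-class scatter matrices of the utility labels, and the SNR score of a kernel is taken to be the sum of its retained generalized eigenvalues. The function names (`rbf_kernel`, `dca_projection`, `compressive_multi_kernel`) and the synthetic data are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """RBF kernel matrix exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def dca_projection(F, y, n_components, ridge=1e-3):
    """Discriminant projection of the rows of F with respect to labels y.

    Solves S_b w = lambda (S_w + ridge * I) w and keeps the top directions.
    Returns the projection matrix W and the sum of the retained eigenvalues,
    used here as an assumed SNR score.
    """
    d = F.shape[1]
    mean = F.mean(axis=0)
    S_b = np.zeros((d, d))  # between-class scatter ("signal")
    S_w = np.zeros((d, d))  # within-class scatter ("noise")
    for c in np.unique(y):
        Fc = F[y == c]
        mc = Fc.mean(axis=0)
        diff = (mc - mean)[:, None]
        S_b += len(Fc) * (diff @ diff.T)
        centered = Fc - mc
        S_w += centered.T @ centered
    # Whiten with the Cholesky factor of the regularized within-class scatter
    # so the generalized problem becomes an ordinary symmetric eigenproblem.
    L = np.linalg.cholesky(S_w + ridge * np.eye(d))
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    evals, evecs = np.linalg.eigh((M + M.T) / 2.0)
    top = np.argsort(evals)[::-1][:n_components]
    W = L_inv.T @ evecs[:, top]
    return W, float(np.sum(evals[top]))

def compressive_multi_kernel(X, y_utility, gammas, n_components):
    """Compress each kernel with DCA, then combine with SNR-based weights."""
    compressed, scores = [], []
    for gamma in gammas:
        K = rbf_kernel(X, gamma)
        W, snr = dca_projection(K, y_utility, n_components)
        Z = K @ W                  # lossy compression to n_components dims
        compressed.append(Z @ Z.T) # compressive kernel matrix
        scores.append(snr)
    weights = np.array(scores) / np.sum(scores)  # non-uniform weights
    K_combined = sum(w * Kc for w, Kc in zip(weights, compressed))
    return K_combined, weights

# Synthetic stand-in data: 30 samples, 5 features, 3 utility classes.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 10)
X = rng.normal(size=(30, 5)) + y[:, None]
K_combined, weights = compressive_multi_kernel(
    X, y, gammas=[0.1, 1.0, 10.0], n_components=2
)
print(K_combined.shape, weights)
```

In this sketch the number of retained components is set to the number of utility classes minus one, since the between-class scatter matrix has at most that rank. The combined matrix could then be passed to any classifier that accepts a precomputed kernel.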
