Towards Private and Scalable Cross-Media Retrieval

Cross-media retrieval (CMR) is an attractive networked application in which a server responds to queries with retrieval results of different modalities. Unlike traditional information retrieval, CMR relies on a richer set of machine learning techniques to produce semantic models that project multimodal data into a common space. A larger training dataset usually yields more accurate models and therefore better retrieval results. Although CMR is promising for network analytics and multimedia applications, applying it in such contexts faces severe privacy challenges, because the data scattered among multiple parties may be sensitive and may not be shared publicly. Studies that jointly consider cross-media analytics, privacy protection, collaborative learning, and distributed networking contexts are relatively sparse. In this work, we propose the first practical system for privacy-preserving cross-media retrieval by utilizing trusted processors. Our scheme enables secure aggregation of data from distinct parties and secure canonical correlation analysis (CCA) over the aggregated data to obtain semantic models. Verification mechanisms are designed to defend against active attacks from a malicious adversary. Furthermore, to deal with large datasets, we provide a set of optimization methods that accommodate limited trusted memory and improve the efficiency of the training process in CMR. We consider data block splitting to manage memory overhead, ordering of operations together with parameter reuse and release to simplify I/O, and parallel computation to speed up dual operations. Our experiments over both synthetic and real datasets show that our solution is efficient in practice, outperforms existing solutions, and performs comparably with the original CMR system.
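To make the combination of CCA and data block splitting more concrete, the following is a minimal plaintext sketch in Python with NumPy. It is not the system described above: it omits the trusted processor, encryption, verification, and any protection of memory access patterns, and it only illustrates how the covariance statistics that CCA needs can be accumulated one block at a time so that the full dataset never has to sit in memory at once. The function names (`accumulate_blocks`, `inv_sqrt`, `cca`), the regularization constant `reg`, and the block layout are assumptions made for this illustration.

```python
import numpy as np


def accumulate_blocks(blocks):
    """Accumulate covariance statistics from an iterable of (x_block, y_block).

    Each block holds a slice of rows from the two modalities; only one block
    needs to be resident in memory at a time.
    """
    n = 0
    sum_x = sum_y = sxx = syy = sxy = None
    for xb, yb in blocks:
        if sxx is None:
            p, q = xb.shape[1], yb.shape[1]
            sum_x, sum_y = np.zeros(p), np.zeros(q)
            sxx, syy, sxy = np.zeros((p, p)), np.zeros((q, q)), np.zeros((p, q))
        n += xb.shape[0]
        sum_x += xb.sum(axis=0)
        sum_y += yb.sum(axis=0)
        sxx += xb.T @ xb
        syy += yb.T @ yb
        sxy += xb.T @ yb
    mu_x, mu_y = sum_x / n, sum_y / n
    # Unbiased (cross-)covariances recovered from the raw sums.
    cxx = (sxx - n * np.outer(mu_x, mu_x)) / (n - 1)
    cyy = (syy - n * np.outer(mu_y, mu_y)) / (n - 1)
    cxy = (sxy - n * np.outer(mu_x, mu_y)) / (n - 1)
    return cxx, cyy, cxy


def inv_sqrt(c):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(c)
    return (v / np.sqrt(w)) @ v.T


def cca(cxx, cyy, cxy, k, reg=1e-6):
    """Return projections (wx, wy) and the top-k canonical correlations."""
    kx = inv_sqrt(cxx + reg * np.eye(cxx.shape[0]))
    ky = inv_sqrt(cyy + reg * np.eye(cyy.shape[0]))
    # SVD of the whitened cross-covariance gives the canonical directions.
    u, s, vt = np.linalg.svd(kx @ cxy @ ky)
    return kx @ u[:, :k], ky @ vt.T[:, :k], s[:k]
```

A caller would feed `accumulate_blocks` a generator that yields row-aligned slices of the two feature matrices (for example, image features and text features), then pass the resulting covariances to `cca`. Multiplying each modality's centered features by `wx` or `wy` maps them into the common space in which cross-media similarity is measured.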
