The Audio Auditor: Participant-Level Membership Inference in Voice-Based IoT

Voice interfaces and assistants implemented by various services have become increasingly sophisticated, powered by the growing availability of data. However, users’ audio data must be protected, and data-protection regulations such as the GDPR and COPPA must be enforced. To detect unauthorized use of audio data, we propose an audio auditor that lets users audit speech recognition models. Specifically, users can check whether their audio recordings were included in a model’s training dataset. In this paper, we focus on a DNN-HMM-based automatic speech recognition model using the TIMIT audio data. As a proof of concept, the resulting audio auditor achieves a participant-level membership inference success rate of up to 90% with eight audio samples per user.
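To make the idea of a participant-level audit concrete, the following is a minimal sketch, not the method used in the paper. It assumes the auditor can query the speech recognition model with a user's recordings and compare the returned transcripts with the true ones, and that recordings seen during training tend to be transcribed more accurately. The function names, the word-error-rate feature, and the threshold value are all hypothetical choices made for illustration.

```python
from statistics import mean


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


def audit_user(samples, threshold=0.1):
    """Guess whether a user's audio was part of the training data.

    samples: list of (true_transcript, model_transcript) pairs for one user.
    threshold: hypothetical cut-off on the average word error rate;
               in practice it would have to be calibrated, for example
               on models whose training data is known.
    Returns (is_member_guess, average_word_error_rate).
    """
    avg_wer = mean(word_error_rate(truth, pred) for truth, pred in samples)
    return avg_wer <= threshold, avg_wer


# Illustrative transcripts only; a real audit would use model outputs.
user_samples = [
    ("she had your dark suit", "she had your dark suit"),
    ("do not ask me to carry", "do not ask me to carry"),
]
print(audit_user(user_samples))
```

Aggregating over several recordings from the same user, rather than judging each recording alone, is what makes the decision participant-level; a learned classifier over richer features could replace the fixed threshold used here.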
