Creating Simple Adversarial Examples for Speech Recognition Deep Neural Networks

Deep neural networks are increasingly used for speech recognition and speech-command classification, which makes it necessary to understand the security risks that accompany this technology. This paper analyzes how the performance of neural networks for speech-command recognition can be degraded. With the methods proposed herein, adversarial data can be created simply by overlaying the audio of a command at a nearly imperceptible amplitude. This overlay causes the neural network to lose roughly 20% accuracy and to misclassify commands as other commands with average to high confidence. Such an attack is virtually undetectable to the human ear.
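The low-amplitude overlay described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name `overlay_attack` and the scaling factor `alpha` are assumptions, and the waveforms are taken to be same-rate mono arrays normalized to [-1, 1].

```python
import numpy as np

def overlay_attack(clean, attack, alpha=0.05):
    """Mix an attack waveform into a clean recording at low amplitude.

    clean, attack: 1-D float arrays in [-1, 1] at the same sample rate.
    alpha: hypothetical scaling factor; small values keep the overlaid
           command near-inaudible to a human listener.
    """
    # Pad or trim the attack signal to match the clean recording's length.
    if len(attack) < len(clean):
        attack = np.pad(attack, (0, len(clean) - len(attack)))
    else:
        attack = attack[: len(clean)]
    mixed = clean + alpha * attack
    # Clip so the result remains a valid, playable waveform.
    return np.clip(mixed, -1.0, 1.0)
```

A listener hears essentially the original recording, while the classifier receives a perturbed input that can be misidentified as the overlaid command.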
