Exploring the Impact of Data Poisoning Attacks on Machine Learning Model Reliability

Abstract: Recent years have seen the widespread adoption of Artificial Intelligence techniques in several domains, including healthcare, justice, assisted driving and Natural Language Processing (NLP) based applications (e.g., fake news detection). These are just a few examples of domains that are particularly critical and sensitive to the reliability of the machine learning systems they adopt. In healthcare, for instance, several Artificial Intelligence approaches have been adopted to support easy and reliable solutions aimed at improving early diagnosis, personalized treatment, remote patient monitoring and decision-making, with a consequent reduction of healthcare costs. Recent studies have shown that these techniques are vulnerable to attacks by adversaries at several phases of the artificial intelligence pipeline. Poisoned data sets are the most common attack on the reliability of Artificial Intelligence approaches. Noise, for example, can have a significant impact on the overall performance of a machine learning model. This study examines how strongly noise affects classification algorithms. In detail, it evaluates how reliably several machine learning techniques distinguish pathological from healthy voices when analysing poisoned data. Voice samples selected from a publicly available database widely used in research, the Saarbruecken Voice Database, were processed and analysed to evaluate the resilience and classification accuracy of these techniques. All analyses are evaluated in terms of accuracy, specificity, sensitivity, F1-score and ROC area.
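
The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `flip_labels` and `binary_metrics` are hypothetical, random label flipping stands in for one simple form of data poisoning (the abstract does not specify the study's actual noise model), and the labels are toy values, not data from the Saarbruecken Voice Database. ROC area is left out because it requires classifier scores rather than hard predictions.

```python
import random


def flip_labels(labels, rate, seed=0):
    """Return a copy of binary labels with a fraction `rate` flipped.

    Hypothetical poisoning scheme (random label noise), assumed here
    only for illustration.
    """
    rng = random.Random(seed)
    n_flip = round(rate * len(labels))
    poisoned = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        poisoned[i] = 1 - poisoned[i]
    return poisoned


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and F1 (1 = pathological)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty classes instead of dividing by zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy example: 1 = pathological voice, 0 = healthy voice.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
print(binary_metrics(y_true, y_pred))

# Poison 25% of the labels and count how many changed.
poisoned = flip_labels(y_true, rate=0.25)
print(sum(a != b for a, b in zip(y_true, poisoned)), "labels flipped")
```

In a resilience experiment of the kind described, one would train a classifier on the poisoned labels at increasing noise rates and track how these metrics degrade on clean test data.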
