Multi-party Poisoning through Generalized p-Tampering

In a poisoning attack against a learning algorithm, an adversary tampers with a fraction of the training data $T$ with the goal of increasing the classification error of the constructed hypothesis/model over the final test distribution. In the distributed setting, $T$ might be gathered gradually from $m$ data providers $P_1,\dots,P_m$ who generate and submit their shares of $T$ in an online way.

In this work, we initiate a formal study of $(k,p)$-poisoning attacks, in which an adversary controls $k\in[m]$ of the parties, and even for each corrupted party $P_i$, the adversary is restricted to submitting poisoned data $T'_i$ on behalf of $P_i$ that is still "$(1-p)$-close" to the correct data $T_i$ (e.g., a $1-p$ fraction of $T'_i$ is still honestly generated). For $k=m$, this model becomes the traditional notion of poisoning, and for $p=1$ it coincides with the standard notion of corruption in multi-party computation. We prove that if the generated hypothesis $h$ has some initial constant error, there always exists a $(k,p)$-poisoning attacker who can either decrease the confidence of $h$ (i.e., the probability that $h$ achieves small error) or increase the error of $h$ by $\Omega(p \cdot k/m)$. Our attacks can be implemented in polynomial time given samples from the correct data distribution, and they use no wrong labels if the original distributions are not noisy.

At a technical level, we prove a general lemma about biasing bounded functions $f(x_1,\dots,x_n)\in[0,1]$ through an attack model in which each block $x_i$ might be controlled by an adversary with marginal probability $p$ in an online way. When these probabilities are independent, the model coincides with that of $p$-tampering attacks, so we call our model generalized $p$-tampering. We prove the power of such attacks by incorporating ideas from the context of coin-flipping attacks into the $p$-tampering model, generalizing the results in both of these areas.
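To make the tampering model concrete, below is a minimal toy sketch (ours, not the attack constructed in the paper) of how an online adversary who controls each block independently with probability $p$, i.e., the standard $p$-tampering special case of the model above, can bias a bounded function upward. The majority-vote function `f`, the Monte-Carlo estimator `estimate_conditional`, and all parameter values are illustrative assumptions of this sketch.

```python
# Toy illustration (our own, not the paper's construction): an online p-tampering
# adversary biases a bounded function f(x_1, ..., x_n) in [0, 1] upward by greedily
# choosing each tampered block to maximize an estimated conditional expectation of f.
# Assumptions: blocks are i.i.d. uniform bits and each block is tampered
# independently with probability p (the standard p-tampering special case).

import random

def f(bits):
    # Example bounded function: 1.0 if the majority of the bits is 1, else 0.0.
    return 1.0 if 2 * sum(bits) > len(bits) else 0.0

def estimate_conditional(prefix, n, samples=200):
    # Monte-Carlo estimate of E[f | prefix], completing the remaining blocks honestly.
    total = 0.0
    for _ in range(samples):
        completion = [random.randint(0, 1) for _ in range(n - len(prefix))]
        total += f(prefix + completion)
    return total / samples

def run_once(n, p, tamper):
    # Generate x_1, ..., x_n online; each block is tampered with marginal probability p.
    prefix = []
    for _ in range(n):
        if tamper and random.random() < p:
            # Greedy tampering: pick the candidate bit whose estimated conditional
            # expectation of f (given the prefix generated so far) is larger.
            bit = max((0, 1), key=lambda b: estimate_conditional(prefix + [b], n))
        else:
            bit = random.randint(0, 1)  # honest block
        prefix.append(bit)
    return f(prefix)

def mean_f(n, p, tamper, trials=200):
    return sum(run_once(n, p, tamper) for _ in range(trials)) / trials

if __name__ == "__main__":
    n, p = 15, 0.3
    print("E[f] without tampering ~", mean_f(n, p, tamper=False))
    print("E[f] with  p-tampering ~", mean_f(n, p, tamper=True))
```

Running the script shows the tampered mean of $f$ exceeding the honest mean of roughly $0.5$, with the gap growing in $p$; this is the qualitative behavior that the generalized $p$-tampering lemma quantifies. In the generalized model, the per-block tampering events need not be independent; only each block's marginal probability of being tampered is required to be $p$.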
