Adversarial Attacks and Defenses
Ruocheng Guo | Ninghao Liu | Huan Liu | Mengnan Du | Xia Hu