论文信息 - Decision-Directed Data Decomposition

Decision-Directed Data Decomposition

We present an algorithm, Decision-Directed Data Decomposition (D4), which decomposes a dataset into two components. The first contains most of the useful information for a specified supervised learning task. The second orthogonal component contains little information about the task but retains associations and information that were not targeted. The algorithm is simple and scalable. We illustrate its application in image and text processing domains. Our results show that 1) post-hoc application of D4 to an image representation space can remove information about specified concepts without impacting other concepts, 2) D4 is able to improve predictive generalization in certain settings, and 3) applying D4 to word embedding representations produces state-of-the-art results in debiasing.

Daniel J. Lizotte | Brent D. Davis | Ethan Jackson

[1] John B. Goodenough,et al. Contextual correlates of synonymy , 1965, CACM.

[2] Yoav Goldberg,et al. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them , 2019, NAACL-HLT.

[3] R Core Team,et al. R: A language and environment for statistical computing. , 2014 .

[4] Jorge Cadima,et al. Principal component analysis: a review and recent developments , 2016, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[5] Joao Sedoc,et al. Conceptor Debiasing of Word Representations Evaluated on WEAT , 2019, Proceedings of the First Workshop on Gender Bias in Natural Language Processing.

[6] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[7] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[8] Christopher D. Manning,et al. Better Word Representations with Recursive Neural Networks for Morphology , 2013, CoNLL.

[9] Gemma Boleda,et al. Distributional Semantics in Technicolor , 2012, ACL.

[10] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[11] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.

[12] Zeyu Li,et al. Learning Gender-Neutral Word Embeddings , 2018, EMNLP.

[13] Luc Van Gool,et al. Deep Expectation of Real and Apparent Age from a Single Image Without Facial Landmarks , 2016, International Journal of Computer Vision.

[14] J. Platt. Sequential Minimal Optimization : A Fast Algorithm for Training Support Vector Machines , 1998 .

[15] Adam Tauman Kalai,et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings , 2016, NIPS.

[16] Sameer Singh,et al. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier , 2016, NAACL.