Adversarial Invariant Feature Learning

Learning meaningful representations that retain the content necessary for a particular task while filtering out detrimental variations is a problem of great interest in machine learning. In this paper, we tackle the problem of learning representations that are invariant to a specific factor or trait of the data, which leads to better generalization. We formulate the representation learning process as an adversarial minimax game and analyze the optimal equilibrium of this game. On three benchmark tasks, namely fair (bias-free) classification, language-independent generation, and lighting-independent image classification, we show that the proposed framework induces an invariant representation and leads to better generalization, as evidenced by improved test performance.