Distribution Matching for Rationalization
Yongfeng Huang | Yujun Chen | Zhilin Yang | Yulun Du