Self-Reinforcement Attention Mechanism For Tabular Learning

Beyond raw predictive accuracy, what interests many practitioners in real-world problems such as fraud detection and credit scoring is uncovering hidden patterns in data, especially when the data are highly imbalanced. Interpretability is another key requirement for the machine learning models used in these settings. For this reason, intrinsically interpretable models are often preferred to complex black-box models, and linear models are still used in some high-risk domains to handle tabular data even when performance must be sacrificed. In this paper, we introduce Self-Reinforcement Attention (SRA), a novel attention mechanism that produces a feature-relevance weight vector and uses it to learn an intelligible representation: each component of the raw input is reinforced or attenuated through element-wise multiplication with the corresponding weight. Our results on synthetic and real-world imbalanced data show that the proposed SRA block is effective when combined end-to-end with baseline models.
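To make the mechanism concrete, the sketch below shows one plausible reading of an SRA-style block in PyTorch: a small network maps the raw feature vector to a weight vector of the same dimension, and that vector rescales the input element-wise before it reaches a downstream baseline model. The internal architecture (a single hidden layer with a sigmoid output so the weights lie in [0, 1]) and the names `SRABlock` and `hidden_dim` are assumptions for illustration, not details given in the abstract.

```python
import torch
import torch.nn as nn

class SRABlock(nn.Module):
    """Minimal sketch of a Self-Reinforcement Attention block (assumed design).

    A small feed-forward network produces a per-feature relevance weight
    vector; the raw input is then reinforced or attenuated by element-wise
    multiplication with these weights.
    """

    def __init__(self, num_features: int, hidden_dim: int = 32):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(num_features, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_features),
            nn.Sigmoid(),  # assumed gating: relevance weights in [0, 1]
        )

    def forward(self, x: torch.Tensor):
        weights = self.attention(x)   # feature-relevance weight vector
        reinforced = x * weights      # element-wise reinforcement / reduction
        return reinforced, weights    # weights can also be inspected as explanations


# Usage sketch: plug the block in front of any baseline model and train end-to-end.
if __name__ == "__main__":
    x = torch.randn(8, 10)            # batch of 8 rows with 10 tabular features
    sra = SRABlock(num_features=10)
    baseline = nn.Linear(10, 1)       # stand-in for a baseline classifier head
    reinforced, weights = sra(x)
    logits = baseline(reinforced)
    print(logits.shape, weights.shape)  # torch.Size([8, 1]) torch.Size([8, 10])
```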
