Cost-effective Interactive Attention Learning with Neural Attention Processes

We propose a novel interactive learning framework, Interactive Attention Learning (IAL), in which human supervisors interactively manipulate the allocated attentions to correct the model's behavior by updating the attention-generating network. However, such a model is prone to overfitting because human annotations are scarce, and it requires costly retraining. Moreover, it is almost infeasible for human annotators to examine attentions over a large number of instances and features. We tackle these challenges by proposing a sample-efficient attention mechanism and a cost-effective reranking algorithm for instances and features. First, we propose the Neural Attention Process (NAP), an attention generator that can update its behavior by incorporating new attention-level supervision without any retraining. Second, we propose an algorithm that prioritizes instances and features by their negative impacts, so that the model can yield large improvements with minimal human feedback. We validate IAL on various time-series datasets from multiple domains (healthcare, real estate, and computer vision), on which it significantly outperforms baselines with conventional attention mechanisms or without cost-effective reranking, with substantially lower retraining and human-model interaction cost.
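The prioritization idea can be illustrated with a minimal sketch. The abstract does not say how negative impacts are estimated, so the scores below are placeholder numbers, and the function names `top_k_indices` and `select_for_annotation` are hypothetical, introduced only for illustration. The sketch shows only the selection step: pick the highest-scoring instances, then the highest-scoring features within each, so that the annotator sees a short queue rather than every attention value.

```python
from typing import List, Sequence, Tuple


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest scores, highest first.

    Ties are broken in favor of the lower index.
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def select_for_annotation(
    instance_scores: Sequence[float],
    feature_scores: Sequence[Sequence[float]],
    n_instances: int,
    n_features: int,
) -> List[Tuple[int, int]]:
    """Build an annotation queue of (instance, feature) index pairs.

    Assumption: a higher score means a larger estimated negative impact.
    How the scores are computed is outside the scope of this sketch.
    """
    queue = []
    for i in top_k_indices(instance_scores, n_instances):
        for j in top_k_indices(feature_scores[i], n_features):
            queue.append((i, j))
    return queue


# Placeholder scores for three instances with two features each.
instance_scores = [0.1, 0.9, 0.4]
feature_scores = [[0.2, 0.1], [0.3, 0.8], [0.5, 0.0]]

queue = select_for_annotation(instance_scores, feature_scores, 2, 1)
print(queue)
```

With a budget of two instances and one feature per instance, the annotator would be shown only two attention values, which is the kind of cost reduction the reranking step aims for.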
