Visual vs internal attention mechanisms in deep neural networks for image classification and object detection

[1] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.

[2] C. Constantinidis, et al. Bottom-Up and Top-Down Attention, 2014, The Neuroscientist: a review journal bringing neurobiology, neurology and psychiatry.

[3] Bingbing Ni, et al. Learning Multi-Attention Context Graph for Group-Based Re-Identification, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Dumitru Erhan, et al. Scalable Object Detection Using Deep Neural Networks, 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Ali Farhadi, et al. You Only Look Once: Unified, Real-Time Object Detection, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[7] Dietmar Heinke, et al. Excitatory versus inhibitory feedback in Bayesian formulations of scene construction, 2019, Journal of the Royal Society Interface.

[8] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[9] A. Treisman, et al. A feature-integration theory of attention, 1980, Cognitive Psychology.

[10] Pietro Perona, et al. Graph-Based Visual Saliency, 2006, NIPS.

[11] Roohie Naaz Mir, et al. SSFNET-VOS: Semantic segmentation and fusion network for video object segmentation, 2020, Pattern Recognit. Lett.

[12] Christof Koch, et al. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis, 2009.

[13] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.

[14] Junzhong Ji, et al. Divergent-convergent attention for image captioning, 2021, Pattern Recognit.

[15] Haibin Ling, et al. Revisiting Video Saliency Prediction in the Deep Learning Era, 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16] Dietmar Heinke, et al. Modelling Visual Search with the Selective Attention for Identification Model (VS-SAIM): A Novel Explanation for Visual Search Asymmetries, 2010, Cognitive Computation.

[17] Valérie Gouet-Brunet, et al. Saliency and Burstiness for Feature Selection in CBIR, 2019, 2019 8th European Workshop on Visual Information Processing (EUVIP).

[18] Jeremy M Wolfe, et al. Visual Attention, 2020, Computational Models for Cognitive Vision.

[19] S. B. Hutton, et al. Eye Tracking Methodology, 2019, Eye Movement Research.

[20] Shuicheng Yan, et al. A2-Nets: Double Attention Networks, 2018, NeurIPS.

[21] Abraham Montoya Obeso, et al. Image annotation for Mexican buildings database, 2016, Optical Engineering + Applications.

[22] Yan Song, et al. GRNet: Graph-based remodeling network for multi-view semi-supervised classification, 2021, Pattern Recognit. Lett.

[23] Qi Zhao, et al. SALICON: Saliency in Context, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25] Jenny Benois-Pineau, et al. Perceptual modeling in the problem of active object recognition in visual scenes, 2016, Pattern Recognit.

[26] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Matthias Bethge, et al. Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet, 2014, ICLR.

[28] C. Eriksen, et al. Temporal and spatial characteristics of selective encoding from visual displays, 1972.

[29] Abhinav Gupta, et al. Non-local Neural Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[30] Jia Li, et al. A Benchmark Dataset and Saliency-Guided Stacked Autoencoders for Video-Based Salient Object Detection, 2018, IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society.

[31] David S Wooding, et al. Eye movements of large populations: II. Deriving regions of interest, coverage, and similarity using fixation maps, 2002, Behavior Research Methods, Instruments, & Computers: a journal of the Psychonomic Society, Inc.

[32] Jean-Michel Morel, et al. A non-local algorithm for image denoising, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[33] Chengdong Wu, et al. Saliency detection via integrating deep learning architecture and low-level features, 2019, Neurocomputing.

[34] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[35] Chokri Ben Amar, et al. ChaboNet: Design of a deep CNN for prediction of visual saliency in natural video, 2019, J. Vis. Commun. Image Represent.

[36] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.

[37] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38] Jenny Benois-Pineau, et al. Comparative study of visual saliency maps in the problem of classification of architectural images with Deep CNNs, 2018, 2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA).

[39] S Ullman, et al. Shifts in selective visual attention: towards the underlying neural circuitry, 1985, Human Neurobiology.

[40] Ali Borji, et al. Salient object detection: A survey, 2014, Computational Visual Media.

[41] Yuejie Zhang, et al. Balanced single-shot object detection using cross-context attention-guided network, 2022, Pattern Recognit.

[42] Jenny Benois-Pineau, et al. Architectural style classification of Mexican historical buildings using deep convolutional neural networks and sparse features, 2016, J. Electronic Imaging.