State Space Model for New-Generation Network Alternative to Transformers: A Survey
Yaowei Wang | Bowei Jiang | Yao Rong | Jin Tang | Wentao Wu | Chenglong Li | Xiao Wang | Shiao Wang | Yonghong Tian | Ju Huang | Yuhe Ding | Yuehang Li | Weizhe Kong | Shihao Li | Haoxiang Yang | Ziwen Wang
[1] Fusheng Liu,et al. From Generalization Analysis to Optimization Designs for State Space Models , 2024, 2405.02670.
[2] Étienne David,et al. Variational quantization for state space models , 2024, ArXiv.
[3] S. Chaudhuri,et al. Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos , 2024, ArXiv.
[4] Chenhao Ying,et al. DGMamba: Domain Generalization via Generalized State Space Model , 2024, ArXiv.
[5] Xiangyu Zhu,et al. FusionMamba: Efficient Image Fusion with State Space Model , 2024, ArXiv.
[6] Anwai Archit,et al. ViM-UNet: Vision Mamba for Biomedical Segmentation , 2024, ArXiv.
[7] Yixuan Li,et al. 3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion , 2024, ArXiv.
[8] Bochao Zou,et al. RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos , 2024, ArXiv.
[9] Zhenye Gan,et al. MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection , 2024, ArXiv.
[10] Weidong Yang,et al. 3DMambaIPF: A State Space Model for Iterative Point Cloud Filtering via Differentiable Rendering , 2024, ArXiv.
[11] Zhengcong Fei,et al. Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models , 2024, ArXiv.
[12] Simon Stepputtis,et al. Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation , 2024, ArXiv.
[13] Hongruixuan Chen,et al. ChangeMamba: Remote Sensing Change Detection With Spatiotemporal State Space Model , 2024, IEEE Transactions on Geoscience and Remote Sensing.
[14] Arnab Sen Sharma,et al. Locating and Editing Factual Associations in Mamba , 2024, ArXiv.
[15] Man-On Pun,et al. RS3Mamba: Visual State Space Model for Remote Sensing Image Semantic Segmentation , 2024, IEEE Geoscience and Remote Sensing Letters.
[16] P. Xiao,et al. RS-Mamba for Large Remote Sensing Image Dense Prediction , 2024, IEEE Transactions on Geoscience and Remote Sensing.
[17] Kai Li,et al. SPMamba: State-space model is all you need in speech separation , 2024, ArXiv.
[18] Yuanzhi Cai,et al. Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model , 2024, ArXiv.
[19] E. J. Olucha,et al. On the reduction of Linear Parameter-Varying State-Space models , 2024, ArXiv.
[20] Jing Hao,et al. T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation , 2024, ArXiv.
[21] Xiaopeng Fan,et al. SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding , 2024, ArXiv.
[22] Judy X Yang,et al. HSIMamba: Hyperpsectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification , 2024, ArXiv.
[23] Toshihiro Ota. Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces , 2024, ArXiv.
[24] Tao Zhu,et al. HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM , 2024, ArXiv.
[25] Ali Behrouz,et al. MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection , 2024, ArXiv.
[26] Pengchen Liang,et al. UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation , 2024, ArXiv.
[27] Y. Shoham,et al. Jamba: A Hybrid Transformer-Mamba Language Model , 2024, ArXiv.
[28] Zhichao Xu. RankMamba: Benchmarking Mamba's Document Ranking Performance in the Era of Transformers , 2024, 2403.18276.
[29] Xinchao Wang,et al. Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction , 2024, ArXiv.
[30] N. Mesgarani,et al. Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation , 2024, ArXiv.
[31] Chenhongyi Yang,et al. PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition , 2024, ArXiv.
[32] Hao Tang,et al. Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation , 2024, ArXiv.
[33] M. Soltanolkotabi,et al. Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space Models , 2024, ArXiv.
[34] Jiangchao Yao,et al. ReMamber: Referring Image Segmentation with Mamba Twister , 2024, ArXiv.
[35] Pragaash Ponnusamy,et al. Mechanistic Design and Scaling of Hybrid Architectures , 2024, ArXiv.
[36] Md. Tanzim Hossain,et al. Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion , 2024, ArXiv.
[37] François Pomerleau,et al. Proprioception Is All You Need: Terrain Classification for Boreal Forests , 2024, ArXiv.
[38] Zhenheng Tang,et al. VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting , 2024, ArXiv.
[39] Guangqian Yang,et al. CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification , 2024, ArXiv.
[40] Zhumin Chen,et al. Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation , 2024, ArXiv.
[41] M. Zeilinger,et al. State Space Models as Foundation Models: A Control Theoretic Overview , 2024, ArXiv.
[42] Hanzhi Yin,et al. Modeling Analog Dynamic Range Compressors using Deep Learning and State-space Models , 2024, ArXiv.
[43] André Rosa de Sousa Porfírio Correia,et al. Music to Dance as Language Translation using Sequence Models , 2024, ArXiv.
[44] B. N. Patro,et al. SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series , 2024, ArXiv.
[45] Siteng Huang,et al. Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference , 2024, ArXiv.
[46] Manas Mejari,et al. Model order reduction of deep structured state-space models: A system-theoretic approach , 2024, ArXiv.
[47] Björn Ommer,et al. ZigMa: A DiT-style Zigzag Mamba Diffusion Model , 2024, ArXiv.
[48] Guibo Luo,et al. ProMamba: Prompt-Mamba for polyp segmentation , 2024, ArXiv.
[49] Zijia Zhao,et al. VL-Mamba: Exploring State Space Models for Multimodal Learning , 2024, ArXiv.
[50] Pengchen Liang,et al. H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation , 2024, ArXiv.
[51] A. Coster,et al. STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model , 2024, ArXiv.
[52] Xuefeng Xiao,et al. VmambaIR: Visual State Space Model for Image Restoration , 2024, ArXiv.
[53] Daling Wang,et al. Is Mamba Effective for Time Series Forecasting? , 2024, ArXiv.
[54] Yanxi Li,et al. Understanding Robustness of Visual State Space Models for Image Classification , 2024, ArXiv.
[55] Xiaohuan Pei,et al. EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba , 2024, ArXiv.
[56] Zhidi Lin,et al. Regularization-Based Efficient Continual Learning in Deep State-Space Models , 2024, ArXiv.
[57] Shan You,et al. LocalMamba: Visual State Space Model with Windowed Selective Scan , 2024, ArXiv.
[58] MingYa Zhang,et al. VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation , 2024, ArXiv.
[59] Md. Atik Ahamed,et al. TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting , 2024, ArXiv.
[60] Zhiqi Li,et al. Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding , 2024, ArXiv.
[61] Zunnan Xu,et al. MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models , 2024, ArXiv.
[62] Hang Wang,et al. Activating Wider Areas in Image Super-Resolution , 2024, ArXiv.
[63] Changsheng Quan,et al. Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers , 2024, IEEE Signal Processing Letters.
[64] Jintai Chen,et al. Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention , 2024, ArXiv.
[65] Yali Wang,et al. VideoMamba: State Space Model for Efficient Video Understanding , 2024, ArXiv.
[66] Vaishnavh Nagarajan,et al. The pitfalls of next-token prediction , 2024, ICML.
[67] Yu Zheng,et al. Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy , 2024, ArXiv.
[68] Shu Yang,et al. MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology , 2024, ArXiv.
[69] A. Bihorac,et al. A multi-cohort study on prediction of acute brain dysfunction states using selective state space models , 2024, ArXiv.
[70] Avijit Mitra,et al. ClinicalMamba: A Generative Clinical Language Model on Longitudinal Clinical Notes , 2024, CLINICALNLP.
[71] Bowei Jiang,et al. Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline , 2024, ArXiv.
[72] Shing Shin Cheng,et al. Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy , 2024, ArXiv.
[73] Yinghao Zhu,et al. LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation , 2024, ArXiv.
[74] Zijie Fang,et al. MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models , 2024, ArXiv.
[75] Mohammad Reza Samsami,et al. Mastering Memory Tasks with World Models , 2024, ArXiv.
[76] James Caverlee,et al. Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models , 2024, ArXiv.
[77] Yubiao Yue,et al. MedMamba: Vision Mamba for Medical Image Classification , 2024, ArXiv.
[78] Yair Schiff,et al. Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling , 2024, ICML.
[79] Jifeng Dai,et al. Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures , 2024, ArXiv.
[80] Zhentao Tan,et al. MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection , 2024, ArXiv.
[81] Ameen Ali,et al. The Hidden Attention of Mamba Models , 2024, ArXiv.
[82] Haobo Yuan,et al. Point Cloud Mamba: Point Cloud Learning via State Space Model , 2024, ArXiv.
[83] Zhuangwei Shi. MambaStock: Selective state space model for stock prediction , 2024, ArXiv.
[84] Antonio Orvieto,et al. Theoretical Foundations of Deep Selective State-Space Models , 2024, ArXiv.
[85] Angelica I. Avilés-Rivero,et al. MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation , 2024, ArXiv.
[86] Yehui Tang,et al. DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models , 2024, ArXiv.
[87] Chi-Sheng Chen,et al. Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning , 2024, ArXiv.
[88] Zhihao Ouyang,et al. MambaIR: A Simple Baseline for Image Restoration with State-Space Model , 2024, ArXiv.
[89] Mathias Gehrig,et al. State Space Models for Event Cameras , 2024, ArXiv.
[90] K. Yan,et al. Pan-Mamba: Effective pan-sharpening with State Space Model , 2024, ArXiv.
[91] Ziqi Zhu,et al. TLS-RWKV: Real-Time Online Action Detection with Temporal Label Smoothing , 2024, Neural Process. Lett..
[92] Ziyang Wang,et al. Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation , 2024, ArXiv.
[93] Dingkang Liang,et al. PointMamba: A Simple State Space Model for Point Cloud Analysis , 2024, ArXiv.
[94] Raunaq M. Bhirangi,et al. Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling , 2024, ArXiv.
[95] Ali Behrouz,et al. Graph Mamba: Towards Learning on Graphs with State Space Models , 2024, ArXiv.
[96] Guanxi Li,et al. P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation , 2024, ArXiv.
[97] Zhuoran Zheng,et al. FD-Vision Mamba for Endoscopic Exposure Correction , 2024, ArXiv.
[98] Shufan Li,et al. Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data , 2024, ArXiv.
[99] Zhengcong Fei,et al. Scalable Diffusion Models with State Space Backbone , 2024, ArXiv.
[100] Ziyang Wang,et al. Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation , 2024, ArXiv.
[101] Dimitris Papailiopoulos,et al. Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks , 2024, ArXiv.
[102] Hao Yang,et al. Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining , 2024, ArXiv.
[103] Julien N. Siems,et al. Is Mamba Capable of In-Context Learning? , 2024, ArXiv.
[104] Haifan Gong,et al. nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model , 2024, ArXiv.
[105] Jiacheng Ruan,et al. VM-UNet: Vision Mamba UNet for Medical Image Segmentation , 2024, ArXiv.
[106] Junlong Du,et al. Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey , 2024, ArXiv.
[107] Chloe X. Wang,et al. Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces , 2024, ArXiv.
[108] Mathieu Ravaut,et al. LOCOST: State-Space Models for Long Document Abstractive Summarization , 2024, EACL.
[109] Yijun Yang,et al. Vivim: a Video Vision Mamba for Medical Video Object Segmentation , 2024, ArXiv.
[110] Yijun Yang,et al. SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation , 2024, ArXiv.
[111] Yunjie Tian,et al. VMamba: Visual State Space Model , 2024, ArXiv.
[112] Bencheng Liao,et al. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model , 2024, ArXiv.
[113] Haowen Hou,et al. RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks , 2024, ArXiv.
[114] Jun Ma,et al. U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation , 2024, ArXiv.
[115] Devendra Singh Chaplot,et al. Mixtral of Experts , 2024, ArXiv.
[116] Sebastian Jaszczur,et al. MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts , 2024, ArXiv.
[117] Xiao Wang,et al. Pedestrian Attribute Recognition via CLIP based Prompt Vision-Language Fusion , 2023, ArXiv.
[118] Xiao Wang,et al. Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception , 2023, AAAI.
[119] Elad Hazan,et al. Spectral State Space Models , 2023, ArXiv.
[120] Carl R. Andersson,et al. Structured state-space models are deep Wiener models , 2023, ArXiv.
[121] R. Panda,et al. Gated Linear Attention Transformers with Hardware-Efficient Training , 2023, ArXiv.
[122] Chenglong Li,et al. SequencePAR: Understanding Pedestrian Attributes via A Sequence Generation Paradigm , 2023, ArXiv.
[123] Antonio Orvieto,et al. Recurrent Distance Filtering for Graph Representation Learning , 2023, 2312.01538.
[124] Albert Gu,et al. Mamba: Linear-Time Sequence Modeling with Selective State Spaces , 2023, ArXiv.
[125] Jing Nathan Yan,et al. Diffusion Models Without Attention , 2023, ArXiv.
[126] Shida Wang,et al. StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization , 2023, ArXiv.
[127] Hermann Kumbong,et al. FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores , 2023, ArXiv.
[128] Tobias Katsch. GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling , 2023, ArXiv.
[129] Scott W. Linderman,et al. Convolutional State Space Models for Long-Range Spatiotemporal Modeling , 2023, NeurIPS.
[130] Y. Bengio,et al. Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions , 2023, NeurIPS.
[131] R. Herbrich,et al. Hieros: Hierarchical Imagination on Structured State Space Sequence World Models , 2023, ArXiv.
[132] Jonathan Berant,et al. Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors , 2023, ArXiv.
[133] N. Benjamin Erichson,et al. Robustifying State-space Models for Long Sequences via Approximate Diagonalization , 2023, ArXiv.
[134] Lin Zhu,et al. Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline , 2023, ArXiv.
[135] Beichen Xue,et al. State-space Models with Layer-wise Nonlinearity are Universal Approximators with Exponential Decaying Memory , 2023, NeurIPS.
[136] Yu Du,et al. Spiking Structured State Space Model for Monaural Speech Enhancement , 2023, ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[137] J. Oswald,et al. Gated recurrent neural networks discover attention , 2023, ArXiv.
[138] Gao Huang,et al. FLatten Transformer: Vision Transformer using Focused Linear Attention , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[139] Li Dong,et al. Retentive Network: A Successor to Transformer for Large Language Models , 2023, ArXiv.
[140] Quentin G. Anthony,et al. RWKV: Reinventing RNNs for the Transformer Era , 2023, EMNLP.
[141] Pichao Wang,et al. Selective Structured State-Spaces for Long-Form Video Understanding , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[142] Xiaojun Chang,et al. Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[143] Henrique Pondé de Oliveira Pinto,et al. GPT-4 Technical Report , 2023, 2303.08774.
[144] Caglar Gulcehre,et al. Resurrecting Recurrent Neural Networks for Long Sequences , 2023, ICML.
[145] Chris Xiaoxuan Lu,et al. Structured State Space Models for In-Context Reinforcement Learning , 2023, NeurIPS.
[146] Yonghong Tian,et al. Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey , 2023, Machine Intelligence Research.
[147] Jimmy Ba,et al. Mastering Diverse Domains through World Models , 2023, ArXiv.
[148] Khaled Kamal Saab,et al. Hungry Hungry Hippos: Towards Language Modeling with State Space Models , 2022, ICLR.
[149] Alexander M. Rush,et al. Pretraining Without Attention , 2022, EMNLP.
[150] Denis Xavier Charles,et al. Efficient Long Sequence Modeling via State Space Augmented Transformer , 2022, ArXiv.
[151] Yonghong Tian,et al. Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric , 2022, ArXiv.
[152] Qian Wang,et al. A Simple Visual-Textual Baseline for Pedestrian Attribute Recognition , 2022, IEEE Transactions on Circuits and Systems for Video Technology.
[153] Luke Zettlemoyer,et al. Mega: Moving Average Equipped Gated Attention , 2022, ICLR.
[154] Shinji Watanabe,et al. TF-GRIDNET: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation , 2022, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[155] Scott W. Linderman,et al. Simplified State Space Layers for Sequence Modeling , 2022, ICLR.
[156] Junsong Yuan,et al. AiATrack: Attention in Attention for Transformer Visual Tracking , 2022, ECCV.
[157] Kaiqi Huang,et al. Learning Disentangled Attribute Representations for Robust Pedestrian Attribute Recognition , 2022, AAAI.
[158] Behnam Neyshabur,et al. Long Range Language Modeling via Gated State Spaces , 2022, ICLR.
[159] Christopher Ré,et al. How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections , 2022, ICLR.
[160] Shen Ge,et al. Competence-based Multimodal Curriculum Learning for Medical Report Generation , 2022, ACL.
[161] Christopher Ré,et al. On the Parameterization and Initialization of Diagonal State Space Models , 2022, NeurIPS.
[162] Mingsheng Shang,et al. MCFL: multi-label contrastive focal loss for deep imbalanced pedestrian attribute recognition , 2022, Neural Computing and Applications.
[163] Junyi Wu,et al. Inter-Attribute awareness for pedestrian attribute recognition , 2022, Pattern Recognit..
[164] Zengming Tang,et al. DRFormer: Learning dual relations using Transformer for pedestrian attribute recognition , 2022, Neurocomputing.
[165] Md. Mohaiminul Islam,et al. Long Movie Clip Classification with State-Space Video Models , 2022, ECCV.
[166] Jonathan Berant,et al. Diagonal State Spaces are as Effective as Structured State Spaces , 2022, NeurIPS.
[167] S. Shan,et al. Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework , 2022, ECCV.
[168] Limin Wang,et al. MixFormer: End-to-End Tracking with Iterative Mixed Attention , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[169] L. Gool,et al. Transforming Model Prediction for Tracking , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[170] Wanli Ouyang,et al. Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking , 2022, ECCV.
[171] Hao Guo,et al. Visual Attention Consistency for Human Attribute Recognition , 2022, International Journal of Computer Vision.
[172] Albert Gu,et al. It's Raw! Audio Generation with State-Space Models , 2022, ICML.
[173] Xian Wu,et al. Knowledge matters: Chest radiology report generation with general and specific knowledge , 2021, Medical Image Anal..
[174] Ross B. Girshick,et al. Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[175] Eneko Agirre,et al. Recent Advances in Natural Language Processing via Large Pre-trained Language Models: A Survey , 2021, ACM Comput. Surv..
[176] Albert Gu,et al. Efficiently Modeling Long Sequences with Structured State Spaces , 2021, ICLR.
[177] Atri Rudra,et al. Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers , 2021, NeurIPS.
[178] Kaiqi Huang,et al. Spatial and Semantic Consistency Regularizations for Pedestrian Attribute Recognition , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[179] Jun Wan,et al. Cascaded Split-and-Aggregate Learning with Feature Recombination for Pedestrian Attribute Recognition , 2021, International Journal of Computer Vision.
[180] Jure Leskovec,et al. Combiner: Full Attention Transformer with Sparse Computation Cost , 2021, NeurIPS.
[181] Hao Tian,et al. ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation , 2021, ArXiv.
[182] Xu Sun,et al. Contrastive Attention for Automatic Chest X-ray Report Generation , 2021, FINDINGS.
[183] Pieter Abbeel,et al. Decision Transformer: Reinforcement Learning via Sequence Modeling , 2021, NeurIPS.
[184] Yuexian Zou,et al. Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[185] Shiliang Zhang,et al. Large-Scale Spatio-Temporal Person Re-Identification: Algorithms and Benchmark , 2021, IEEE Transactions on Circuits and Systems for Video Technology.
[186] Nitish Srivastava,et al. An Attention Free Transformer , 2021, ArXiv.
[187] Qi Tian,et al. Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation , 2021, ECCV Workshops.
[188] Wengang Zhou,et al. Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[189] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[190] Pichao Wang,et al. TransReID: Transformer-based Object Re-Identification , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[191] Matthieu Cord,et al. Training data-efficient image transformers & distillation through attention , 2020, ICML.
[192] Tsung-Hui Chang,et al. Generating Radiology Reports via Memory-driven Transformer , 2020, EMNLP.
[193] Mirco Ravanelli,et al. Attention Is All You Need In Speech Separation , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[194] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[195] Yilong Yin,et al. CFVMNet: A Multi-branch Network for Vehicle Re-identification Based on Common Field of View , 2020, ACM Multimedia.
[196] Shao-Yi Chien,et al. Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network , 2020, ECCV.
[197] C. Ré,et al. HiPPO: Recurrent Memory with Optimal Polynomial Projections , 2020, NeurIPS.
[198] Ming Tang,et al. Identity-Guided Human Semantic Parsing for Person Re-Identification , 2020, ECCV.
[199] Nikolaos Pappas,et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention , 2020, ICML.
[200] Rongrong Ji,et al. Salience-Guided Cascaded Suppression Network for Person Re-Identification , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[201] R. Chellappa,et al. The Devil is in the Details: Self-Supervised Attention for Vehicle Re-Identification , 2020, ECCV.
[202] Qingming Huang,et al. Parsing-Based View-Aware Embedding Network for Vehicle Re-Identification , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[203] Yang Yang,et al. Relation-Aware Pedestrian Attribute Recognition with Graph Convolutional Networks , 2020, AAAI.
[204] Luc Van Gool,et al. Probabilistic Regression for Visual Tracking , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[205] L. Gool,et al. Know Your Surroundings: Exploiting Scene Information for Object Tracking , 2020, ECCV.
[206] Hao Liu,et al. Person Attribute Recognition by Sequence Contextual Relation Learning , 2020, IEEE Transactions on Circuits and Systems for Video Technology.
[207] Gang Yu,et al. High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[208] M. Zaghloul,et al. IEEE Transactions , 2020, Computer.
[209] Daguang Xu,et al. When Radiology Report Generation Meets Knowledge Graph , 2020, AAAI.
[210] Calton Pu,et al. Looking GLAMORous: Vehicle Re-Id in Heterogeneous Cameras Networks with Global and Local Attention , 2020, ArXiv.
[211] H. Ai,et al. Rethinking the Distribution Gap of Person Re-identification with Camera-Based Batch Normalization , 2020, ECCV.
[212] Wenjun Zeng,et al. Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based Object Re-Identification , 2020, AAAI.
[213] Omer Levy,et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension , 2019, ACL.
[214] Wei Jiang,et al. Stripe-based and attribute-aware network: a two-branch deep model for vehicle re-identification , 2019, ArXiv.
[215] Yu Wu,et al. Pose-Guided Feature Alignment for Occluded Person Re-Identification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[216] Yichen Wei,et al. Vehicle Re-Identification With Viewpoint-Aware Metric Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[217] Chunhua Shen,et al. Part-Guided Attention Learning for Vehicle Re-Identification , 2019, ArXiv.
[218] Yang Yang,et al. ABD-Net: Attentive but Diverse Person Re-Identification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[219] Bing He,et al. Part-Regularized Near-Duplicate Vehicle Re-Identification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[220] Xin Jin,et al. Semantics-Aligned Representation Learning for Person Re-identification , 2019, AAAI.
[221] Andrea Cavallaro,et al. Omni-Scale Feature Learning for Person Re-Identification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[222] L. Gool,et al. Learning Discriminative Model Prediction for Tracking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[223] Cuiling Lan,et al. Relation-Aware Global Attention for Person Re-Identification , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[224] Eric P. Xing,et al. Knowledge-driven Encode, Retrieve, Paraphrase for Medical Image Report Generation , 2019, AAAI.
[225] Wei Jiang,et al. Bag of Tricks and a Strong Baseline for Deep Person Re-Identification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[226] B. Luo,et al. Pedestrian Attribute Recognition: A Survey , 2019, Pattern Recognit..
[227] Michael Felsberg,et al. ATOM: Accurate Tracking by Overlap Maximization , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[228] Eric P. Xing,et al. Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation , 2018, NeurIPS.
[229] Xiong Chen,et al. Learning Discriminative Features with Multiple Granularities for Person Re-Identification , 2018, ACM Multimedia.
[230] Xuan Zhang,et al. Multi-Target, Multi-Camera Tracking by Hierarchical Clustering: Recent Progress on DukeMTMC Project , 2017, CVPR 2017.
[231] Longhui Wei,et al. Person Transfer GAN to Bridge Domain Gap for Person Re-identification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[232] Pietro Liò,et al. Graph Attention Networks , 2017, ICLR.
[233] Xiaogang Wang,et al. HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[234] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[235] Wu Liu,et al. Large-scale vehicle re-identification in urban surveillance videos , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[236] Tiejun Huang,et al. Deep Relative Distance Learning: Tell the Difference between Similar Vehicles , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[237] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[238] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[239] Qi Tian,et al. Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[240] Clement J. McDonald,et al. Preparing a collection of radiology examinations for distribution and retrieval , 2015, J. Am. Medical Informatics Assoc..
[241] Xiaoou Tang,et al. Pedestrian Attribute Recognition At Far Distance , 2014, ACM Multimedia.
[242] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[243] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[244] Qiang Chen,et al. Network In Network , 2013, ICLR.
[245] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[246] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[247] S. Hochreiter,et al. Long Short-Term Memory , 1997, Neural Computation.
[248] Ziyang Wang,et al. Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation , 2024, ArXiv.
[249] Yinuo Wang,et al. MambaMorph: a Mamba-based Backbone with Contrastive Feature Learning for Deformable MR-CT Registration , 2024, ArXiv.
[250] S. Baccus,et al. S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces , 2022, NeurIPS.
[251] Hai-Miao Hu,et al. Correlation Graph Convolutional Network for Pedestrian Attribute Recognition , 2022, IEEE Transactions on Multimedia.
[252] Stephen Lin,et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[253] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[254] Philip S. Yu,et al. A Comprehensive Survey on Graph Neural Networks , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[255] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems , 1960, Journal of Basic Engineering.