Paper Information - Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction

Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction

Hyperspectral image (HSI) reconstruction aims to recover the 3D spatial-spectral signal from a 2D measurement in the coded aperture snapshot spectral imaging (CASSI) system. HSI representations are highly similar and correlated across the spectral dimension, so modeling inter-spectra interactions is beneficial for HSI reconstruction. However, existing CNN-based methods show limitations in capturing spectral-wise similarity and long-range dependencies. In addition, the HSI information is modulated by a coded aperture (physical mask) in CASSI, yet current algorithms have not fully explored the guidance effect of the mask for HSI restoration. In this paper, we propose a novel framework, Mask-guided Spectral-wise Transformer (MST), for HSI reconstruction. Specifically, we present a Spectral-wise Multi-head Self-Attention (S-MSA) that treats each spectral feature as a token and calculates self-attention along the spectral dimension. In addition, we customize a Mask-guided Mechanism (MM) that directs S-MSA to pay attention to spatial regions with high-fidelity spectral representations. Extensive experiments show that our MST significantly outperforms state-of-the-art (SOTA) methods on simulation and real HSI datasets while requiring dramatically lower computational and memory costs. Code is available at https://github.com/caiyuanhao1998/MST/
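
To make the idea of "each spectral feature as a token" more concrete, the following is a minimal single-head sketch in plain Python, not the authors' implementation. It assumes the feature map has been flattened to an N x C matrix (N = H*W pixels, C = spectral channels), uses hypothetical projection matrices `w_q`, `w_k`, `w_v`, and substitutes a fixed 1/sqrt(N) scaling for whatever scaling the paper actually uses. The Mask-guided Mechanism is not modeled here. The point it illustrates is that the attention map has shape C x C rather than N x N.

```python
import math


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def spectral_attention(x, w_q, w_k, w_v):
    """Single-head self-attention along the spectral axis (illustrative).

    x: (N, C) matrix, N flattened pixels, C spectral channels.
    w_q, w_k, w_v: (C, C) projections; hypothetical parameters.
    Returns the (N, C) output and the (C, C) attention map.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    # Each column of q/k/v is one spectral token of length N, so the
    # score matrix Q^T K compares channels and has shape (C, C).
    scores = matmul(transpose(q), k)
    scale = 1.0 / math.sqrt(len(x))  # assumed fixed scaling factor
    attn = softmax_rows([[s * scale for s in row] for row in scores])
    # Output token i is the attention-weighted sum of value tokens j.
    out = matmul(v, transpose(attn))
    return out, attn


# Usage: 4 pixels, 3 spectral channels, identity projections.
x = [[1.0, 2.0, 3.0], [0.5, 0.1, 0.2], [0.0, 1.0, 0.0], [2.0, 2.0, 1.0]]
eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
out, attn = spectral_attention(x, eye, eye, eye)
print(len(attn), len(attn[0]))  # 3 3
```

Because the attention map scales with the number of channels rather than the number of pixels, its size stays fixed as the spatial resolution grows, which is consistent with the efficiency claim in the abstract.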
