We present a virtual image refocusing method over an extended depth of field (DOF) enabled by cascaded neural networks and a double-helix point-spread function (DH-PSF). This network model, referred to as W-Net, is composed of two cascaded generator and discriminator network pairs. The first generator network learns to virtually refocus an input image onto a user-defined plane, while the second generator learns to perform a cross-modality image transformation, improving the lateral resolution of the output image. Using this W-Net model with DH-PSF engineering, we extend the DOF of a fluorescence microscope by ~20-fold. This approach can be applied to develop deep learning-enabled image reconstruction methods for localization microscopy techniques that utilize engineered PSFs to improve their imaging performance, including spatial resolution and volumetric imaging throughput.

Super-resolution imaging1–6 and high-throughput volumetric7–10 fluorescence microscopy provide unprecedented access to submicron-scale phenomena in fields ranging from the life sciences to engineering. However, improvements in imaging resolution and throughput require relatively complex optical setups, usually involving a time-consuming mechanical scanning procedure, which may also entail additional digital image registration and stitching steps10–12. A prominent method for super-resolution imaging is localization microscopy1–3,13. Point spread function (PSF) engineering, including the use of astigmatic, multi-plane, double-helix (DH), and tetrapod PSFs, has been successfully used to improve the spatial resolution and depth of field (DOF) in localization microscopy5,6,14–17. However, reconstructing a sample's image that has been convolved with an engineered PSF generally requires sparsity of the emitters. Even for state-of-the-art localization algorithms18, it is challenging to perform fast, accurate three-dimensional (3D) localization over an extended axial range as the emitter density increases.

Recently emerging data-driven image reconstruction approaches have demonstrated performance advances in solving inverse problems across various microscopic imaging modalities19–21. Some of these methods accelerate the 3D imaging process, mitigating potential phototoxicity and photobleaching of the sample while also improving image resolution and throughput22–31. In particular, recent studies have demonstrated successful applications of deep learning for advancing 3D fluorescence microscopy. For example, Boyd et al. proposed DeepLoco29, which uses a kernel-based loss function to outperform traditional 3D localization algorithms in both speed and accuracy. Zhang et al. developed smNet28, which can extract not only the 3D locations of fluorescence emitters but also their orientations and potential wave-front distortions. Nehme et al. demonstrated DeepSTORM3D27, which jointly optimizes the imaging PSF and the corresponding localization algorithm, extending the axial localization range up to 4 μm using a 1.45-NA/100× objective lens. Wu et al. introduced Deep-Z25, a deep learning-based virtual refocusing method that uses a user-defined digital propagation matrix to refocus a given input image onto an arbitrary surface within the sample volume, significantly improving the DOF and imaging throughput from a single input image. An extension of the same virtual refocusing approach using multiple input images has also been demonstrated for 3D volumetric imaging with recurrent neural networks30.
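As a conceptual illustration of the cascaded two-generator design summarized above, the following PyTorch sketch chains a refocusing generator and a cross-modality generator in a single forward pass. The class names, layer choices, and channel counts are illustrative placeholders rather than the authors' implementation, and the two adversarial discriminators are omitted.

```python
# Conceptual sketch of the cascaded W-Net forward pass (PyTorch). The class
# names, layer choices, and channel counts are illustrative placeholders, not
# the authors' implementation; the two adversarial discriminators are omitted.
import torch
import torch.nn as nn

class UNetGenerator(nn.Module):
    """Stand-in for a U-Net generator (a real U-Net adds skip connections)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class WNet(nn.Module):
    def __init__(self):
        super().__init__()
        # GR: refocusing generator, takes [DH-PSF image, DPM] as two channels
        self.g_refocus = UNetGenerator(in_ch=2, out_ch=1)
        # Second generator: cross-modality transform to a confocal-like image
        self.g_crossmod = UNetGenerator(in_ch=1, out_ch=1)

    def forward(self, image, dpm):
        x = torch.cat([image, dpm], dim=1)   # DPM appended as a second channel
        refocused = self.g_refocus(x)        # virtual refocusing step
        output = self.g_crossmod(refocused)  # refocused output fed directly onward
        return refocused, output             # both terms enter the LR and LC losses
```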
Figure 1. Structure of W-Net, containing two cascaded neural networks: (1) a virtual image refocusing network and (2) a cross-modality image transformation network optimized for the DH-PSF. We use a joint training method for these two cascaded networks, where the output of the refocusing network is directly fed into the cross-modality image transformation network, and the losses of the refocusing (LR) and cross-modality transformation (LC) networks are minimized simultaneously. GR is the generator of the refocusing network. The image intensity is log-scaled for better contrast. Discriminators are not shown for simplicity (detailed in the Supporting Information).

In this Letter, we present a deep learning-based method, referred to as W-Net (Figure 1), that performs both virtual refocusing and cross-modality image transformation of a single fluorescence microscopy image (input), acquired using an engineered PSF (here, the double-helix PSF: DH-PSF5), onto user-defined planes within the sample volume. We trained our W-Net model as two cascaded neural networks to (1) virtually refocus a PSF-engineered input image onto desired planes within the sample volume and (2) perform image reconstruction at each virtually refocused plane based on a cross-modality transformation method24, which yields an output image equivalent to, e.g., a confocal fluorescence microscopy image of the same sample. The second step computationally resolves the spatial features of the sample convolved with the DH-PSF. Unlike standard iterative deconvolution techniques that can be applied to images acquired with an engineered PSF, the presented method is based on a single forward pass through a neural network and, owing to its digital refocusing capability, requires no iterations or mechanical scanning for 3D imaging of a sample.

Our W-Net design contains two cascaded U-Net structures32, trained using a conditional generative adversarial network (cGAN)33, as shown in Figure 1. Along with the DH-PSF input image, the first U-Net also receives, as a second input channel, a user-defined digital propagation matrix (DPM) of the same size as the first channel. Each pixel value in the DPM determines the axial propagation distance of the corresponding pixel of the input image; applying a series of DPMs to a single input image is therefore equivalent to virtually scanning the specimen's volume (see the sketch below). In this study, we used experimentally acquired images for training and blind testing of W-Net in order to accurately capture the various complexities introduced by nonideal experimental conditions19,24. Using this new computational framework together with the DH-PSF, we digitally extended the DOF of the imaging system approximately 20-fold, as demonstrated by imaging nanoparticles. The presented method can be broadly applied to advance localization microscopy techniques that utilize engineered PSFs by merging virtual refocusing with rapid volumetric image reconstruction.
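A minimal sketch of this DPM-based virtual scanning follows, assuming the hypothetical WNet module from the previous snippet; the mapping from a physical defocus distance in micrometers to a normalized DPM pixel value is a placeholder, not the paper's actual encoding.

```python
# Sketch of DPM-based virtual scanning. Assumptions: the hypothetical WNet
# module from the previous snippet; make_dpm's normalization of a physical
# defocus distance (um) to a DPM pixel value is a placeholder.
import torch

def make_dpm(z_um, shape, z_range_um=10.0):
    """Uniform DPM that refocuses the whole field of view to one plane."""
    return torch.full(shape, z_um / z_range_um)  # hypothetical normalization

model = WNet().eval()                    # WNet defined in the earlier sketch
image = torch.rand(1, 1, 256, 256)       # a single DH-PSF encoded input image

with torch.no_grad():
    stack = []
    for z in torch.linspace(-6.0, 6.0, 25):      # virtual scan, -6 to +6 um
        dpm = make_dpm(z.item(), image.shape)
        _, confocal_like = model(image, dpm)     # refocus, then cross-modality
        stack.append(confocal_like)
    volume = torch.cat(stack, dim=0)     # (25, 1, H, W) virtual z-stack
```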
To demonstrate the extended DOF and the DH-PSF reconstruction capability of our neural network model, we trained a W-Net model whose input images (of 50-nm fluorescent nanobeads) were acquired using the DH-PSF through a 63×/1.4-NA oil-immersion objective lens (see the Methods section for details of the microscopy system); the native DOF of this objective lens is ~±0.15 μm. In addition to this W-Net model trained with DH-PSF input images, we also trained a second W-Net model (for comparison) using wide-field images (as input) acquired on the same fluorescence microscope (63×/1.4-NA objective) with the DH-PSF phase mask removed. The ground-truth volumetric image data corresponding to the same samples used for both W-Net models were acquired through a confocal microscope using the same objective lens (see the Methods section).

During the training phase of each W-Net model, we utilized digital propagation matrices such that both the input and target images were randomly defocused. At the blind testing stage, to quantify the W-Net image inference, we designed the DPMs to virtually refocus input images with different defocus distances onto a selected target plane (defined as z = 0 μm). We evaluated the quality of the W-Net output images refocused to z = 0 μm by calculating the image correlation coefficient between the output and ground-truth (confocal) images over a region of 73.7 × 73.7 μm containing 75 individual nanobeads (Figure 2a). Furthermore, we used a customized localization algorithm together with the Jaccard index (JI) and lateral root-mean-square error (RMSE) metrics to quantify the localization performance of the W-Net output images, which also allowed us to measure the effective DOF (see the details in the Supporting Information). The results of these experimental analyses are summarized in Figures 2 and 3.

Figure 2. Quantification of (a–c) the extended DOF enabled by W-Net and DH-PSF engineering and (d–f) comparison to U-Net. The lateral RMSE is shown only when beads are detected (JI > 0; dot–dashed vertical lines in b and c in the corresponding colors). In the left panels, the blue and red curves represent the W-Net outputs with DH-PSF and wide-field inputs, respectively. The yellow curves represent the results for out-of-focus wide-field images obtained by mechanical scanning, which also correspond to the inputs used for the red curves. The opaque yellow regions in b and c represent the native DOF defined by the objective lens, while the opaque blue region represents the extended DOF of the W-Net output. (d–f) Comparison of W-Net to a single U-Net model. All metrics are calculated using confocal images as the ground truth over a 73.7 × 73.7 μm region containing 75 nanobeads.

A direct comparison of the image correlation coefficients qualitatively illustrates the extended DOF of the W-Net output (Figure 2a). Using input images captured with the DH-PSF, the W-Net output significantly improves the imaging performance over the axial range of −6 to +6 μm. Without the DH-PSF, W-Net still successfully improves the image correlation coefficient, but the performance of this model deteriorates outside the range of ±3 μm. To quantify the effective DOF, we used a localization analysis of the output images with confocal images serving as the reference (detailed in the Supporting Information); furthermore, we compared the JI and lateral RMSE of the localization results to assess detectability and localization accuracy, respectively (Figure 2b,c).
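For reference, the evaluation metrics described above can be computed as in the following sketch. The greedy fixed-radius matching of predicted to ground-truth bead positions (and its tolerance value) is a simplified stand-in for the customized localization analysis detailed in the Supporting Information.

```python
# Sketch of the evaluation metrics: image correlation coefficient against the
# confocal ground truth, plus Jaccard index (JI) and lateral RMSE from matched
# bead localizations. The greedy fixed-radius matching below is a simplified
# stand-in for the paper's customized localization analysis.
import numpy as np

def correlation_coefficient(output, ground_truth):
    """Pearson correlation between two images of identical shape."""
    a = output.ravel() - output.mean()
    b = ground_truth.ravel() - ground_truth.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_and_score(pred_xy, true_xy, radius_um=0.5):
    """JI and lateral RMSE from (N, 2) arrays of bead positions (micrometers)."""
    unmatched = list(range(len(true_xy)))
    errors = []
    for p in pred_xy:
        if not unmatched:
            break
        d = np.linalg.norm(true_xy[unmatched] - p, axis=1)
        j = int(np.argmin(d))
        if d[j] <= radius_um:                 # hypothetical matching tolerance
            errors.append(d[j])
            unmatched.pop(j)
    tp = len(errors)
    fp, fn = len(pred_xy) - tp, len(true_xy) - tp
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    rmse = float(np.sqrt(np.mean(np.square(errors)))) if tp else float("nan")
    return jaccard, rmse
```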
[1] G. Zack et al., "Automatic measurement of sister chromatid exchange frequency," Journal of Histochemistry and Cytochemistry, 1977.
[2] Qionghai Dai et al., "Video-rate imaging of biological dynamics at centimetre scale and micrometre resolution," Nature Photonics, 2019.
[3] Samuel J. Lord et al., "Three-dimensional, single-molecule fluorescence imaging beyond the diffraction limit by using a double-helix point spread function," Proceedings of the National Academy of Sciences, 2009.
[4] Aydogan Ozcan et al., "Single-shot autofocusing of microscopy images using deep learning," arXiv, 2020.
[5] Loic A. Royer et al., "Applications, Promises, and Pitfalls of Deep Learning for Fluorescence Image Reconstruction," 2018.
[6] Lucien E. Weiss et al., "Precise Three-Dimensional Scan-Free Multiple-Particle Tracking over Large Axial Ranges with Tetrapod Point Spread Functions," Nano Letters, 2015.
[7] J. Lippincott-Schwartz et al., "Imaging Intracellular Fluorescent Proteins at Nanometer Resolution," Science, 2006.
[8] Michael J. Rust et al., "Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM)," Nature Methods, 2006.
[9] R. Prevedel et al., "Brain-wide 3D imaging of neuronal activity in Caenorhabditis elegans with sculpted light," Nature Methods, 2013.
[10] Lucien E. Weiss et al., "DeepSTORM3D: dense 3D localization microscopy and PSF design by deep learning," Nature Methods, 2020.
[11] Joshua W. Shaevitz et al., "Whole-brain calcium imaging with cellular resolution in freely behaving Caenorhabditis elegans," Proceedings of the National Academy of Sciences, 2015.
[12] S. Hess et al., "Three-dimensional sub-100 nm resolution fluorescence microscopy of thick samples," Nature Methods, 2008.
[13] Joe Chalfoun et al., "MIST: Accurate and Scalable Microscopy Image Stitching Tool with Stage Modeling and Error Minimization," Scientific Reports, 2017.
[14] L. Holtzer et al., "Nanometric three-dimensional tracking of individual quantum dots in cells," 2007.
[15] Michael D. Mason et al., "Ultra-high resolution imaging by fluorescence photoactivation localization microscopy," Biophysical Journal, 2006.
[16] Alexander Y. Katsov et al., "Fast multicolor 3D imaging using aberration-corrected multifocus microscopy," 2012.
[17] Yair Rivenson et al., "Recurrent neural network-based volumetric fluorescence microscopy," Light: Science & Applications, 2020.
[18] Aydogan Ozcan et al., "Three-dimensional virtual refocusing of fluorescence microscopy images using deep learning," Nature Methods, 2019.
[19] S. Hell et al., "Breaking the diffraction resolution limit by stimulated emission: stimulated-emission-depletion fluorescence microscopy," Optics Letters, 1994.
[20] Adam S. Backer et al., "Optimal point spread function design for 3D imaging," Physical Review Letters, 2014.
[21] A. Radenovic et al., "Progress in quantitative single-molecule localization microscopy," Histochemistry and Cell Biology, 2014.
[22] Charles R. Gerfen et al., "Reconstruction of 1,000 Projection Neurons Reveals New Cell Types and Organization of Long-Range Connectivity in the Mouse Brain," Cell, 2019.
[23] Kwanghun Chung et al., "Multiplexed and scalable super-resolution imaging of three-dimensional protein localization in size-adjustable tissues," Nature Biotechnology, 2016.
[24] Michael Unser et al., "Publisher Correction: Super-resolution fight club: assessment of 2D and 3D single-molecule localization microscopy software," Nature Methods, 2019.
[25] Simon Osindero et al., "Conditional Generative Adversarial Nets," arXiv, 2014.
[26] A. Ozcan et al., "Deep learning enables cross-modality super-resolution in fluorescence microscopy," Nature Methods, 2018.
[27] Aydogan Ozcan et al., "Deep-Learning-Based Image Reconstruction and Enhancement in Optical Microscopy," Proceedings of the IEEE, 2020.
[28] W. E. Moerner et al., "Deep learning in single-molecule microscopy: fundamentals, caveats, and recent developments [Invited]," Biomedical Optics Express, 2020.
[29] Aydogan Ozcan et al., "Cross-modality super-resolution in fluorescence microscopy enabled by generative adversarial networks (Conference Presentation)," Optical Sensing, Imaging, and Photon Counting: From X-Rays to THz, 2019.
[30] Mark Bates et al., "Three-Dimensional Super-Resolution Imaging by Stochastic Optical Reconstruction Microscopy," Science, 2008.
[31] E. Culurciello et al., "Analyzing complex single molecule emission patterns with deep learning," Nature Methods, 2018.
[32] Yibo Zhang et al., "Deep Learning Microscopy," arXiv, 2017.
[33] A. Ozcan et al., "On the use of deep learning for computational imaging," Optica, 2019.
[34] Thomas Brox et al., "U-Net: Convolutional Networks for Biomedical Image Segmentation," MICCAI, 2015.