1. School of Advanced Manufacturing, Fuzhou University, Quanzhou 362252, Fujian, China
2. Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou 350116, Fujian, China
3. College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, Fujian, China
Received: 13 December 2023
Revised: 20 February 2024
Published: 25 June 2024
林坚普, 吴镇城, 王崑赋, 等. 级联残差优化Transformer网络的图像超分辨率重建[J]. 光学精密工程, 2024, 32(12): 1902-1914. DOI: 10.37188/OPE.20243212.1902.
LIN Jianpu, WU Zhencheng, WANG Kunfu, et al. Cascade residual-optimized image super-resolution reconstruction in Transformer network[J]. Optics and Precision Engineering, 2024, 32(12): 1902-1914. DOI: 10.37188/OPE.20243212.1902.
为了扩展图像超分辨率算法中卷积神经网络在多个尺度特征上的自适应学习能力,提升网络性能,本文提出一种基于级联残差方法的Transformer网络优化结构进行图像超分辨率重建。首先,该网络采用级联残差结构,增强了网络对低阶和中阶特征的迭代复用和信息共享能力;其次,将通道注意力机制引入Transformer结构中,增强网络的特征表达和自适应学习通道权重的能力;最后,优化Transformer网络结构中的感知模块为级联感知模块,扩展网络深度,增强模型的特征表达能力。在数据集Set5,Set14,BSD100,Urban100和Manga109上进行放大2倍、3倍和4倍的重建测试并与主流方法进行对比,客观评价结果表明,在4倍放大因子的Set5数据集下,本文方法所得图像的峰值信噪比对比其他主流方法平均值提升1.14 dB,结构相似度平均值提升0.019。结合主观评价结果表明,本文方法相比其他主流方法的图像重建效果更好,恢复得到的图像纹理细节更清晰。
To extend the adaptive learning ability of convolutional neural networks over multi-scale features in image super-resolution and to improve network performance, this paper proposes a Transformer network structure optimized with a cascaded residual method for image super-resolution reconstruction. First, the network adopts a cascaded residual structure, which strengthens the iterative reuse of and information sharing among low- and mid-level features. Second, a channel attention mechanism is introduced into the Transformer structure to enhance feature representation and the ability to adaptively learn channel weights. Finally, the perception module in the Transformer structure is redesigned as a cascaded perception module, which deepens the network and enhances the feature representation capability of the model. Reconstruction tests at 2x, 3x and 4x magnification were carried out on the Set5, Set14, BSD100, Urban100 and Manga109 datasets and compared with mainstream methods. The objective evaluation shows that, on Set5 at a 4x magnification factor, the peak signal-to-noise ratio of the images produced by the proposed method is 1.14 dB higher on average than that of other mainstream methods, and the structural similarity is 0.019 higher on average. Together with the subjective evaluation, the results show that the proposed method reconstructs images better than other mainstream methods and recovers clearer texture details.
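The abstract summarizes the architecture but gives no implementation details. As a rough illustration only, the sketch below (PyTorch; all module names, dimensions and wiring are assumptions rather than the authors' implementation) shows one way a Transformer block could combine self-attention with squeeze-and-excitation-style channel attention and a cascaded, residually linked MLP, which are the ingredients named in the abstract.

```python
# Illustrative sketch only: module names, dimensions and wiring are assumptions,
# not the authors' released implementation.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # global spatial squeeze
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(self.pool(x))               # reweight channels adaptively


class TransformerBlockWithCA(nn.Module):
    """Transformer block augmented with channel attention and a cascaded MLP.

    Tokens are shaped (B, N, C); channel attention is applied on the
    (B, C, H, W) view of the token sequence.
    """
    def __init__(self, dim: int = 64, heads: int = 4, mlp_ratio: int = 2):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ca = ChannelAttention(dim)
        self.norm2 = nn.LayerNorm(dim)
        # Stand-in for the cascaded perception module: two stacked MLPs
        # joined by a residual link.
        self.mlp1 = nn.Sequential(nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
                                  nn.Linear(dim * mlp_ratio, dim))
        self.mlp2 = nn.Sequential(nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
                                  nn.Linear(dim * mlp_ratio, dim))

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # Self-attention branch with a residual connection.
        y = self.norm1(x)
        a, _ = self.attn(y, y, y)
        x = x + a
        # Channel attention on the spatial view of the tokens.
        b, n, c = x.shape
        feat = x.transpose(1, 2).reshape(b, c, h, w)
        x = self.ca(feat).reshape(b, c, n).transpose(1, 2)
        # Cascaded MLP with residual links.
        y = self.norm2(x)
        x = x + self.mlp2(self.mlp1(y) + y)
        return x


if __name__ == "__main__":
    block = TransformerBlockWithCA(dim=64)
    tokens = torch.randn(1, 48 * 48, 64)               # 48x48 feature map, 64 channels
    print(block(tokens, h=48, w=48).shape)             # torch.Size([1, 2304, 64])
```

In a Swin-style super-resolution network the plain multi-head attention above would be replaced by window-based attention and the block would sit inside cascaded residual groups; the sketch only fixes the data flow of channel attention and the cascaded MLP.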
卷积神经网络; 图像超分辨率重建; 残差网络; Transformer; 注意力机制
convolutional neural network; image super-resolution reconstruction; residual network; Transformer; attention mechanism