National Key Discipline Laboratory of Armament Launch Theory and Technology, Rocket Force University of Engineering, Xi'an 710025, Shaanxi, China
CAI Wei, JIANG Bo, JIANG Xinhao, et al. Infrared image generation with unpaired training samples[J]. Optics and Precision Engineering, 2023, 31(24): 3651-3661. DOI: 10.37188/OPE.20233124.3651.
To address the difficulty of constructing measured infrared image datasets and the high cost of test production, this paper proposes VTIGAN, a generative adversarial network for unpaired training samples that achieves high-quality visible-to-infrared (VTI) image translation across different scenes. VTIGAN introduces a new generator built on the transformer module to learn the content mapping between images and performs style transfer by recombining the characteristics of the target style, while a PatchGAN discriminator strengthens the model's ability to generate fine image details. The training process is constrained by a combination of four loss functions: adversarial loss, multi-layer contrastive loss, style-similarity loss, and identity loss. VTIGAN was compared extensively with other mainstream algorithms on a visible-infrared dataset, using peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and Fréchet inception distance (FID) for quantitative evaluation together with subjective qualitative assessment. The results show that, relative to the second-best algorithm, UGATIT, VTIGAN improves PSNR, SSIM, and FID by 3.1%, 2.8%, and 11.3%, respectively. It effectively realizes visible-to-infrared image translation under unpaired training samples, is more robust to interference in complex scenes, and generates infrared images with high clarity, complete detail, and strong realism.
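The abstract names the four training losses but not how they are combined. The following is a minimal PyTorch-style sketch of one plausible way to assemble them into a single generator objective; the module names (generator, discriminator, nce_loss, style_loss) and the lambda_* weights are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn.functional as F

def vti_generator_loss(generator, discriminator, nce_loss, style_loss,
                       real_vis, real_ir,
                       lambda_nce=1.0, lambda_style=1.0, lambda_idt=1.0):
    """Combine the four losses named in the abstract into one objective (sketch)."""
    fake_ir = generator(real_vis)                 # visible -> infrared translation
    pred_fake = discriminator(fake_ir)
    # Adversarial loss (least-squares form): the generator tries to make the
    # discriminator score the translated image as real.
    adv = F.mse_loss(pred_fake, torch.ones_like(pred_fake))
    # Multi-layer contrastive loss: corresponding patches of the input and the
    # translated image should share content features (assumed nce_loss module).
    contrastive = nce_loss(real_vis, fake_ir)
    # Style-similarity loss: pull the translated image toward the statistics
    # of real infrared images (assumed style_loss module).
    style = style_loss(fake_ir, real_ir)
    # Identity loss: a real infrared image fed to the generator should pass
    # through unchanged, which stabilizes the learned mapping.
    identity = F.l1_loss(generator(real_ir), real_ir)
    return adv + lambda_nce * contrastive + lambda_style * style + lambda_idt * identity
```

In a full training loop the discriminator would be updated with the complementary adversarial term, and the loss weights would need to be tuned per dataset; the sketch only illustrates how the four constraints described above can act on the generator simultaneously.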
Keywords: image processing; infrared image simulation; image style transfer; generative adversarial network; transformer
LATGER J, CATHALA T, DOUCHIN N, et al. Simulation of active and passive infrared images using the SE-WORKBENCH[C]. Defense and Security Symposium, Proc. SPIE 6543, Infrared Imaging Systems: Design, Analysis, Modeling, and Testing XVIII, Orlando, Florida, USA, 2007: 11-25. doi: 10.1117/12.724822
JOHNSON K, CURRAN A, LESS D, et al. MuSES: a new heat and signature management design tool for virtual prototyping[C]. Proceedings of the 9th Annual Ground Target Modelling & Validation Conference. 1998.
SCHOTT J R, BROWN S D, RAQUEÑO R V, et al. An advanced synthetic image generation model and its application to multi/hyperspectral algorithm development[J]. Canadian Journal of Remote Sensing, 1999, 25(2): 99-111. doi: 10.1080/07038992.1999.10874709
YANG Y C, GAO X Y, DANG J W, et al. Infrared and visible image fusion based on WEMD and generative adversarial network reconstruction[J]. Opt. Precision Eng., 2022, 30(3): 320-330. (in Chinese) doi: 10.37188/OPE.20223003.0320
YANG Z K, BU L P, WANG T, et al. Scene migration of indoor flame image based on cycle-consistent adversarial networks[J]. Opt. Precision Eng., 2020, 28(3): 745-758. (in Chinese) doi: 10.3788/ope.20202803.0745
XU Z, GENG J, JIANG W, et al. Co-training generative adversarial networks for semi-supervised classification method[J]. Opt. Precision Eng., 2021, 29(5): 1127-1135. (in Chinese) doi: 10.37188/OPE.20212905.1127
ISOLA P, ZHU J Y, ZHOU T H, et al. Image-to-image translation with conditional adversarial networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA. IEEE, 2017: 5967-5976. doi: 10.1109/cvpr.2017.632
ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy. IEEE, 2017: 2242-2251. doi: 10.1109/iccv.2017.244
KIM T, CHA M, KIM H, et al. Learning to discover cross-domain relations with generative adversarial networks[C]. Proceedings of the 34th International Conference on Machine Learning, Sydney, NSW, Australia. New York: ACM, 2017: 1857-1865.
YI Z L, ZHANG H, TAN P, et al. DualGAN: unsupervised dual learning for image-to-image translation[C]. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy. IEEE, 2017: 2868-2876. doi: 10.1109/iccv.2017.310
CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]. Proceedings of the 37th International Conference on Machine Learning. New York: ACM, 2020: 1597-1607.
DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. 2020: arXiv: 2010.11929. https://arxiv.org/abs/2010.11929
CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA. IEEE, 2017: 1800-1807. doi: 10.1109/cvpr.2017.195
GUTMANN M, HYVÄRINEN A. Noise-contrastive estimation: a new estimation principle for unnormalized statistical models[C]. Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2010: 297-304.
PANG Y, LIN J, QIN T, et al. Image-to-image translation: methods and applications[EB/OL]. 2021: arXiv: 2101.08629. https://arxiv.org/abs/2101.08629. doi: 10.1109/tmm.2021.3109419
WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612. doi: 10.1109/tip.2003.819861
CHE Z P, PURUSHOTHAM S, CHO K, et al. Recurrent neural networks for multivariate time series with missing values[J]. Scientific Reports, 2018, 8: 6085. doi: 10.1038/s41598-018-24271-9
HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium[EB/OL]. 2017: arXiv: 1706.08500. https://arxiv.org/abs/1706.08500
CHANG H Y, WANG Z X, CHUANG Y Y. Domain-specific mappings for generative adversarial style transfer[M]. Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 573-589. doi: 10.1007/978-3-030-58598-3_34
KIM J, KIM M, KANG H, et al. U-GAT-IT: unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation[EB/OL]. 2019: arXiv: 1907.10830. https://arxiv.org/abs/1907.10830
YANG G, TANG H, SHI H, et al. Global and local alignment networks for unpaired image-to-image translation[EB/OL]. 2021: arXiv: 2111.10346. https://arxiv.org/abs/2111.10346
PARK T, EFROS A A, ZHANG R, et al. Contrastive learning for unpaired image-to-image translation[EB/OL]. 2020: arXiv: 2007.15651. https://arxiv.org/abs/2007.15651. doi: 10.1007/978-3-030-58545-7_19