1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730000, Gansu, China
2. School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
[ "张家骏(1994-),男,甘肃武威人,硕士研究生,2014年于青海民族大学获得学士学位,研究方向为基于深度学习的图像修复。 E-mail: zhangjiajunwork@163.com" ]
[ "廉 敬(1983-),男,甘肃兰州人,博士,教授,2010年于兰州交通大学获得硕士学位,2017年于兰州大学获得博士学位,主要从事图像处理与模式识别研究。 E-mail: lian322scc@163.com" ]
Print publication date: 2024-02-25
Received: 2023-07-19
Revised: 2023-09-05
ZHANG Jiajun, LIAN Jing, LIU Jizhao, et al. Using image smoothing structure information to guide image inpainting[J]. Optics and Precision Engineering, 2024, 32(4): 549-564. DOI: 10.37188/OPE.20243204.0549.
Using image structure features to guide image inpainting is a method that has emerged in recent years with the widespread application of deep learning. It can generate plausible content within missing regions, but the inpainting results rely heavily on the extracted structure: during training, errors propagate and accumulate, so any noise or distortion in the structure directly degrades the quality of the generated image. The approach is still at an exploratory stage and faces challenges such as difficult network training, poor robustness, and semantically inconsistent context in generated images. To address these issues, this paper proposes a parallel network for image inpainting guided by smooth image structures. The content generated for the smooth structure is not fed directly into the next-level network; instead, it only provides guidance information to the decoding layers. In addition, to better match and balance the feature relationship between the structure and the image, a transformer-based multi-scale feature guidance module is proposed. The module exploits the transformer's strong ability to model global feature dependencies to match and balance features between the structure and the image texture. Experimental results show that the proposed method effectively restores missing image content on three commonly used datasets and can serve as an image editing tool for object removal.
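To make the guidance mechanism described above concrete, the following is a minimal, hypothetical PyTorch sketch, not the authors' released implementation. The module name FeatureGuidance and parameters such as channels and num_heads are assumptions; it only illustrates one way that structure-branch features could guide image-branch decoder features through cross-attention at a single scale.

# Hypothetical sketch (not the authors' code): cross-attention guidance of
# image-branch decoder features by smooth-structure-branch features.
import torch
import torch.nn as nn

class FeatureGuidance(nn.Module):
    """Fuse structure features into image (texture) decoder features at one scale."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.norm_img = nn.LayerNorm(channels)
        self.norm_str = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.proj = nn.Linear(channels, channels)

    def forward(self, img_feat: torch.Tensor, str_feat: torch.Tensor) -> torch.Tensor:
        # img_feat, str_feat: (B, C, H, W) feature maps of the same spatial size.
        b, c, h, w = img_feat.shape
        q = self.norm_img(img_feat.flatten(2).transpose(1, 2))   # (B, HW, C) queries from image features
        kv = self.norm_str(str_feat.flatten(2).transpose(1, 2))  # (B, HW, C) keys/values from structure features
        fused, _ = self.attn(q, kv, kv)                          # texture tokens attend to global structure tokens
        fused = self.proj(fused).transpose(1, 2).reshape(b, c, h, w)
        return img_feat + fused                                  # residual injection of guidance information

# Usage sketch: one such module per decoder scale gives multi-scale guidance.
guide = FeatureGuidance(channels=64)
img_feat = torch.randn(1, 64, 32, 32)    # decoder feature map from the image branch
str_feat = torch.randn(1, 64, 32, 32)    # feature map from the smooth-structure branch
out = guide(img_feat, str_feat)          # shape: (1, 64, 32, 32)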
Keywords: image inpainting; deep learning; smooth structure; transformer