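1. Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou 730030, Gansu
2. School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou 730030, Gansu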
Received: 22 August 2022
Revised: 28 October 2022
Published: 25 June 2023
HU Wenjin, TANG Huiyuan, YUE Chaoyang, et al. Few-shot object detection on Thangka via multi-scale context information[J]. Optics and Precision Engineering, 2023, 31(12): 1859-1869. DOI: 10.37188/OPE.20233112.1859.
Classifying and locating objects of interest in Thangka images helps people understand their rich semantic content and supports cultural preservation. Thangka images pose several difficulties for detection: few labeled samples, complex backgrounds, occluded targets, and low detection accuracy. To address these problems, this paper proposes a few-shot object detection algorithm for Thangka images that combines multi-scale context information with dual attention guidance. First, a new multi-scale feature pyramid is constructed to learn multi-level features and contextual information from Thangka images, improving the model's ability to discriminate targets at different scales. Second, a dual attention guidance module is added at the end of the feature pyramid to strengthen the representation of key features while reducing the influence of noise. Finally, Rank & Sort Loss replaces the cross-entropy classification loss, which simplifies model training and improves detection accuracy. In 10-shot experiments, the proposed method achieves a mean average precision of 19.7% on the Thangka dataset and 11.2% on the COCO dataset.
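The abstract replaces a per-example cross-entropy loss with a ranking-based one. As a rough illustration of what a ranking-based objective measures, the sketch below computes a hard-step ranking error: for each positive, the fraction of examples ranked at or above it that are negatives. This is not the paper's implementation and not the full Rank & Sort Loss, which additionally sorts positives by localization quality; neither that sorting term nor any training logic is shown. The function name `ranking_error` and the toy scores are assumptions made for illustration.

```python
def ranking_error(scores, labels):
    """Mean ranking error over positives (hypothetical helper, illustration only).

    For each positive i the error is N_FP(i) / rank(i), where N_FP(i) counts
    negatives scored strictly higher than i, and rank(i) is the 1-based
    position of i among all examples (ties are not counted as ranked above).
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        return 0.0  # assumption: with no positives there is nothing to rank

    total = 0.0
    for i in positives:
        # Indices of every example scored strictly higher than positive i.
        higher = [j for j, s in enumerate(scores) if s > scores[i]]
        n_fp = sum(1 for j in higher if labels[j] == 0)
        rank = len(higher) + 1
        total += n_fp / rank
    return total / len(positives)


# Toy classification scores for four candidate boxes (made-up values).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]  # 1 = foreground (positive), 0 = background
print(ranking_error(scores, labels))
```

An error of 0 means every positive outranks every negative. Cross-entropy, by contrast, penalizes each score independently of how the other candidates are scored.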