Scene perception and classification of forward-looking infrared (FLIR) images is a key technology in target recognition and is of great significance to infrared reconnaissance and guidance. To address this problem, this study proposes a multi-label infrared image classification algorithm based on weakly supervised learning. First, multi-label image classification is applied to FLIR images: images containing multiple scenes are annotated using weakly supervised techniques, and infrared image features are extracted using the ResNet-50 residual network. Second, a class-specific residual attention (CSRA) module is introduced to capture the different spatial regions occupied by different classes. The CSRA module improves feature expression and enables inference of the topological relationships between multiple labels. Finally, the asymmetric loss (ASL) function is introduced to address the imbalance between the numbers of positive and negative labels in multi-label classification. ASL limits the contribution of negative samples to the loss and focuses training on the positive samples. Experimental results show that the algorithm has good adaptability and that its multi-label classification accuracy can exceed 90%.
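The way an asymmetric loss limits the contribution of negative labels can be illustrated with a minimal sketch. The code below follows the general form of the asymmetric loss (separate focusing exponents for positive and negative labels, plus a probability margin that zeroes out easy negatives); it is not the study's implementation. The function names and the default values of `gamma_pos`, `gamma_neg`, and `margin` are assumptions chosen for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for one image.

    logits  : raw scores, one per label
    targets : 1 for labels present in the image, 0 otherwise
    All default hyperparameters are illustrative assumptions.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        if y == 1:
            # Positive label: small focusing exponent keeps its gradient strong.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin and
            # clamp at zero, so confidently rejected negatives contribute nothing.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


# One image with four labels, only the first of which is present.
example_logits = [2.0, -5.0, 0.0, -1.0]
example_targets = [1, 0, 0, 0]
example_loss = asymmetric_loss(example_logits, example_targets)
```

With both exponents and the margin set to zero, the function reduces to ordinary binary cross-entropy, which makes it easy to compare how much the negative labels are down-weighted under the asymmetric setting.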