Aerial image classification is a challenging task because of the complexity of image backgrounds and the diversity of object categories. To address the low accuracy and poor generalization of traditional multi-label aerial image classification methods, a method based on recurrent neural networks was proposed. In this method, a superpixel segmentation algorithm was first used to obtain the low-level features of the image, from which an attention map was generated. Subsequently, the best image scale was obtained by cross-validation, and multi-scale attention feature maps were embedded into a convolutional neural network to extract the features of the image. Finally, to mine the correlation between labels, an improved bidirectional Long Short-Term Memory (LSTM) network was proposed. It adds a connection from the input gate to the output gate so that the input state can efficiently control the output information of each memory unit, and it combines the forget gate and the input gate into a single update gate so that the network can learn long-term historical information. Results obtained by applying the proposed method to the UCM multi-label dataset indicate that, for scale values of 1, 1.3, and 2, the accuracy and recall rates of the model are 85.33% and 87.05%, respectively, while the F1 score reaches 0.862. The accuracy and recall rates are higher than those of the VGGNet16 model by 7.25% and 8.94%, respectively. The experimental results thus indicate that the proposed method can effectively increase the accuracy of multi-label aerial image classification.
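
The reported figures can be cross-checked with a short calculation. The sketch below assumes that the "accuracy" rate quoted above is used in the sense of precision, so that the F1 score is the harmonic mean of the two rates. The function name `f1_score` is illustrative only and does not come from the original work.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        # Conventionally define F1 as 0 when both rates are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rates reported in the abstract, converted from percentages to fractions.
# Assumption: the reported "accuracy" of 85.33% plays the role of precision.
reported_precision = 0.8533
reported_recall = 0.8705

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))
```

Under this assumption, the computed value rounds to the F1 score of 0.862 stated in the abstract, which suggests that the three reported figures are mutually consistent.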