Scene classification of remote sensing images based on multiscale features fusion

Zhou YANG; Shu-yang WANG; Chen-hui MA

doi:10.3788/OPE.20182612.3099


Information science | Updated: 2020-07-05

- Optics and Precision Engineering Vol. 26, Issue 12, Pages: 3099-3107 (2018)
- Author affiliation: Department of Information Engineering, Rocket Force University of Engineering, Xi'an, Shaanxi 710025
- DOI: 10.3788/OPE.20182612.3099
  CLC: TP751
- Received: 13 March 2018, Accepted: 06 May 2018, Published: 25 December 2018
Zhou YANG, Shu-yang WANG, Chen-hui MA. Scene classification of remote sensing images based on multiscale features fusion[J]. Optics and Precision Engineering, 2018, 26(12): 3099-3107. DOI: 10.3788/OPE.20182612.3099.

Abstract

To address the low accuracy of remote sensing image scene classification caused by small sample sizes, a classification method based on Multiscale Features Fusion (MSFF) is proposed. First, each remote sensing image is rescaled to obtain several images of the same source at different scales. These images are then fed into a Deep Convolutional Neural Network (DCNN) for convolution. Next, the features extracted at each scale from the convolutional layers and the fully connected layers undergo dimensionality reduction, followed by encoding or average pooling. Finally, the features from all scales are encoded and fused, and a multikernel support vector machine (MKSVM) classifies the scenes. In experiments on two public remote sensing image datasets, UCM Land-Use and NWPU-RESISC45, the highest classification accuracies reach 98.91% and 99.33%, respectively. The method exploits image features at different scales and combines low-, middle-, and high-level semantic representations, which makes the fused features more discriminative. In addition, the multikernel support vector machine improves the generalization ability of the deep network learning, so the classification performance is better.
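The pooling, fusion, and multikernel steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the DCNN is replaced by hand-written toy feature maps, fusion is assumed to be concatenation of L2-normalized per-scale vectors, and the multikernel is assumed to be a fixed-weight sum of a linear and an RBF kernel. The function names, the weights, and `gamma` are all hypothetical choices made for illustration.

```python
import math

def average_pool(feature_map):
    """Global average pooling: reduce an H x W x C nested list to a length-C vector."""
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    sums = [0.0] * c
    for row in feature_map:
        for cell in row:
            for k, value in enumerate(cell):
                sums[k] += value
    return [s / (h * w) for s in sums]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def fuse(scale_features):
    """Assumed fusion rule: concatenate the L2-normalized descriptor of each scale."""
    fused = []
    for vec in scale_features:
        fused.extend(l2_normalize(vec))
    return fused

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def multi_kernel(x, y, weights=(0.5, 0.5), gamma=1.0):
    """Assumed multikernel: weighted sum of a linear and an RBF base kernel."""
    w_lin, w_rbf = weights
    return w_lin * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y, gamma)

# Toy stand-ins for DCNN feature maps of one image at two scales (H x W x C).
maps = [
    [[[1.0, 2.0], [3.0, 4.0]]],    # scale 1: 1 x 2 x 2
    [[[0.0, 1.0]], [[2.0, 3.0]]],  # scale 2: 2 x 1 x 2
]
descriptor = fuse([average_pool(m) for m in maps])
print(len(descriptor), multi_kernel(descriptor, descriptor))
```

A Gram matrix built by evaluating `multi_kernel` on every pair of fused descriptors could then be handed to any SVM solver that accepts a precomputed kernel. In a real multikernel SVM the kernel weights are usually learned or tuned rather than fixed as they are here.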
