Parallel path and strong attention mechanism for building segmentation in remote sensing images

YANG Jianhua; ZHANG Hao; HUA Haiyang

doi:10.37188/OPE.20233102.0234


Information Sciences | Updated: 2023-02-08

- Optics and Precision Engineering Vol. 31, Issue 2, Pages: 234-245 (2023)
- Author affiliations:
  1. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, Liaoning 110016
  2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning 110016
  3. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, Liaoning 110169
  4. University of Chinese Academy of Sciences, Beijing 100049
- CLC: P236
- Received: 01 March 2022, Revised: 20 March 2022, Published: 25 January 2023
YANG Jianhua, ZHANG Hao, HUA Haiyang. Parallel path and strong attention mechanism for building segmentation in remote sensing images[J]. Optics and Precision Engineering, 2023, 31(02): 234-245. DOI: 10.37188/OPE.20233102.0234.

Abstract

Building segmentation in remote sensing images is widely used in urban planning and military applications and is a current research focus in remote sensing. Segmentation accuracy is often low because building scales vary widely, buildings are occluded, and building shadows resemble building edges. To address these problems, a convolutional neural network with parallel paths and a strong attention mechanism is proposed. The model follows the residual-connection idea of ResNet: it uses ResNet as the backbone to increase network depth and applies convolutional downsampling to form parallel paths that extract multi-scale building features, reducing the influence of scale variation between buildings. A strong attention mechanism is then added to enhance the fusion of multi-scale information, increase the discrimination between different features, and suppress the influence of building occlusion and shadows. Finally, a spatial pyramid pooling module is added after the fused multi-scale features to suppress holes inside buildings in the segmentation result and improve segmentation accuracy. Experiments were conducted on the WHU and Massachusetts Buildings public datasets, and the segmentation results were compared quantitatively using four metrics: MIoU, recall, precision, and F1-score. On the Massachusetts Buildings dataset, the MIoU reached 72.84%, which is 1.46% higher than that of ResUNet-a, showing that the model effectively improves the segmentation accuracy of buildings in remote sensing images.
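The four metrics named in the abstract can be illustrated with a minimal sketch for binary building masks. This is not the authors' evaluation code: the function names are hypothetical, the masks are assumed to be flat lists with 1 for building and 0 for background, and MIoU is assumed here to be the mean of the building and background IoU, which the abstract does not specify.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat binary masks (1 = building, 0 = background)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred, truth):
    """Compute precision, recall, F1-score, and MIoU for one pair of masks.

    Assumption: MIoU averages the IoU of the building class and the
    background class; the paper may define it differently.
    """
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    iou_building = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    miou = (iou_building + iou_background) / 2
    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}


if __name__ == "__main__":
    # Toy 8-pixel example, not data from the paper.
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 0, 0, 0, 1, 1, 0, 0]
    for name, value in segmentation_metrics(predicted, reference).items():
        print(f"{name}: {value:.4f}")
```

In practice the counts would be accumulated over every pixel of every test image before the ratios are taken, so that large and small images contribute in proportion to their pixel counts.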
