Semantic segmentation based on DeepLabV3+ and superpixel optimization

REN Feng-lei; HE Xin; WEI Zhong-hui; LÜ You; LI Mu-yu

doi:10.3788/OPE.20192712.2722

Information Science | Updated: 2020-07-09

- Optics and Precision Engineering, 2019, Vol. 27, No. 12, pp. 2722-2729
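- Author affiliations:
  
  1. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, Jilin, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China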
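- About the authors:
  
  REN Feng-lei (1991-), male, from Cangzhou, Hebei; doctoral candidate. He received his bachelor's degree from Jilin University in 2015. His research focuses on digital image processing and autonomous driving. E-mail: renfenglei15@mails.ucas.edu.cn
  HE Xin (1966-), male, from Changchun, Jilin; research professor and doctoral supervisor. He received his bachelor's degree from Harbin Institute of Technology in 1988 and his master's degree from the Changchun Institute of Optics, Fine Mechanics and Physics in 1991. His research focuses on image processing and photoelectric measurement. E-mail: hexin6627@sohu.com
  LÜ You (1988-), male, from Songyuan, Jilin; assistant research fellow. He received his bachelor's degree from Jilin University in 2011 and his doctoral degree from the Changchun Institute of Optics, Fine Mechanics and Physics in 2016. His research focuses on target characteristic measurement and autonomous navigation. E-mail: lvyou8863@163.com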
- Funding:
  
  Supported by the Science and Technology Development Program of Jilin Province (No. 20180201013GX)
- DOI: 10.3788/OPE.20192712.2722
  Chinese Library Classification (CLC) number: TP394.1
- Received: 2019-06-24; accepted: 2019-08-17; published in print: 2019-12-25

Feng-lei REN, Xin HE, Zhong-hui WEI, et al. Semantic segmentation based on DeepLabV3+ and superpixel optimization[J]. Optics and Precision Engineering, 2019, 27(12): 2722-2729. DOI: 10.3788/OPE.20192712.2722.

Abstract

The deep-learning-based DeepLabV3+ semantic segmentation algorithm loses considerable detail information during the encoder feature-extraction stage, which leads to poor segmentation along object edges. To address this problem, this study proposes a semantic segmentation algorithm based on DeepLabV3+ and superpixel optimization. First, a DeepLabV3+ model is used to extract semantic features from the image and obtain a coarse semantic segmentation result. Then, the simple linear iterative clustering (SLIC) superpixel algorithm is used to segment the input image into superpixels. Finally, the high-level abstract semantic features and the detail information of the superpixels are fused to obtain an edge-optimized semantic segmentation result. Experiments on the PASCAL VOC 2012 dataset show that, compared with DeepLabV3+, the proposed algorithm segments details such as object edges more accurately and reaches an mIoU of 83.8%, a significant improvement that brings it to a state-of-the-art level.
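
The abstract does not say how the coarse prediction and the superpixels are fused, so the following is only a minimal sketch under an assumed rule: each superpixel takes the majority class of the coarse labels it covers. The function names `refine_with_superpixels` and `mean_iou` and the 4x4 toy data are hypothetical and are not taken from the paper. In a real pipeline the superpixel map would come from a SLIC implementation (for example `skimage.segmentation.slic`) and the coarse labels from a trained DeepLabV3+ model. Plain Python lists are used here only to keep the example dependency-free.

```python
from collections import Counter, defaultdict


def refine_with_superpixels(coarse, superpixels):
    """Relabel every pixel with the majority class of its superpixel.

    ASSUMPTION: majority voting stands in for the paper's fusion step,
    which the abstract does not specify.
    """
    votes = defaultdict(Counter)
    for class_row, sp_row in zip(coarse, superpixels):
        for cls, sp in zip(class_row, sp_row):
            votes[sp][cls] += 1
    # most_common(1) breaks ties in favor of the class counted first.
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}
    return [[winner[sp] for sp in sp_row] for sp_row in superpixels]


def mean_iou(pred, truth, num_classes):
    """Mean intersection-over-union over the classes that actually occur."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


# Toy ground truth: class 0 on the left half, class 1 on the right half.
truth = [[0, 0, 1, 1] for _ in range(4)]

# Hypothetical superpixel map whose boundary follows the true edge.
superpixels = [[0, 0, 1, 1] for _ in range(4)]

# Hypothetical coarse network output with a ragged edge.
coarse = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

refined = refine_with_superpixels(coarse, superpixels)
print(round(mean_iou(coarse, truth, 2), 3))   # 0.778
print(round(mean_iou(refined, truth, 2), 3))  # 1.0
```

In this toy case the two mislabeled edge pixels are outvoted inside their superpixels, so the refined map matches the ground truth. Majority voting can only help when superpixel boundaries follow object boundaries, which is the property SLIC is meant to provide.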
