Real-time semantic segmentation based on improved BiSeNet

REN Fenglei; YANG Lu; ZHOU Haibo; ZHANG Shiyv; HE Xin; XU Wenxue

doi:10.37188/OPE.20233108.1217

Information Sciences | Updated: 2023-05-04

- Optics and Precision Engineering Vol. 31, Issue 8, Pages: 1217-1227(2023)
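- Author affiliations:
  
  1. Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, Tianjin University of Technology, Tianjin 300384, China
  2. National Demonstration Center for Experimental Mechanical and Electrical Engineering Education, Tianjin University of Technology, Tianjin 300384, China
  3. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, Jilin 130033, China
  4. Tianjin Zhuoyue Xintong Technology Co., Ltd. (天津卓越信通科技有限公司), Tianjin 300384, China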
- DOI: 10.37188/OPE.20233108.1217
  CLC: TP394.1
- Received: 12 September 2022; Revised: 01 October 2022; Published: 25 April 2023
REN Fenglei, YANG Lu, ZHOU Haibo, et al. Real-time semantic segmentation based on improved BiSeNet[J]. Optics and Precision Engineering, 2023, 31(08): 1217-1227. DOI: 10.37188/OPE.20233108.1217.

Abstract

To improve both the accuracy and the efficiency of image semantic segmentation for practical applications, this study proposes a real-time semantic segmentation algorithm based on an improved BiSeNet. First, the heads of the two branches are shared, which eliminates redundant channels and parameters in the BiSeNet structure while effectively extracting rich shallow features. The shared layers then split into two branches, a detail branch and a semantic branch, which extract spatial detail information and contextual semantic information, respectively. In addition, channel and spatial attention mechanisms are introduced at the tail of the semantic branch to enhance feature representation, so that this dual-attention design extracts contextual semantic features more effectively. Finally, the features of the detail and semantic branches are fused and up-sampled to the resolution of the input image to produce the segmentation result. The proposed algorithm achieves 77.2% mIoU at 95.3 FPS on the Cityscapes dataset and 73.8% mIoU at 179.1 FPS on the CamVid dataset. The experiments show that the algorithm strikes a good balance between accuracy and real-time performance, and that its segmentation performance is significantly improved over BiSeNet and other existing algorithms.
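The abstract reports results as mIoU (mean intersection over union) and FPS. The following is a minimal sketch of how those two quantities are commonly computed and interpreted, not the authors' evaluation code. The function names, the toy labels, and the `ignore_index` value of 255 (a frequent convention for unlabeled pixels) are assumptions made for illustration; the hardware and input resolution behind the reported FPS are not stated on this page, so the latency figures are simply the reciprocal of the reported throughput.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Rows index the ground-truth class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward the metric
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


def frame_latency_ms(fps: float) -> float:
    """Convert a throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Toy example (hypothetical data): six flattened pixels, three classes,
# and one pixel marked with the assumed ignore label.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
cm = confusion_matrix(pred, target, num_classes=3)
print(f"mIoU = {mean_iou(cm):.3f}")
print(f"95.3 FPS  -> {frame_latency_ms(95.3):.2f} ms per frame")
print(f"179.1 FPS -> {frame_latency_ms(179.1):.2f} ms per frame")
```

Read this way, the reported 95.3 FPS and 179.1 FPS correspond to roughly 10.5 ms and 5.6 ms per frame, respectively, which is what "real-time" amounts to in these results.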
