Cross-scale and cross-dimensional adaptive transformer network for colorectal polyp segmentation

LIANG Liming; HE Anjun; LI Renjie; WU Jian

doi:10.37188/OPE.20233118.2700

Information Science | Updated: 2023-09-25

- Optics and Precision Engineering, 2023, Vol. 31, No. 18, pp. 2700-2712
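- Author affiliation:
  School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
- About the authors:
  LIANG Liming (1967-), male, born in Ji'an, Jiangxi; professor, M.S.; recognized as a young and middle-aged backbone teacher of Jiangxi provincial universities. His research focuses on machine learning and medical imaging. E-mail: 9119890012@jxust.edu.cn
  WU Jian (1991-), male, born in Ganzhou, Jiangxi; M.S., lecturer. His research focuses on medical imaging and machine learning. E-mail: wujian@jxust.edu.cn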
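- Funding:
  National Natural Science Foundation of China (Nos. 51365017, 61463018); General Program of the Natural Science Foundation of Jiangxi Province (No. 20192BAB205084); Key Science and Technology Research Project of the Jiangxi Provincial Department of Education (Nos. GJJ170491, GJJ2200848)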
- Chinese Library Classification (CLC) number: TP391.4
- Received: 2023-03-15; Revised: 2023-04-15; Published in print: 2023-09-25

Cite this article: LIANG Liming, HE Anjun, LI Renjie, et al. Cross-scale and cross-dimensional adaptive transformer network for colorectal polyp segmentation[J]. Optics and Precision Engineering, 2023, 31(18): 2700-2712. DOI: 10.37188/OPE.20233118.2700.

Abstract

Colorectal polyp images exhibit large variations in lesion scale, blurred boundaries, irregular shapes, and low contrast with normal tissue, which lead to the loss of edge detail and the mis-segmentation of lesion regions. To address these problems, a cross-scale and cross-dimensional adaptive transformer segmentation network is proposed. First, a transformer encoder models the global contextual information of the input image and analyzes the polyp lesion regions at multiple scales. Second, a channel attention bridge and a spatial attention bridge reduce redundancy in the channel dimension and strengthen the model's spatial perception, suppressing background noise. Third, a multi-scale dense parallel decoding module fills the semantic gaps between the cross-scale features of different layers and effectively aggregates multi-scale contextual features. Fourth, a multi-scale prediction module oriented to edge details guides the network, in a learnable manner, to correct misclassified boundary predictions. In experiments on the CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, and ETIS datasets, the Dice similarity coefficients are 0.942, 0.932, 0.811, and 0.805, and the mean intersection-over-union (mIoU) values are 0.896, 0.883, 0.731, and 0.729, respectively; this segmentation performance is better than that of existing methods. The simulation experiments show that the proposed method effectively reduces the mis-segmentation of colorectal polyp lesion regions and achieves high segmentation accuracy, providing a new approach for colorectal polyp diagnosis.
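The two metrics reported in the abstract can be illustrated with a short sketch. For a binary prediction P and ground truth G, the Dice similarity coefficient is 2|P∩G| / (|P| + |G|) and the intersection-over-union is |P∩G| / |P∪G|. The abstract does not describe the evaluation protocol, so everything below is an assumption made for illustration: the function names `dice_and_iou` and `mean_dice_and_iou`, the `eps` smoothing term, the representation of masks as nested lists of 0/1 values, and the choice to average scores per image.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical mask type: rows of 0/1 values, where 1 marks a polyp pixel.
Mask = Sequence[Sequence[int]]


def dice_and_iou(pred: Mask, target: Mask, eps: float = 1e-8) -> Tuple[float, float]:
    """Return (Dice, IoU) for one pair of binary masks of equal shape.

    eps is an assumed smoothing term so that two empty masks score 1.0
    instead of raising a division-by-zero error.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, target_row):
            p, t = int(bool(p)), int(bool(t))
            inter += p & t        # pixel is positive in both masks
            pred_sum += p         # predicted positives
            target_sum += t       # ground-truth positives
    union = pred_sum + target_sum - inter
    dice = (2 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_dice_and_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """Average Dice and IoU over (prediction, ground truth) pairs.

    Per-image averaging is an assumption; the source does not say how
    its dataset-level scores were aggregated.
    """
    scores = [dice_and_iou(p, t) for p, t in pairs]
    if not scores:
        raise ValueError("at least one mask pair is required")
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


if __name__ == "__main__":
    pred = [[0, 1, 1], [0, 1, 0]]
    target = [[0, 1, 0], [0, 1, 1]]
    dice, iou = dice_and_iou(pred, target)
    print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

In the toy example the masks share 2 positive pixels out of 3 predicted and 3 true positives, so Dice is 4/6 and IoU is 2/4. Dice is never lower than IoU on the same mask pair, which matches the pattern in the reported results, where each Dice score exceeds the corresponding mIoU.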
