• Journal of the China Computer Federation
• Chinese Science and Technology Core Journal
• Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (4): 677-685.

• Graphics and Images •

3D axial Transformer model for kidney tumor segmentation in CT images

ZHANG Jinlong¹, WU Min², SUN Yubao¹

  1. School of Computer Science, School of Cyber Science and Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, China
  2. Department of Medical Engineering, Chinese PLA General Hospital of Eastern Theater Command, Nanjing 210018, China

  • Received: 2023-10-30 Revised: 2024-04-01 Published: 2025-04-25 Online: 2025-04-17

Abstract: Automatic segmentation of the kidneys and kidney tumors in CT image sequences can provide a quantitative reference for radiotherapy and chemotherapy planning. Transformer-based kidney tumor segmentation models have attracted wide attention, especially in combination with U-Net and its variants. However, existing Transformer-based segmentation networks typically learn features within local windows of a single slice, so they represent intra-slice spatial information and inter-slice axial information inadequately. To address this, a three-dimensional axial Transformer module is proposed. It decomposes the complex coupled correlations across the three dimensions into two alternating axial attentions, fusing axial volumetric correlation information within and between slices. Building on this module and combining multi-scale features with residual learning, a two-stage encoder-decoder network for kidney tumor segmentation, ATrans UNet (Axial Transformer UNet), is constructed. On the KiTS19 dataset, the Dice similarity coefficients for kidney and kidney tumor segmentation are 96.43% and 81.04%, respectively, and the average Dice score improves by 8.40% over 2D-Unet and by 4.84% over 3D-Unet.
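The Dice similarity coefficient used to report these results can be illustrated with a minimal sketch. The function name `dice`, the flat binary-mask input, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the paper. The last lines additionally assume that the average Dice is the unweighted mean of the kidney and tumor scores, which would put it at roughly 88.7%.

```python
def dice(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    pred, truth = list(pred), list(truth)
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Per-class scores as reported in the abstract (percent).
KIDNEY_DICE = 96.43
TUMOR_DICE = 81.04

# Assumption: "average Dice" means the unweighted mean of the two classes.
MEAN_DICE = (KIDNEY_DICE + TUMOR_DICE) / 2

if __name__ == "__main__":
    predicted = [0, 1, 1, 1, 0, 0]
    reference = [0, 0, 1, 1, 1, 0]
    print(f"toy Dice: {dice(predicted, reference):.4f}")
    print(f"mean of reported scores: {MEAN_DICE:.2f}%")
```

In practice the masks would be flattened 3D label volumes, with one Dice score computed per class (kidney, tumor).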

Key words: CT image sequences; 3D segmentation of kidney tumors; 3D axial Transformer; two-stage encoder-decoder network
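The idea of replacing one fully coupled 3D attention with two alternating axial attentions can be sketched in NumPy as below. This is only an illustration under stated assumptions, not the authors' module: the abstract does not specify the exact axes, head count, or layer layout, so the sketch assumes a channel-last volume of shape (D, H, W, C), a single attention head, an intra-slice pass followed by an inter-slice pass, and a residual addition after each pass. The names `attend`, `axial_block`, `make_params`, and `attention_pairs` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(tokens, wq, wk, wv):
    """Scaled dot-product self-attention over axis -2 of tokens (..., L, C)."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])  # (..., L, L)
    return softmax(scores) @ v                                   # (..., L, C)


def axial_block(x, intra_params, inter_params):
    """Assumed layout: intra-slice attention, then inter-slice attention."""
    d, h, w, c = x.shape
    # Intra-slice: the H*W positions of each slice attend to one another.
    intra = attend(x.reshape(d, h * w, c), *intra_params).reshape(d, h, w, c)
    x = x + intra  # residual connection (assumption)
    # Inter-slice: each (h, w) position attends along the depth axis.
    seq = np.moveaxis(x, 0, -2)                        # (H, W, D, C)
    inter = np.moveaxis(attend(seq, *inter_params), -2, 0)
    return x + inter  # residual connection (assumption)


def make_params(rng, c):
    """Random query/key/value projections standing in for learned weights."""
    return tuple(rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))


def attention_pairs(d, h, w):
    """Query-key pairs scored: full 3D attention vs. the two-pass sketch."""
    full = (d * h * w) ** 2
    factorized = d * (h * w) ** 2 + (h * w) * d ** 2
    return full, factorized


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.standard_normal((4, 8, 8, 16))  # toy (D, H, W, C) features
    out = axial_block(volume, make_params(rng, 16), make_params(rng, 16))
    print(out.shape, attention_pairs(4, 8, 8))
```

The `attention_pairs` helper shows why such a decomposition is attractive: the number of scored query-key pairs grows with the square of the slice size and the square of the depth separately, rather than with the square of the whole volume.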