
Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (8): 1459-1469.

• Graphics and Images •

Dynamic spatial Transformer and multi-level fusion algorithm for retinopathy grading

LIANG Liming, ZHONG Yi, KANG Ting, JIN Jiaxin

  1. (School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China)

  • Received: 2024-05-29 Revised: 2024-10-01 Online: 2025-08-25 Published: 2025-08-27
  • Supported by:
    National Natural Science Foundation of China (51365017, 61463018); Natural Science Foundation of Jiangxi Province (20192BAB205084); Science and Technology Research Youth Project of the Jiangxi Provincial Department of Education (GJJ2200848)

Abstract: To address misgrading and insufficient attention to lesion edge information in diabetic retinopathy images, a retinopathy grading algorithm combining a dynamic spatial Transformer with multi-level fusion is proposed. First, retinal images are passed through the PVT v2 backbone network for initial extraction of lesion information. Second, a contour enhancement module is introduced in the first three layers of the network to highlight lesion edge features, improving the algorithm's ability to localize lesion pixels. Third, a dynamic spatial attention module is designed at the bottom layer of the network to link global and local spatial information, enhancing the algorithm's ability to mine deep semantic information. Finally, a multi-level gated fusion module is constructed to filter out non-diagnostic information while fusing diagnostic information across levels, further improving grading accuracy. In experiments on the IDRID and APTOS 2019 datasets, the quadratic weighted kappa (QWK) reaches 91.71% and 89.89%, respectively; the accuracy (Acc) on IDRID is 79.61%, and the area under the ROC curve (AUC) on APTOS 2019 is 93.06%. The experimental results indicate that the proposed algorithm has application value in the field of retinopathy grading.

Key words: retinopathy grading, dynamic spatial attention, contour enhancement module, multi-level gated fusion module
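
The abstract reports the quadratic weighted kappa (QWK) as its main agreement metric. For readers unfamiliar with it, the following is a minimal pure-Python sketch of the standard definition, QWK = 1 - (sum of w_ij * O_ij) / (sum of w_ij * E_ij) with weights w_ij = (i - j)^2 / (K - 1)^2. It is not the authors' evaluation code. The function name `quadratic_weighted_kappa`, the integer label encoding 0..K-1, and the choice to return 1.0 in the degenerate single-grade case are assumptions made for illustration.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1.

    Illustrative sketch only; the name and label encoding are assumptions.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(y_true)

    # Observed confusion matrix: observed[i][j] counts true grade i predicted as j.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    # Marginal histograms of true and predicted grades.
    true_hist = [sum(row) for row in observed]
    pred_hist = [
        sum(observed[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: grows with the squared distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            # Expected count under chance agreement (independent marginals).
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Degenerate case: every label and prediction is the same single grade.
        # Kappa is strictly undefined here; 1.0 is returned by assumption.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical grades on a five-level scale (0 to 4), not data from the paper.
    truth = [0, 1, 2, 3, 4, 2, 1, 0]
    preds = [0, 1, 2, 4, 4, 1, 1, 0]
    print(f"QWK = {quadratic_weighted_kappa(truth, preds, 5):.4f}")
```

Because the penalty is quadratic in the grade distance, confusing adjacent grades costs far less than confusing distant ones, which is why the metric suits ordinal severity scales such as retinopathy grades.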