Computer Engineering & Science ›› 2026, Vol. 48 ›› Issue (1): 108-118.
• Graphics and Images •
LI Yan,FAN Xinyu,CHEN Qin
Abstract: In recent years, Transformers have made remarkable progress in image recognition, yet they still struggle with pixel-level segmentation tasks, primarily because they do not handle local deviations explicitly or effectively enough. To address this issue, this paper proposes a multi-path and multi-scale attention network named DMANet. By combining the strengths of convolutional neural networks (CNNs) and Transformers in the encoding phase, the network captures both fine-grained local information and broad global context from images, which strengthens feature extraction. The proposed interactive dual-branch structure improves feature integration and thereby the model's performance on dense prediction tasks. In the decoding phase, cross-layer feature fusion improves DMANet's ability to recognize complex objects. Experiments on the Potsdam, GID-15, and L8 SPARCS datasets demonstrate DMANet's strong performance and broad applicability in complex land cover segmentation tasks.
Key words: Transformer structure, semantic segmentation, multi-path and multi-scale, convolutional neural network, land cover
LI Yan, FAN Xinyu, CHEN Qin. A multi-path and multi-scale attention network for land cover segmentation[J]. Computer Engineering & Science, 2026, 48(1): 108-118.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2026/V48/I1/108