To address inaccurate text region localization caused by diverse text forms and complex backgrounds in natural scenes, this paper proposes a text detection algorithm based on feature enhancement and adaptive multi-scale feature fusion. First, the residual network is improved to reduce the loss of semantic information. Second, coordinate attention is embedded into the extracted features to suppress redundant background information and strengthen attention to text regions, thereby improving the localization of text boundaries. Third, an adaptive multi-scale feature fusion module integrates learned spatial location weights into feature maps at different scales, enabling more comprehensive fusion of multi-scale feature information. Finally, a differentiable binarization algorithm generates the text detection results. To verify the effectiveness of the algorithm, experiments were conducted on the publicly available datasets ICDAR2015, MSRA-TD500, and Total-Text, yielding F1-scores of 88.1%, 87.7%, and 86.3%, respectively. The experimental results demonstrate that the algorithm exhibits good robustness and generalization in text detection.
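As a rough illustration of the last two stages, the sketch below fuses same-sized feature maps with per-location softmax weights and then applies the standard differentiable binarization function B = 1 / (1 + exp(-k(P - T))). It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function names `adaptive_fuse` and `differentiable_binarization` are hypothetical, the softmax weighting is only one plausible reading of "learned spatial location weights", the maps are assumed to be already resized to a common resolution, and k = 50 is the value commonly used for differentiable binarization rather than a setting stated in the abstract.

```python
import math


def softmax(values):
    """Numerically stable softmax over a short list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(feature_maps, weight_logits):
    """Fuse S same-sized H x W maps using per-location softmax weights.

    feature_maps  : list of S maps, each a list of H rows of W floats.
    weight_logits : same shape; assumed to come from a learned layer.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Weights over scales sum to 1 at every spatial location.
            weights = softmax([logits[i][j] for logits in weight_logits])
            fused[i][j] = sum(
                w * fmap[i][j] for w, fmap in zip(weights, feature_maps)
            )
    return fused


def differentiable_binarization(prob, thresh, k=50.0):
    """Soft step function 1 / (1 + exp(-k * (P - T))), applied per pixel.

    k = 50 is an assumed amplification factor, not taken from this paper.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob, thresh)
    ]


if __name__ == "__main__":
    # Two 1 x 2 maps with equal logits: fusion reduces to a plain average.
    maps = [[[0.2, 0.8]], [[0.6, 0.4]]]
    logits = [[[0.0, 0.0]], [[0.0, 0.0]]]
    fused = adaptive_fuse(maps, logits)
    binary = differentiable_binarization(fused, [[0.5, 0.5]])
    print(fused, binary)
```

Because the binarization step is a steep sigmoid rather than a hard threshold, gradients can flow through it during training, which is the reason the threshold map can be learned jointly with the probability map.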
LI Qiong, QI Changshi, XIE Kai. Scene text detection based on feature enhancement and adaptively multi-scale feature fusion[J]. Computer Engineering & Science, 2026, 48(4): 689-698.
DOI: 10.3969/j.issn.1007-130X.2026.04.013