A multi-layer mask recognition method for Tangut characters

Computer Engineering & Science, 2024, Vol. 46, Issue 12: 2227-2238.

MA Jin-lin (1, 2), YAN Qi (1), MA Zi-ping (3)

1. College of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021, China
2. The Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, Yinchuan, Ningxia 750021, China
3. School of Mathematics and Information Science, North Minzu University, Yinchuan, Ningxia 750021, China

Received: 2023-11-21; Revised: 2024-03-19; Published: 2024-12-25; Released online: 2024-12-23

Funding: National Natural Science Foundation of China (62462001); Natural Science Foundation of Ningxia (2023AAC03264); Fundamental Research Funds for the Central Universities, North Minzu University (2023ZRLG02); Scientific Research Project of Ningxia Higher Education Institutions (NYG2024066)

Abstract: Existing methods recognize blurred and incomplete Tangut characters poorly. To address this, a Tangut character recognition model, MMSFTR, is proposed. First, a multi-layer mask learning strategy extracts key character features hierarchically, which helps the model understand the internal structure of Tangut characters and improves its ability to describe the features of complex characters. Second, a multi-scale feature fusion module is designed to extract richer multi-scale features. Third, a channel adaptive attention module is proposed to better select and focus on information from specific channels, and a mask attention module is designed to improve the model's perception capability. Finally, a feature enhancement module optimizes features at multiple levels of the network and enhances deep features. With these four modules working together, MMSFTR achieves a recognition accuracy of 99.40% on the TCD-E dataset, effectively improving the recognition of blurred and incomplete Tangut characters.

Keywords: Tangut character recognition, multi-scale feature fusion, mask learning, inverted residual block
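The abstract does not say how the channel adaptive attention module is built. As a rough illustration of the general idea of channel attention (selecting and focusing on information from specific channels), the following is a minimal sketch of the widely used squeeze-and-excitation pattern in plain Python. It is not MMSFTR's module: the function name `channel_attention`, the reduction ratio, the tensor sizes, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random


def channel_attention(features, w1, w2):
    """Reweight the channels of a C x H x W feature map (nested lists).

    Generic squeeze-and-excitation sketch; NOT the paper's module.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Excite: a two-layer bottleneck (ReLU, then sigmoid) maps the channel
    # descriptor to one weight in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [[[v * a for v in row] for row in ch] for ch, a in zip(features, weights)]
    return scaled, weights


random.seed(0)
# Channels, height, width, reduction ratio: all illustrative assumptions.
C, H, W, R = 8, 4, 4, 2
features = [[[random.gauss(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
# Random, untrained projection weights (hypothetical): C -> C/R -> C.
w1 = [[random.gauss(0, 0.1) for _ in range(C)] for _ in range(C // R)]
w2 = [[random.gauss(0, 0.1) for _ in range(C // R)] for _ in range(C)]

scaled, weights = channel_attention(features, w1, w2)
print(len(scaled), len(scaled[0]), len(scaled[0][0]))
print([round(a, 3) for a in weights])
```

In a trained network the two projections would be learned, so that channels carrying useful stroke or structure information receive weights near 1 and uninformative channels are suppressed. With the random weights used here, the output only demonstrates the data flow and shapes.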

MA Jin-lin, YAN Qi, MA Zi-ping. A multi-layer mask recognition method for Tangut characters[J]. Computer Engineering & Science, 2024, 46(12): 2227-2238.
