• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (12): 2227-2238.

• Graphics and Images •

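  • Funding:
    National Natural Science Foundation of China (62462001); Natural Science Foundation of Ningxia (2023AAC03264); Fundamental Research Funds for the Central Universities, North Minzu University (2023ZRLG02); Scientific Research Project of Ningxia Higher Education Institutions (NYG2024066)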

A multi-layer mask recognition method for Tangut characters

MA Jin-lin1,2,YAN Qi1,MA Zi-ping3   

  (1. College of Computer Science and Engineering, North Minzu University, Yinchuan 750021;
    2. The Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, Yinchuan 750021;
    3. School of Mathematics and Information Science, North Minzu University, Yinchuan 750021, China)
  • Received: 2023-11-21  Revised: 2024-03-19  Accepted: 2024-12-25  Online: 2024-12-25  Published: 2024-12-23

Abstract: Existing methods recognize blurred and incomplete Tangut characters poorly. To address this, a Tangut character recognition model, MMSFTR, is proposed. First, a multi-layer mask learning strategy extracts key character features hierarchically, which helps the model understand the internal structure of Tangut characters and improves its ability to describe the features of complex characters. Second, a multi-scale feature fusion module is designed to extract richer multi-scale features. Third, a channel adaptive attention module is proposed to better select and focus on the information in specific channels, and a mask attention module is designed to improve the model's perception ability. Finally, a feature enhancement module is designed to optimize features at multiple levels of the network and to enhance deep features. Working together, these four modules enable MMSFTR to achieve the intended performance. Experimental results show that MMSFTR achieves a recognition accuracy of 99.40% on the TCD-E dataset, effectively improving the recognition of blurred and incomplete Tangut characters.

Keywords: Tangut character recognition, multi-scale feature fusion, mask learning, inverted residual block
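
The abstract does not say how the multi-layer masks are constructed. As a purely illustrative sketch, one plausible reading is that the same glyph image is masked at several granularities, from coarse blocks to fine ones, so that a model must recover structure at each level. Everything below is an assumption and not the authors' method: the function names, the patch sizes (8, 4, 2), the 25% mask ratio, and the 16 x 16 all-ones stand-in for a glyph are hypothetical.

```python
import random

def patch_mask(height, width, patch, ratio, rng):
    """Build a 0/1 grid in which about `ratio` of the patch x patch blocks are hidden (0)."""
    rows = (height + patch - 1) // patch
    cols = (width + patch - 1) // patch
    blocks = [(r, c) for r in range(rows) for c in range(cols)]
    hidden = set(rng.sample(blocks, round(ratio * len(blocks))))
    return [[0 if (y // patch, x // patch) in hidden else 1
             for x in range(width)]
            for y in range(height)]

def apply_mask(image, mask):
    """Zero out every pixel whose mask entry is 0."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]

def multi_level_masks(image, patches=(8, 4, 2), ratio=0.25, seed=0):
    """Return one masked copy of `image` per patch size, coarse to fine (hypothetical scheme)."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    return [apply_mask(image, patch_mask(height, width, p, ratio, rng))
            for p in patches]

# Hypothetical stand-in for a binarized 16 x 16 glyph image.
glyph = [[1] * 16 for _ in range(16)]
views = multi_level_masks(glyph)
```

Each element of `views` is a copy of the input with a quarter of its area hidden, but at a different block size, so coarse views remove whole components while fine views remove stroke fragments. How MMSFTR actually combines such levels is described in the full paper, not in this abstract.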