• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学) ›› 2023, Vol. 45 ›› Issue (12): 2155-2164.

• Software Engineering •

Research on robust speech recognition technology based on domain knowledge

WANG Fei-fei (王斐斐), BEN Ke-rong (贲可荣), ZHANG Xian (张献)

  1. (College of Electronic Engineering, Naval University of Engineering, Wuhan 430032, Hubei, China)
  • Received: 2022-08-10 Revised: 2022-10-08 Accepted: 2023-12-25 Online: 2023-12-25 Published: 2023-12-14

Abstract: Speech recognition software loses accuracy under noise interference. To keep voice-controlled operation safe, a method based on domain knowledge is proposed for enhancing the robustness of speech recognition. Taking ship control as the application background, a knowledge graph of the ship control domain is established. Ship control commands are extracted from nautical books and classic naval warfare film and television materials, and a Chinese speech dataset of ship control commands is constructed. A decoding method that embeds domain knowledge is proposed, which corrects the output control commands by calculating the matching degree between the recognition result and the domain knowledge graph. Experimental results show that, compared with the currently popular connectionist temporal classification (CTC) decoding method and the attention-based decoding method, the proposed decoding method reduces the character error rate by 4.0% and 1.5% when recognizing noisy speech with signal-to-noise ratios of 10 dB and 20 dB, respectively, and improves command recognition accuracy by 10.3% and 6.3%, respectively, which improves the robustness of the speech recognition model in recognizing Chinese commands.

Key words: speech recognition, knowledge graph, ship control, robustness
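
The abstract describes correcting a recognized command by its matching degree against the domain knowledge graph, but does not say how that degree is computed. The Python sketch below is one plausible reading, not the authors' implementation. It assumes the knowledge graph has been flattened into a plain list of valid command strings, uses a length-normalized character edit distance as a stand-in for the matching degree, and applies an arbitrary acceptance threshold. The names `matching_degree`, `correct_command`, `COMMANDS`, and the sample commands are all hypothetical.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, character by character."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def matching_degree(hypothesis: str, command: str) -> float:
    """Assumed similarity in [0, 1]: 1 minus the normalized edit distance."""
    longest = max(len(hypothesis), len(command))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(hypothesis, command) / longest


def correct_command(hypothesis: str,
                    commands: Iterable[str],
                    threshold: float = 0.6) -> Tuple[str, float]:
    """Return the best-matching known command if it clears the threshold,
    otherwise return the recognizer output unchanged."""
    best, best_score = hypothesis, 0.0
    for command in commands:
        score = matching_degree(hypothesis, command)
        if score > best_score:
            best, best_score = command, score
    if best_score >= threshold:
        return best, best_score
    return hypothesis, best_score


# Hypothetical command list standing in for the knowledge graph:
# "left rudder fifteen", "right rudder fifteen", "rudder amidships", "stop engines".
COMMANDS = ["左舵十五", "右舵十五", "正舵", "停车"]

if __name__ == "__main__":
    # One character misrecognized under noise: 五 heard as 无.
    print(correct_command("左舵十无", COMMANDS))
    # Unrelated speech falls below the threshold and is left as-is.
    print(correct_command("今天天气很好", COMMANDS))
```

Leaving low-scoring hypotheses unchanged is a deliberate choice in this sketch: for a control setting, forcing every utterance onto the nearest command would risk executing an instruction nobody gave.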