• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (08): 1425-1432.

• Graphics and Images •

An abnormal sound detection method based on weighted non-negative matrix decomposition

PAN Yu-qing, YU Hao, LI Feng

  1. (School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China)
  • Received: 2023-04-07 Revised: 2023-12-03 Accepted: 2024-08-25 Online: 2024-08-25 Published: 2024-09-02

Abstract: Existing abnormal sound detection methods mostly rely on strongly labeled data for training, but high-quality strongly labeled audio is difficult to annotate and costly to collect. When current methods are instead trained on weakly labeled data, non-stationary and time-varying noise interferes with training, leading to poor results and low accuracy. To address this problem, a weighted non-negative matrix factorization (WNMF) method based on the audio spectrum is proposed. The method uses WNMF to label weakly labeled and unlabeled data and to separate target sound events from background noise. With appropriate weights, WNMF changes the relative importance of audio information in different frequency bands during labeling, which suppresses noise and improves separation quality so that training approaches the effect of fully supervised training. A convolutional neural network then generates frame-level predictions and audio label predictions. Simulation experiments show that the method improves accuracy by 4.8% over the traditional NMF approach to processing weakly labeled data.

Keywords: abnormal sound detection, weakly labeled and unlabeled data, weighted non-negative matrix factorization, convolutional neural networks
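
The abstract does not give the WNMF objective or the weights used, so the following is only a minimal sketch of one common weighted NMF formulation, not the authors' implementation. It assumes a non-negative magnitude spectrogram V (frequency bins by frames), a weighted squared-error objective sum(omega * (V - WH)^2), and the standard multiplicative updates for that objective. The function name `weighted_nmf`, the per-band weights `freq_weights`, the rank, and the synthetic input are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_nmf(V, omega, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~= W @ H by minimizing sum(omega * (V - W @ H) ** 2).

    V     : non-negative array, shape (n_freq, n_frames)
    omega : non-negative weights, broadcastable to V's shape
    rank  : number of spectral basis vectors (assumed, not from the paper)
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral bases
    H = rng.random((rank, n_frames)) + eps  # frame-wise activations
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        WH = W @ H
        H *= (W.T @ (omega * V)) / (W.T @ (omega * WH) + eps)
        WH = W @ H
        W *= ((omega * V) @ H.T) / ((omega * WH) @ H.T + eps)
        losses.append(float(np.sum(omega * (V - W @ H) ** 2)))
    return W, H, losses


# Synthetic stand-in for a magnitude spectrogram (64 bins, 100 frames).
demo_rng = np.random.default_rng(1)
V = demo_rng.random((64, 100))

# Hypothetical per-band weights: emphasize low bins, de-emphasize high bins.
freq_weights = np.linspace(1.0, 0.2, 64)
omega = freq_weights[:, None]  # broadcast across frames

W, H, losses = weighted_nmf(V, omega, rank=8)
```

In this sketch the weights only rescale how much each frequency band contributes to the reconstruction error; bands with small weights influence the learned bases less, which is one way to read the abstract's remark about changing the importance of different frequency bands. How the paper derives its weights, and how the factors are turned into labels for the convolutional network, is not stated in the abstract.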