An abnormal sound detection method based on weighted non-negative matrix decomposition

Computer Engineering & Science, 2024, Vol. 46, Issue (08): 1425-1432.

PAN Yu-qing, YU Hao, LI Feng

(School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China)

Received: 2023-04-07; Revised: 2023-12-03; Accepted: 2024-08-25; Online: 2024-08-25; Published: 2024-09-02

Abstract

Abstract: Abnormal sound detection methods are usually trained on strongly labeled data, but high-quality strongly labeled audio is difficult to annotate and costly to collect. When existing methods are trained on weakly labeled data instead, non-stationary and time-varying noise interferes with training, which leads to poor training results and low accuracy. To address this problem, a weighted non-negative matrix factorization (WNMF) method based on the audio spectrum is proposed. The method uses WNMF to label weakly labeled and unlabeled data and to separate target sound events from background noise. With appropriate weights, WNMF changes the importance assigned to audio information in different frequency bands during labeling, which suppresses noise and improves separation quality, so that the result approaches that of fully supervised training. A convolutional neural network then generates frame-level predictions and audio label predictions. Simulation experiments show that the method improves accuracy by 4.8% over the traditional NMF approach to processing weakly labeled data.

Key words: abnormal sound detection, weakly labeled and unlabeled data, weighted non-negative matrix factorization, convolutional neural networks
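
To make the frequency-weighting idea in the abstract concrete, the following is a minimal sketch of one common form of weighted NMF: multiplicative updates that minimize a weighted squared error between a magnitude spectrogram and its low-rank factorization. The abstract does not state the authors' objective, update rules, or weight values, so the per-frequency weight vector, the squared-error loss, and the names `weighted_nmf`, `weighted_error`, `freq_weights`, and `rank` are all assumptions made for illustration, and the input is synthetic data rather than real audio.

```python
import numpy as np


def weighted_nmf(V, freq_weights, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate V (freq x frames, non-negative) as B @ A.

    Assumption: one non-negative weight per frequency bin, applied to the
    squared reconstruction error. This is a generic weighted NMF, not
    necessarily the scheme used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Column vector so the weights broadcast across all time frames.
    omega = np.asarray(freq_weights, dtype=float).reshape(n_freq, 1)
    B = rng.random((n_freq, rank)) + eps    # spectral basis vectors
    A = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        # Multiplicative updates keep B and A non-negative.
        BA = B @ A
        A *= (B.T @ (omega * V)) / (B.T @ (omega * BA) + eps)
        BA = B @ A
        B *= ((omega * V) @ A.T) / ((omega * BA) @ A.T + eps)
    return B, A


def weighted_error(V, B, A, freq_weights):
    """Weighted squared reconstruction error that the updates reduce."""
    omega = np.asarray(freq_weights, dtype=float).reshape(-1, 1)
    return float(np.sum(omega * (V - B @ A) ** 2))


# Synthetic stand-in for a magnitude spectrogram (16 bins x 40 frames).
demo_V = np.random.default_rng(1).random((16, 40))
# Hypothetical weights: emphasize low bins, de-emphasize high bins.
demo_weights = np.linspace(1.0, 0.1, 16)
demo_B, demo_A = weighted_nmf(demo_V, demo_weights, rank=4)
print(weighted_error(demo_V, demo_B, demo_A, demo_weights))
```

In a sketch like this, bins with small weights contribute little to the loss, so the factorization spends its limited rank on the bands that are weighted heavily. That is the sense in which reweighting frequency bands can suppress noise concentrated in particular bands.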

PAN Yu-qing, YU Hao, LI Feng. An abnormal sound detection method based on weighted non-negative matrix decomposition[J]. Computer Engineering & Science, 2024, 46(08): 1425-1432.
