• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (06): 1088-1094.

• Artificial Intelligence and Data Mining •

Speech separation of cooperative training generative adversarial networks under low SNR

WANG Tao, QUAN Hai-yan

  1. (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

  • Received: 2019-12-02 Revised: 2020-06-16 Accepted: 2021-06-25 Online: 2021-06-25 Published: 2021-06-22
  • Supported by:
    National Natural Science Foundation of China (41364002)

Abstract: Improving the quality of separated speech at low signal-to-noise ratio (SNR) is a central problem in speech separation research, yet most speech separation methods still train only on features of the target speaker's speech, even at low SNR. To address this shortcoming, a mixed-speech separation method based on a cooperatively trained generative adversarial network (GAN) is proposed. To avoid complex acoustic feature extraction, the generative model uses a fully convolutional neural network to extract high-dimensional features directly from the time-domain waveform of the mixed speech. The discriminative model, built as a binary-classification convolutional neural network, learns the features of the interfering speaker, so that the separation information available to the system no longer comes from a single source. Experimental results show that the proposed method better recovers high-frequency components at low SNR and that its separation performance on a two-speaker mixed-speech dataset exceeds that of the compared methods.

Key words: low SNR, generative adversarial networks, cooperative training, speech separation
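
As an illustration of what a "low SNR" two-speaker mixture means, the sketch below scales an interfering signal so that the target-to-interference power ratio matches a chosen SNR in decibels, using the standard definition SNR = 10·log10(P_target / P_interference). The abstract does not describe how the mixtures were constructed, so this is an assumption, not the authors' procedure. The function names (`power`, `mix_at_snr`, `measure_snr_db`), the synthetic sine tones standing in for speech, and the -5 dB setting are all hypothetical.

```python
import math


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def measure_snr_db(target, interference):
    """Target-to-interference power ratio in decibels."""
    return 10 * math.log10(power(target) / power(interference))


def mix_at_snr(target, interference, snr_db):
    """Scale `interference` so the target-to-interference ratio equals
    `snr_db`, then add it to `target` sample by sample.

    Returns the mixture and the scaled interference.
    """
    if len(target) != len(interference):
        raise ValueError("signals must have the same length")
    p_target = power(target)
    p_interf = power(interference)
    if p_target == 0 or p_interf == 0:
        raise ValueError("signals must have non-zero power")
    # Solve p_target / (gain^2 * p_interf) = 10^(snr_db / 10) for gain.
    gain = math.sqrt(p_target / (p_interf * 10 ** (snr_db / 10)))
    scaled = [gain * x for x in interference]
    mixture = [t + s for t, s in zip(target, scaled)]
    return mixture, scaled


# Hypothetical stand-ins for two speakers: one second of two sine tones
# sampled at 8 kHz (real experiments would use recorded speech).
SAMPLE_RATE = 8000
target = [math.sin(2 * math.pi * 220 * n / SAMPLE_RATE)
          for n in range(SAMPLE_RATE)]
interference = [0.3 * math.sin(2 * math.pi * 330 * n / SAMPLE_RATE)
                for n in range(SAMPLE_RATE)]

# At -5 dB the interfering signal carries more power than the target.
mixture, scaled_interference = mix_at_snr(target, interference, snr_db=-5.0)
achieved = measure_snr_db(target, scaled_interference)
```

At a negative SNR the interfering speaker is louder than the target, which is the regime in which learning only the target speaker's features is described above as insufficient.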