Voice conversion using STRAIGHT model and deep belief networks

Computer Engineering & Science


WANG Min, SU Li-bo, WANG Zhi-hui, YAO Chen-hong

(School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China)

Received: 2015-05-25 Revised: 2015-10-20 Online: 2016-09-25 Published: 2016-09-25
Funding:
Science and Technology Project Plan of the Ministry of Housing and Urban-Rural Development (2016-R2-045); 2014 Science and Technology Plan Project of Beilin District, Xi'an (GX1412)

Abstract:

We propose a voice conversion method that combines the STRAIGHT model with deep belief networks (DBNs). First, the STRAIGHT model extracts the spectral parameters of the source and target speakers' speech, and these parameters are used to train two DBNs, one per speaker, which yield speaker-specific feature information in a high-order space. Second, an artificial neural network (ANN) connects the two high-order feature spaces and performs the feature conversion. Finally, the DBN trained on the target speaker's data applies the inverse processing to the converted features to recover spectral parameters, and the STRAIGHT model synthesizes speech that carries the target speaker's individual characteristics. Experimental results show that the proposed method outperforms the traditional GMM-based voice conversion method: the converted speech is closer to the target speech in both quality and similarity.

Key words: voice conversion; STRAIGHT model; deep belief networks; high-order space
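The three stages described in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: STRAIGHT analysis and synthesis are not reproduced, so random arrays stand in for spectral frames normalised to [0, 1]; source and target frames are assumed to be time-aligned already (the abstract does not say how); each DBN is a small stack of Bernoulli RBMs trained with one-step contrastive divergence; and the ANN is a one-hidden-layer network. All names, layer sizes and learning rates (`RBM`, `DBN`, `train_mapper`, `convert`, and so on) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""
    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def encode(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def fit(self, data, epochs=50, lr=0.1):
        n = len(data)
        for _ in range(epochs):
            h0 = self.encode(data)                        # positive phase
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            v1 = self.decode(h_sample)                    # reconstruction
            h1 = self.encode(v1)                          # negative phase
            self.W += lr * (data.T @ h0 - v1.T @ h1) / n
            self.b_v += lr * (data - v1).mean(axis=0)
            self.b_h += lr * (h0 - h1).mean(axis=0)
        return self

class DBN:
    """Stack of RBMs, pre-trained greedily layer by layer."""
    def __init__(self, sizes, rng):
        self.rbms = [RBM(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

    def fit(self, data, **kw):
        x = data
        for rbm in self.rbms:
            x = rbm.fit(x, **kw).encode(x)
        return self

    def encode(self, x):            # spectral frames -> high-order features
        for rbm in self.rbms:
            x = rbm.encode(x)
        return x

    def decode(self, h):            # high-order features -> spectral frames
        for rbm in reversed(self.rbms):
            h = rbm.decode(h)
        return h

def train_mapper(h_src, h_tgt, n_hidden, rng, epochs=500, lr=0.5):
    """One-hidden-layer network mapping source features to target features."""
    n, d_in = h_src.shape
    d_out = h_tgt.shape[1]
    W1 = 0.1 * rng.standard_normal((d_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, d_out)); b2 = np.zeros(d_out)
    for _ in range(epochs):
        z = np.tanh(h_src @ W1 + b1)
        y = sigmoid(z @ W2 + b2)
        d_y = (y - h_tgt) * y * (1 - y) / n   # gradient of the halved mean squared error
        d_z = (d_y @ W2.T) * (1 - z ** 2)
        W2 -= lr * z.T @ d_y; b2 -= lr * d_y.sum(axis=0)
        W1 -= lr * h_src.T @ d_z; b1 -= lr * d_z.sum(axis=0)
    return lambda h: sigmoid(np.tanh(h @ W1 + b1) @ W2 + b2)

def convert(src_frames, dbn_src, mapper, dbn_tgt):
    """Encode with the source DBN, map, then decode with the target DBN."""
    return dbn_tgt.decode(mapper(dbn_src.encode(src_frames)))

rng = np.random.default_rng(0)
n_frames, n_bins = 200, 24
src = rng.random((n_frames, n_bins))              # synthetic stand-in, NOT STRAIGHT output
tgt = np.clip(0.8 * src[:, ::-1] + 0.1, 0, 1)     # arbitrary toy "target speaker"

dbn_src = DBN([n_bins, 16, 8], rng).fit(src)
dbn_tgt = DBN([n_bins, 16, 8], rng).fit(tgt)
mapper = train_mapper(dbn_src.encode(src), dbn_tgt.encode(tgt), 12, rng)

converted = convert(src, dbn_src, mapper, dbn_tgt)
print(converted.shape)   # (200, 24)
```

In a real system the `converted` frames would be passed back to a STRAIGHT synthesizer together with the remaining analysis parameters; that step is outside the scope of this sketch.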

WANG Min, SU Li-bo, WANG Zhi-hui, YAO Chen-hong. Voice conversion using STRAIGHT model and deep belief networks [J]. Computer Engineering & Science.

