• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


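Voice conversion method based on PSO-optimized GRNN

WANG Min, YANG Xiufeng, YAO Chenhong

  1. (School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an, Shaanxi 710055)
  • Received: 2016-03-25 Revised: 2016-11-02 Published: 2018-04-25 Released: 2018-04-25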
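  • Funding:

    Science and Technology Project of the Ministry of Housing and Urban-Rural Development (2016R2045); Beilin District, Xi’an, 2014 Science and Technology Plan (GX1412)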

Voice conversion based on optimizing GRNN by PSO

WANG Min, YANG Xiufeng, YAO Chenhong

  1. (School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China)
  • Received: 2016-03-25 Revised: 2016-11-02 Online: 2018-04-25 Published: 2018-04-25

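Abstract:

This paper proposes a voice conversion method based on a general regression neural network (GRNN) model optimized by particle swarm optimization (PSO). First, the method uses the personalized feature parameters of the vocal tract and of the excitation source in the training speech to train two separate GRNNs, which yields the GRNN structure parameters. Next, PSO is used to optimize these structure parameters, reducing the influence of human factors on the conversion results. Finally, the prosodic features, pitch contour, and energy of the speech are each linearly converted, so that the converted speech contains more of the personalized feature information of the source speech. Subjective and objective experiments show that, compared with the radial basis function (RBF) neural network and the GRNN, the proposed conversion model greatly improves the naturalness and likelihood of the converted speech, markedly lowers the spectral distortion rate, and produces speech closer to the target speech.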
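The abstract says only that the pitch contour and energy are linearly converted and gives no formula. As a hedged illustration, the sketch below assumes one common choice, a mean and variance matching transform in the log-F0 domain. This transform, the function names `log_f0_stats` and `convert_f0`, and the convention that unvoiced frames carry F0 = 0 are all assumptions made here, not details taken from the paper.

```python
import numpy as np

def log_f0_stats(f0):
    """Mean and standard deviation of log-F0 over voiced frames (F0 > 0)."""
    f0 = np.asarray(f0, dtype=float)
    log_f0 = np.log(f0[f0 > 0])
    return log_f0.mean(), log_f0.std()

def convert_f0(f0_src, src_stats, tgt_stats):
    """Linearly map source log-F0 so its mean and spread match the target's.

    Assumed transform (not confirmed by the paper):
        log f0_conv = mu_t + (sd_t / sd_s) * (log f0_src - mu_s)
    Unvoiced frames (F0 == 0) are passed through unchanged.
    """
    f0_src = np.asarray(f0_src, dtype=float)
    mu_s, sd_s = src_stats
    mu_t, sd_t = tgt_stats
    out = np.zeros_like(f0_src)
    voiced = f0_src > 0
    out[voiced] = np.exp(mu_t + (sd_t / sd_s) * (np.log(f0_src[voiced]) - mu_s))
    return out
```

In use, the statistics would be estimated once from each speaker's training utterances and then applied frame by frame at conversion time. An analogous mean and variance mapping could be applied to a frame energy contour.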
 

Key words: voice conversion, general regression neural network model, particle swarm optimization

Abstract:

This paper proposes a voice conversion method that uses particle swarm optimization (PSO) to optimize a general regression neural network (GRNN). First, the method uses the characteristic parameters of the training speaker’s vocal tract and source excitation to train two GRNNs and obtain their structure parameters. Second, to reduce the adverse impact of human-induced factors on the conversion results, PSO is used to optimize the parameters of the GRNN model. Finally, the pitch contour and the energy profile of the prosodic features are linearly converted, so that the converted voice contains more personalized feature information of the source speaker. Experimental results show that, compared with voice conversion methods based on the radial basis function (RBF) neural network and on the GRNN, our method improves the naturalness and likelihood of the converted voices and clearly decreases the spectral distortion rate, so the converted voices are closer to the target voices.
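To make the idea of tuning a GRNN with PSO concrete, here is a minimal sketch under stated assumptions. It treats the GRNN as a Gaussian-kernel weighted average of training targets and assumes that the parameter being optimized is the single kernel width `sigma` (often called the spread). The abstract does not say which structure parameters the authors tune, so this choice, the validation-error cost, the PSO coefficients, and all function names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def grnn_predict(X_train, Y_train, X_query, sigma):
    """GRNN output: Gaussian-kernel weighted average of the training targets."""
    # Squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Shift each row by its minimum for numerical stability; the shift
    # cancels in the ratio below and keeps every row sum at least 1.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ Y_train) / w.sum(axis=1, keepdims=True)

def validation_mse(sigma, X_tr, Y_tr, X_val, Y_val):
    """Cost used by PSO here (an assumption): mean squared error on held-out data."""
    pred = grnn_predict(X_tr, Y_tr, X_val, sigma)
    return float(np.mean((pred - Y_val) ** 2))

def pso_minimize(cost, lo, hi, n_particles=12, n_iters=30,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over one scalar parameter in [lo, hi]."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lo, hi, n_particles)
    vel = np.zeros(n_particles)
    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    best = int(np.argmin(pbest_cost))
    gbest, gbest_cost = pbest[best], pbest_cost[best]
    for _ in range(n_iters):
        r1 = rng.random(n_particles)
        r2 = rng.random(n_particles)
        # Inertia term plus pulls toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        costs = np.array([cost(p) for p in pos])
        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        best = int(np.argmin(pbest_cost))
        gbest, gbest_cost = pbest[best], pbest_cost[best]
    return float(gbest), float(gbest_cost)
```

A caller would pass aligned source and target feature frames as `X_tr`/`Y_tr`, hold out a validation split, and run `pso_minimize(lambda s: validation_mse(s, X_tr, Y_tr, X_val, Y_val), 0.01, 2.0)` to pick the spread automatically instead of by hand. Following the abstract, one such model would be fitted for the vocal tract features and another for the excitation features.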

Key words: voice conversion, general regression neural network model, particle swarm optimization