Computer Engineering & Science
WANG Min, SU Li-bo, WANG Zhi-hui, YAO Chen-hong
Abstract:
We propose a voice conversion method that combines the STRAIGHT model with deep belief networks (DBNs). First, we use the STRAIGHT model to extract the spectral parameters of the source and target speakers, and train one DBN on each speaker's parameters to obtain that speaker's voice characteristic information in a high-order space. Second, we connect and convert between the two high-order spaces using an artificial neural network (ANN). Finally, we use the DBN trained on the target speaker's data to reverse-process the converted feature information and recover the spectral parameters, from which the STRAIGHT model synthesizes speech carrying the personal characteristics of the target speaker. Experimental results show that, compared with the traditional GMM-based voice conversion method, the proposed method produces converted speech that is closer to the target voice in both quality and similarity.
Key words: voice conversion; STRAIGHT model; deep belief networks; high-order spaces
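The data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `ToyDBN` and `ToyMapper`, the layer sizes (24, 16, 8), and the random weights are all hypothetical placeholders standing in for trained networks, and the STRAIGHT analysis and synthesis steps are omitted, with random frames used in place of extracted spectral parameters. The sketch only shows how a source-speaker encoder, a mapping network, and a target-speaker decoder would be chained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ToyDBN:
    """Hypothetical stand-in for a trained DBN: stacked sigmoid layers with tied weights."""

    def __init__(self, layer_sizes, rng):
        pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        # Random weights are placeholders; a real DBN would be trained layer by layer.
        self.weights = [rng.normal(0.0, 0.1, size=(m, n)) for m, n in pairs]
        self.hid_bias = [np.zeros(n) for _, n in pairs]
        self.vis_bias = [np.zeros(m) for m, _ in pairs]

    def encode(self, x):
        """Forward pass: spectral parameters -> high-order features."""
        h = x
        for w, b in zip(self.weights, self.hid_bias):
            h = sigmoid(h @ w + b)
        return h

    def decode(self, h):
        """Reverse pass: high-order features -> spectral parameters."""
        x = h
        for i in reversed(range(len(self.weights))):
            pre = x @ self.weights[i].T + self.vis_bias[i]
            # Keep the bottom layer linear so the output can be real-valued.
            x = pre if i == 0 else sigmoid(pre)
        return x


class ToyMapper:
    """Hypothetical one-hidden-layer network linking the two high-order spaces."""

    def __init__(self, dim, hidden, rng):
        self.w1 = rng.normal(0.0, 0.1, size=(dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, size=(hidden, dim))
        self.b2 = np.zeros(dim)

    def __call__(self, h):
        return sigmoid(sigmoid(h @ self.w1 + self.b1) @ self.w2 + self.b2)


def convert(source_frames, src_dbn, tgt_dbn, mapper):
    """Chain the three stages: encode (source), map, decode (target)."""
    h_src = src_dbn.encode(source_frames)
    h_tgt = mapper(h_src)
    return tgt_dbn.decode(h_tgt)


rng = np.random.default_rng(0)
frames = rng.normal(size=(5, 24))   # placeholder for 5 frames of spectral parameters
src_dbn = ToyDBN([24, 16, 8], rng)  # would be trained on source-speaker data
tgt_dbn = ToyDBN([24, 16, 8], rng)  # would be trained on target-speaker data
mapper = ToyMapper(8, 12, rng)      # would be trained on paired high-order features
converted = convert(frames, src_dbn, tgt_dbn, mapper)
print(converted.shape)
```

In a real system the converted frames would be passed to the synthesis stage together with the remaining analysis parameters; how those are handled is not stated in the abstract and is left out here.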
WANG Min, SU Li-bo, WANG Zhi-hui, YAO Chen-hong. Voice conversion using STRAIGHT model and deep belief networks [J]. Computer Engineering & Science.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2016/V38/I09/1950