• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2014, Vol. 36 ›› Issue (08): 1604-1608.


Research of speaker adaptation of stochastic segment models using maximum likelihood linear regression

CHAO Hao1,2, YANG Zhanlei2, LIU Wenju2

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China;
  2. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2012-12-19 Revised: 2013-04-03 Online: 2014-08-25 Published: 2014-08-25
  • Supported by:

    National Natural Science Foundation of China (91120303, 90820303, 90820011); National 973 Program of China (2004CB318105); National 863 Program of China (20060101Z4073, 2006AA01Z194)

Abstract:

A speaker adaptation method for systems based on the stochastic segment model (SSM) is proposed. Based on the characteristics of the SSM, the maximum likelihood linear regression (MLLR) method is introduced into SSM-based systems. Continuous Chinese speech recognition experiments on the "863test" test set show that, at different decoding speeds, the Chinese character error rate drops markedly after speaker adaptation. The results indicate that MLLR is also effective in SSM-based systems.

Key words: speech recognition; speaker adaptation; maximum likelihood linear regression; stochastic segment model
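
For readers unfamiliar with MLLR, the mean transform at its core can be sketched as below. This is the standard textbook form for adapting Gaussian means. The abstract does not state the exact form used for the stochastic segment model, and the symbols $A$, $b$, $W$, $\mu_m$, and $\xi_m$ are notation assumed here for illustration.

```latex
\hat{\mu}_m = A\,\mu_m + b = W\,\xi_m,
\qquad
\xi_m = \begin{bmatrix} 1 \\ \mu_m \end{bmatrix},
\qquad
W = \begin{bmatrix} b & A \end{bmatrix}
```

Here $\mu_m$ is the speaker-independent mean of Gaussian component $m$ and $\hat{\mu}_m$ is its adapted value. The matrix $W$ is estimated to maximize the likelihood of the speaker's adaptation data, and components assigned to the same regression class share one $W$.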