• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2014, Vol. 36 ›› Issue (10): 2009-2013.


An automatic phoneme segmentation method in continuous
Tibetan language under the condition of resource deficiency

LI Guanyu, YU Hongzhi, WU Zhiqiang

  1. (Key Laboratory for Chinese Ethnic Minority Language of Ministry of Education,
    Northwest University for Nationalities, Lanzhou 730030, China)
  • Received: 2014-06-19 Revised: 2014-07-18 Online: 2014-10-25 Published: 2014-10-25
  • Supported by:

    National Natural Science Foundation of China (61262054)

Abstract:

Phoneme segmentation is often required in Tibetan speech synthesis and phonetics research. Manual segmentation is laborious and time-consuming, but because Tibetan speech corpora are scarce, acoustic models trained on Tibetan data alone are neither precise nor robust enough, so automatically segmented phoneme boundaries are not accurate enough. This paper takes the Lhasa dialect of Tibetan as its object of study. After the Lhasa phoneme set is defined and a Lhasa pronunciation dictionary is built, the phonemes shared by Lhasa Tibetan and English are identified by computing the distances between phoneme models, and the GMM-HMM models of Lhasa Tibetan and English are fused. Silences and short pauses in the speech are detected automatically, and a word network corresponding to the utterance is constructed. By looking up the pronunciation dictionary, the word network is expanded into a model (phoneme) network, and the Viterbi algorithm aligns every frame of feature parameters to a state of the models, from which the phonemes are segmented. Experiments show that the segmentation results are better than those obtained with Tibetan-only models.

Key words: Tibetan; Lhasa dialect; automatic phoneme segmentation; Viterbi algorithm; hidden Markov model (HMM)
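
The alignment step named in the abstract, mapping each frame of feature parameters to a model state with the Viterbi algorithm and reading phoneme boundaries off the state path, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single diagonal Gaussian per state instead of a full GMM, one state per phoneme, equal stay/advance transition probabilities, and toy one-dimensional features. The names `log_gauss`, `forced_align`, and `phone_segments` are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def forced_align(frames, states, log_stay=math.log(0.5), log_next=math.log(0.5)):
    """Viterbi alignment of frames to a fixed left-to-right state sequence.

    states is a list of (phone_label, mean, var) tuples (an assumed layout).
    Returns one state index per frame; the path starts in the first state
    and ends in the last one.
    """
    T, S = len(frames), len(states)
    if T < S:
        raise ValueError("need at least one frame per state")
    neg_inf = float("-inf")
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_gauss(frames[0], states[0][1], states[0][2])
    for t in range(1, T):
        for s in range(S):
            stay = score[t - 1][s] + log_stay
            move = score[t - 1][s - 1] + log_next if s > 0 else neg_inf
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            if best > neg_inf:  # skip states that cannot be reached yet
                score[t][s] = best + log_gauss(frames[t], states[s][1], states[s][2])
    # Trace back from the final state at the last frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def phone_segments(path, states):
    """Collapse a state path into (phone, start_frame, end_frame_exclusive).

    Assumes two consecutive phones in the network never share a label.
    """
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or states[path[t]][0] != states[path[t - 1]][0]:
            segments.append((states[path[start]][0], start, t))
            start = t
    return segments

# Toy utterance: silence, one phone "a", silence (1-D features, unit variance).
frames = [[0.1], [-0.2], [0.0], [4.9], [5.2], [5.0], [0.1]]
states = [("sil", [0.0], [1.0]), ("a", [5.0], [1.0]), ("sil", [0.0], [1.0])]
path = forced_align(frames, states)
print(path)                          # [0, 0, 0, 1, 1, 1, 2]
print(phone_segments(path, states))  # [('sil', 0, 3), ('a', 3, 6), ('sil', 6, 7)]
```

Multiplying the frame indices by the frame shift of the feature extraction gives boundary times. In a full system the state list would come from expanding the word network through the pronunciation dictionary, and each state would carry a GMM.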