An automatic phoneme segmentation method in continuous Tibetan language under the condition of resource deficiency

J4 ›› 2014, Vol. 36 ›› Issue (10): 2009-2013.

LI Guanyu (李冠宇), YU Hongzhi (于洪志), WU Zhiqiang (吴志强)

(Key Laboratory for Chinese Ethnic Minority Language of Ministry of Education, Northwest University for Nationalities, Lanzhou, Gansu 730030, China)

Received: 2014-06-19 Revised: 2014-07-18 Online: 2014-10-25 Published: 2014-10-25
Funding: National Natural Science Foundation of China (61262054)

Abstract:

Phoneme segmentation is often required in Tibetan speech synthesis and phonetics research. Manual segmentation is laborious and time-consuming, yet because Tibetan speech corpora are scarce, acoustic models trained on Tibetan data alone are neither precise nor robust enough, so automatically segmented phoneme boundaries are not sufficiently accurate. Taking the Lhasa dialect of Tibetan as the object of study, we first define its phoneme set and build a pronunciation dictionary. Phonemes shared by the Lhasa dialect and English are then identified by computing the distances between phoneme models, and the GMM-HMM models of the two languages are fused. Silences and short pauses in the speech are detected automatically, a word network corresponding to the utterance is constructed, and, by looking up the pronunciation dictionary, the word network is expanded into a model (phoneme) network. The Viterbi algorithm aligns every frame of feature parameters to a state of the models, from which the phoneme boundaries are obtained. Experiments show that the segmentation is better than that obtained with Tibetan-only models.

Key words: Tibetan; Lhasa dialect; automatic phoneme segmentation; Viterbi algorithm; hidden Markov model (HMM)
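
The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy setting in which each phoneme is a left-to-right chain of states, every state of a phoneme shares one 1-D Gaussian (a stand-in for the GMM-HMM output densities), transitions are equiprobable, and automatic short-pause detection is omitted. The phoneme labels, the lexicon entry `w1`, the model parameters, and all function names are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (toy stand-in for a GMM state output)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def expand_to_states(words, lexicon, states_per_phone=2):
    """Expand a word sequence into a linear chain of (phone_index, phone) states."""
    phones = [p for w in words for p in lexicon[w]]
    return [(i, p) for i, p in enumerate(phones) for _ in range(states_per_phone)]

def forced_align(frames, chain, models):
    """Viterbi over a left-to-right chain; returns one state index per frame."""
    n_frames, n_states = len(frames), len(chain)
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")
    neg_inf = float("-inf")
    log_half = math.log(0.5)  # assumed equal self-loop and forward probabilities
    score = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]

    def emit(t, s):
        mean, var = models[chain[s][1]]
        return log_gauss(frames[t], mean, var)

    score[0][0] = emit(0, 0)  # the path must start in the first state
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s] + log_half
            move = score[t - 1][s - 1] + log_half if s > 0 else neg_inf
            best, back[t][s] = (move, s - 1) if move > stay else (stay, s)
            if best > neg_inf:
                score[t][s] = best + emit(t, s)

    # Backtrace from the last state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def phone_segments(path, chain, frame_shift_ms=10):
    """Merge frames that belong to the same phone into (phone, start_ms, end_ms)."""
    segs = []
    for t, s in enumerate(path):
        idx, phone = chain[s]
        if segs and segs[-1][0] == idx:
            segs[-1][3] = t + 1
        else:
            segs.append([idx, phone, t, t + 1])
    return [(p, a * frame_shift_ms, b * frame_shift_ms) for _, p, a, b in segs]

# Toy data: hypothetical lexicon, models (mean, variance) and 1-D "features".
lexicon = {"sil": ["sil"], "w1": ["k", "a"]}
models = {"sil": (0.0, 1.0), "k": (5.0, 1.0), "a": (10.0, 1.0)}
frames = [0.1, -0.2, 5.1, 4.9, 5.2, 9.8, 10.1, 10.3, 0.0, 0.2]

chain = expand_to_states(["sil", "w1", "sil"], lexicon)
segments = phone_segments(forced_align(frames, chain, models), chain)
print(segments)
```

Because the word sequence is known in advance, the search is restricted to a single linear chain of states, so each phoneme boundary falls where the best path moves from the last state of one phoneme to the first state of the next. A real system would replace the toy Gaussians with trained multi-dimensional GMM-HMM states and allow optional silence and short-pause models between words.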

LI Guanyu, YU Hongzhi, WU Zhiqiang. An automatic phoneme segmentation method in continuous Tibetan language under the condition of resource deficiency [J]. J4, 2014, 36(10): 2009-2013.
