• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Corpus construction for Tibetan voiceprint recognition

ZHOU Yan, Shereb Dorje

  1. (Research Center of Tibetan Information Technology, Tibet University, Lhasa 850000, China)
  • Received: 2017-06-15 Revised: 2017-12-08 Online: 2018-11-25 Published: 2018-11-25

Abstract:

Research on Tibetan voiceprint recognition is still at an early stage, and building a corpus is an urgent prerequisite. We design and build a corpus based on the characteristics of the Tibetan language. It consists of two parts: a text-dependent part and a text-independent part. The texts are drawn from a variety of sources, including newspapers, literature, education, science and technology, Buddhism, and history and traditional culture. For the recordings, we invited 50 speakers from different regions of Tibet. The corpus contains 9500 speech files and provides a foundation for research on Tibetan voiceprint recognition.

Key words: Tibetan, voiceprint recognition, corpus