• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (05): 937-944.

• Artificial Intelligence and Data Mining •

An unsupervised phoneme segmentation method for Lao language with multi-feature interaction fusion

LI Xin-jie (1,2), WANG Wen-jun (1,2), DONG Ling (1,2), LAI Hua (1,2), YU Zheng-tao (1,2), GAO Sheng-xiang (1,2)

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China
  • Received: 2023-09-04 Revised: 2023-10-20 Accepted: 2024-05-25 Online: 2024-05-25 Published: 2024-05-30

Abstract: Existing methods segment Lao phonemes inaccurately because they do not account for the tone changes of the Lao language or for audio diversity. To address this problem, this paper proposes an unsupervised phoneme segmentation method for Lao based on multi-feature interaction fusion. First, self-supervised features, spectral features, and pitch features are encoded independently, which avoids the limitations of relying on any single feature. Second, the independently encoded features are fused progressively using an attention mechanism, so that the model captures information about Lao tone changes and phoneme boundaries more comprehensively. Finally, a learnable framework is adopted to optimize the phoneme segmentation model. Experimental results show that, compared with the baseline methods, the proposed method improves the R-value by 27.88% on the Lao phoneme segmentation task.
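
The abstract reports its result as an R-value but does not define it. As a minimal sketch, assuming the definition commonly used in the speech segmentation literature (hit rate combined with over-segmentation), the metric can be computed from boundary precision and recall as below. The function names, the greedy matching rule, and the 20 ms tolerance are illustrative assumptions and are not taken from the paper.

```python
import math


def boundary_scores(ref, hyp, tol=0.02):
    """Precision and recall of predicted boundaries (times in seconds).

    Assumption: a predicted boundary counts as a hit if it lies within
    `tol` seconds of a not-yet-matched reference boundary (greedy matching).
    """
    hyp_sorted = sorted(hyp)
    used = [False] * len(hyp_sorted)
    hits = 0
    for r in sorted(ref):
        for i, h in enumerate(hyp_sorted):
            if not used[i] and abs(h - r) <= tol:
                used[i] = True
                hits += 1
                break
    precision = hits / len(hyp) if hyp else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


def r_value(precision, recall):
    """R-value under the commonly used definition (assumed, not from the paper).

    OS = recall / precision - 1          (over-segmentation)
    r1 = sqrt((1 - recall)^2 + OS^2)
    r2 = (-OS + recall - 1) / sqrt(2)
    R  = 1 - (|r1| + |r2|) / 2
    """
    if precision <= 0.0:
        raise ValueError("precision must be positive")
    over_seg = recall / precision - 1.0
    r1 = math.sqrt((1.0 - recall) ** 2 + over_seg ** 2)
    r2 = (-over_seg + recall - 1.0) / math.sqrt(2.0)
    return 1.0 - (abs(r1) + abs(r2)) / 2.0


if __name__ == "__main__":
    # Made-up boundary times, for illustration only.
    p, r = boundary_scores([0.10, 0.25, 0.40], [0.11, 0.26, 0.33, 0.41])
    print(p, r, r_value(p, r))
```

Unlike the F1 score, this metric penalizes over-segmentation: a model that predicts a boundary at every frame reaches perfect recall but a low R-value. A perfect segmentation scores 1.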


Key words: unsupervised learning, feature fusion, Lao language, phoneme segmentation, speech representation