• A journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (07): 1292-1299.

• Artificial Intelligence and Data Mining •

A low-resource Lao text regularization task based on BiLSTM

WANG Jian1,JIANG Lin1,2,WANG Lin-qin1,2,YU Zheng-tao1,2,ZHANG Song1,2,GAO Sheng-xiang1,2   

  (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China;
   2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China)
  • Received:2021-11-24 Revised:2022-03-11 Accepted:2023-07-25 Online:2023-07-25 Published:2023-07-11

Abstract: Text normalization (TN) is an indispensable step in the front-end text analysis of speech synthesis. Lao text normalization converts non-standard words (NSW) in Lao text into spoken-form words (SFW). To date, text normalization has not been studied for Lao, where it faces three main problems: training data are hard to obtain, linguistic expression is highly varied, and many normalizations are ambiguous. This paper carries out a text normalization task for Lao. The task is formulated as sequence labeling, in which a neural network resolves ambiguous NSWs from their context. A corpus for Lao text normalization is constructed, a neural network model predicts the labels, a self-attention mechanism is added to strengthen the dependencies between characters in the sequence, and different strategies for introducing a pre-trained language model are explored. The model achieves an accuracy of 67.59% on the test set.
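The self-attention step described above can be sketched as plain scaled dot-product attention applied to the per-character hidden states produced by the BiLSTM. This is a minimal illustrative sketch, not the paper's implementation: the function name, the single-head formulation, and the absence of learned query/key/value projections are all assumptions.

```python
import numpy as np

def self_attention(H, scale=True):
    """Scaled dot-product self-attention over a sequence of hidden states.

    H: (seq_len, d) array, e.g. BiLSTM outputs for each Lao character.
    Returns (context, weights): context vectors of the same shape as H,
    and the (seq_len, seq_len) attention weight matrix.
    """
    d = H.shape[-1]
    # Pairwise similarity between every pair of character positions.
    scores = H @ H.T
    if scale:
        scores = scores / np.sqrt(d)
    # Row-wise softmax (subtract the max for numerical stability).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of all positions' states,
    # which is how attention "deepens" cross-character dependencies.
    return weights @ H, weights
```

In a tagger such as the one the abstract outlines, the attended states would then be fed to a per-position classifier that assigns each ambiguous NSW its spoken-form label.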

Key words: Lao, text normalization, neural network, self-attention mechanism