Robust speech recognition based on adaptive deep neural network in complex environment

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (06): 1105-1113.

• Artificial Intelligence and Data Mining •

ZHANG Kai-sheng, ZHAO Xiao-fen

(School of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China)

Received: 2020-05-22 Revised: 2020-12-28 Accepted: 2022-06-25 Online: 2022-06-25 Published: 2022-06-17

Abstract

In continuous speech recognition systems operating in complex environments (with variability in both speakers and environmental noise), the training data do not match the test data, which lowers the recognition rate. To address this mismatch, a speech recognition method based on an adaptive deep neural network is studied. An improved regularized adaptation criterion is combined with feature-space adaptation of the deep neural network to improve data matching. Speaker identity vectors (i-vectors) are fused with noise-aware training to cope with changes in speaker and environmental noise, and the classification function of the output layer of the traditional deep neural network is improved to ensure intra-class compactness and inter-class separation. Experiments were carried out on the TIMIT English speech data set and the Microsoft Chinese speech data set with various background noises superimposed. The results show that, compared with the currently popular GMM-HMM and traditional DNN acoustic models, the proposed method reduces the word error rate by 5.151% and 3.113%, respectively, which improves the generalization performance and robustness of the model to a certain extent.
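The abstract does not specify how the i-vector and the noise information enter the network. A common way to combine the two, shown here only as a minimal sketch and not as the authors' implementation, is to append a per-utterance noise estimate and the speaker's i-vector to every acoustic frame before it is fed to the DNN. The function names `estimate_noise` and `augment_frames`, the parameter `n_lead`, and the choice of averaging the leading frames as the noise estimate are all assumptions made for illustration.

```python
from typing import List, Sequence


def estimate_noise(frames: Sequence[Sequence[float]], n_lead: int = 10) -> List[float]:
    """Crude noise estimate (assumption): mean of the first n_lead frames,
    which are taken to contain background noise only."""
    lead = frames[:n_lead]
    if not lead:
        raise ValueError("need at least one frame")
    dim = len(lead[0])
    return [sum(frame[d] for frame in lead) / len(lead) for d in range(dim)]


def augment_frames(
    frames: Sequence[Sequence[float]],
    ivector: Sequence[float],
    n_lead: int = 10,
) -> List[List[float]]:
    """Append the utterance-level noise estimate and the speaker i-vector
    to every frame, giving the DNN input [frame | noise | i-vector]."""
    noise = estimate_noise(frames, n_lead)
    return [list(frame) + noise + list(ivector) for frame in frames]


if __name__ == "__main__":
    # Toy utterance: 3 frames of 2-dimensional features, 1-dimensional i-vector.
    frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    augmented = augment_frames(frames, ivector=[0.5], n_lead=2)
    for row in augmented:
        print(row)
```

In this sketch the same noise vector and i-vector are repeated on every frame of an utterance, so the network sees a constant description of the speaker and the environment alongside the time-varying acoustic features. Real systems would extract the i-vector with a trained extractor and typically use log-mel or MFCC frames with context splicing.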

Key words: speech recognition, deep neural network, improved adaptive criterion, feature space

ZHANG Kai-sheng, ZHAO Xiao-fen. Robust speech recognition based on adaptive deep neural network in complex environment[J]. Computer Engineering & Science, 2022, 44(06): 1105-1113.

