Fast speaker recognition based on hierarchical recognition

Computer Engineering & Science

MAO Zhengchong (茅正冲), TU Wenhui (涂文辉)

(Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, China)

Received: 2016-12-12 Revised: 2017-03-13 Online: 2018-07-25 Published: 2018-07-25
Funding:
National Natural Science Foundation of China (60973095); Natural Science Foundation of Jiangsu Province (BK20131107)

Abstract:

As the number of speaker models grows, the recognition speed of a speaker recognition system drops and can no longer meet real-time requirements. To address this problem, we propose a fast speaker recognition method based on a hierarchical recognition model. An approximation of the KL divergence obtained by the variational method serves as the similarity measure between speaker models, and a method for clustering speaker models is designed. Experimental results show that the proposed method yields valid speaker model clusters and greatly improves the recognition speed of the system with only a small loss in recognition rate.

Key words: Gaussian mixture model, speaker recognition, KL divergence, model clustering
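
The abstract does not spell out the similarity measure, so the following is only a minimal sketch of one widely used variational approximation of the KL divergence between two Gaussian mixture models. It assumes diagonal covariances and that each model is given as a tuple `(weights, means, variances)`. The function names (`kl_gauss_diag`, `kl_variational`, `symmetric_kl`) and the symmetrized distance are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def kl_gauss_diag(mu_a, var_a, mu_b, var_b):
    """Closed-form KL(N_a || N_b) for two diagonal-covariance Gaussians."""
    return 0.5 * np.sum(
        np.log(var_b / var_a) + (var_a + (mu_a - mu_b) ** 2) / var_b - 1.0
    )


def pairwise_kl(means_f, vars_f, means_g, vars_g):
    """Matrix K[i, j] = KL(f_i || g_j) over all pairs of components."""
    return np.array([
        [kl_gauss_diag(mf, vf, mg, vg) for mg, vg in zip(means_g, vars_g)]
        for mf, vf in zip(means_f, vars_f)
    ])


def _log_weighted_sum(neg_kl, weights):
    """Row-wise log(sum_j weights[j] * exp(neg_kl[i, j])), computed stably."""
    m = neg_kl.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(neg_kl - m) @ weights)


def kl_variational(gmm_f, gmm_g):
    """Variational approximation of KL(f || g) between two GMMs.

    Each GMM is (weights, means, variances) with shapes (K,), (K, D), (K, D).
    Assumed form: sum_a w_a * log( sum_a' w_a' exp(-KL(f_a||f_a'))
                                   / sum_b  v_b  exp(-KL(f_a||g_b)) ).
    """
    w_f, mu_f, var_f = gmm_f
    w_g, mu_g, var_g = gmm_g
    log_num = _log_weighted_sum(-pairwise_kl(mu_f, var_f, mu_f, var_f), w_f)
    log_den = _log_weighted_sum(-pairwise_kl(mu_f, var_f, mu_g, var_g), w_g)
    return float(np.sum(w_f * (log_num - log_den)))


def symmetric_kl(gmm_f, gmm_g):
    """Symmetrized distance (an assumption here), usable for model clustering."""
    return kl_variational(gmm_f, gmm_g) + kl_variational(gmm_g, gmm_f)
```

With a single component per model the approximation reduces to the closed-form Gaussian KL divergence, and the divergence of a model from itself is zero. A pairwise matrix of `symmetric_kl` values could then be passed to any standard clustering routine to group similar speaker models.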

MAO Zhengchong, TU Wenhui. Fast speaker recognition based on hierarchical recognition[J]. Computer Engineering & Science.
