• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (08): 1461-1469.

• Artificial Intelligence and Data Mining •

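  • Supported by:
    National Natural Science Foundation of China (61573125); Anhui Provincial Key Research and Development Program (202004d07020011, 202104d07020001); Key Consulting Research Project of the Chinese Academy of Engineering (2020-XZ-3); Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (19YJC870021, 18YJC870025); Fundamental Research Funds for the Central Universities (PA2020GDKC0015, PA2019GDQT0008, PA2019GDPK0072)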

Source cell-phone identification under practical noises based on temporal convolutional network

WU Zhang-qian1,SU Zhao-pin1,2,3,4,WU Qin-fang1,ZHANG Guo-fu1,2,3,4   

  1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
  2. Intelligent Interconnected Systems Laboratory of Anhui Province (Hefei University of Technology), Hefei 230009, China
  3. Anhui Province Key Laboratory of Industry Safety and Emergency Technology (Hefei University of Technology), Hefei 230601, China
  4. Engineering Research Center of Safety Critical Industrial Measurement and Control Technology, Ministry of Education, Hefei 230601, China

  • Received: 2020-05-19; Revised: 2020-08-24; Accepted: 2021-08-25; Online: 2021-08-25; Published: 2021-08-24

Abstract: To address source cell-phone identification under practical environmental noise, a method based on linear discriminant analysis and a temporal convolutional network is proposed. First, the classification performance of different cell-phone speech features under practical environmental noise is analyzed, and a new mixed speech feature is derived from the band energy descriptor, the constant Q transform domain, and linear discriminant analysis. Then, the mixed feature is used as the input to a temporal convolutional network for training and classification. Finally, tests on a practical-noise speech database covering 10 brands, 47 cell-phone models, and 32,900 speech samples show that the proposed method reaches an average recognition accuracy of 99.82%. Compared with two classical methods, one based on the band energy descriptor and a support vector machine and the other based on the constant Q transform domain and a convolutional neural network, the proposed method improves average recognition accuracy by about 0.44 and 0.54 percentage points, average recall by about 0.45 and 0.55 percentage points, average precision by about 0.41 and 0.57 percentage points, and average F1-score by about 0.49 and 0.55 percentage points, respectively. The experimental results show that the proposed method has better overall identification performance.
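
The abstract reports averaged accuracy, recall, precision, and F1-score but does not say how the averaging is done. The sketch below assumes macro averaging (each class weighted equally), which is one common reading; the function name `macro_metrics` and the toy confusion matrix are illustrative assumptions and are not taken from the paper.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy 3-class confusion matrix with made-up counts (NOT data from the paper).
toy_confusion = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 1, 49],
]
scores = macro_metrics(toy_confusion)
for name, value in scores.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this reading, a gain stated in percentage points is the plain difference between two such percentages, not a relative change.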

Key words: source cell-phone identification, practical environmental noise, mixed feature, linear discriminant analysis, temporal convolutional network
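
The classifier named in the abstract and keywords is a temporal convolutional network, whose core operation is a causal, dilated one-dimensional convolution. The abstract gives no architectural details, so the following is only a minimal single-channel sketch of that operation: the kernel weights, dilations, and helper names (`causal_dilated_conv1d`, `tcn_stack`, `receptive_field`) are assumptions for illustration, and residual connections, normalization, and dropout used in typical temporal convolutional networks are omitted.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """y[t] = sum_i weights[i] * x[t - i * dilation].

    Inputs before t = 0 are treated as zero, so y[t] never depends
    on any sample later than t (causality).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            src = t - i * dilation
            if src >= 0:
                acc += w * x[src]
        y.append(acc)
    return y


def tcn_stack(x, layers):
    """Apply a stack of (weights, dilation) layers with ReLU in between."""
    h = list(x)
    for weights, dilation in layers:
        h = [max(0.0, v) for v in causal_dilated_conv1d(h, weights, dilation)]
    return h


def receptive_field(kernel_size, dilations):
    """Number of input steps that can influence one output step."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical 3-layer stack: kernel size 2, dilations 1, 2, 4.
layers = [([0.5, 0.5], 1), ([0.5, 0.5], 2), ([0.5, 0.5], 4)]

# A unit impulse at t = 0 spreads over exactly the receptive field.
impulse = [1.0] + [0.0] * 9
response = tcn_stack(impulse, layers)
print(response)
print(receptive_field(2, [1, 2, 4]))
```

Doubling the dilation at each layer lets the receptive field grow exponentially with depth, which is why this kind of network can cover long feature sequences with few layers.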