• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (11): 1964-1973.

• Computer Network and Information Security •

Multi-stage detection and multimodal localization for audio deletion tampering

ZHANG Guofu, WANG Ru, SU Zhaopin, YUE Feng, LIAN Chensi, YANG Bo

  1. School of Computer Science and Information Engineering (School of Artificial Intelligence), Hefei University of Technology, Hefei 230601, China
  2. Anhui Province Key Laboratory of Industry Safety & Emergency Technology (Hefei University of Technology), Hefei 230601, China
  3. Joint Laboratory of Intelligent Prevention and Recognition of Audio and Video, Hefei 230009, China
  4. Intelligent Interconnection System Anhui Provincial Laboratory (Hefei University of Technology), Hefei 230009, China
  5. Anhui Provincial Public Security Department, Physical Evidence Identification and Management Division, Hefei 230000, China

  • Received: 2024-11-14 Online: 2025-11-25 Published: 2025-12-05
  • Supported by:
    Humanities and Social Sciences Research Planning Fund of the Ministry of Education (24YJA870011); Anhui Provincial Natural Science Foundation (2208085MF166); Anhui Provincial Key Research and Development Program (202104d07020001)

Abstract:

Detecting audio deletion tampering remains a severe challenge in digital audio authentication, particularly under anti-forensic attacks. To address the difficulty of both detecting and locating deletions, a multi-stage detection and multimodal localization method for audio deletion tampering is proposed. First, a file header information analysis method is designed to screen out audio files suspected of head/tail deletion tampering, that is, deletion at the beginning or end of the recording. Second, a column-average-based constant-Q spectral sketch feature is proposed, together with a classification network for middle deletion tampering (deletion inside the recording) built on a deep residual shrinkage network and an attention mechanism. Third, the results of the header information analysis and the classification network are combined to judge whether the audio has undergone deletion tampering. Finally, for detected middle deletion tampering, a localization method that combines wavelet packet analysis with multimodal features is proposed. Comparative experiments show that the proposed method can detect head/tail deletion tampering and precisely locate middle deletion tampering: accuracy, precision, recall, and F1 score for middle deletion classification all exceed 98%, and the method shows stronger robustness and higher localization accuracy under conventional signal processing attacks.
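The abstract does not say which header fields the first stage inspects. As a rough illustration only, the sketch below assumes a PCM WAV file whose `data` chunk is the last chunk, and checks whether the sizes declared in the RIFF header match the bytes actually present. The function names `make_wav_bytes` and `header_is_consistent` are hypothetical and are not taken from the paper.

```python
import io
import struct
import wave


def make_wav_bytes(num_frames: int, sample_rate: int = 8000) -> bytes:
    """Build a small mono 16-bit PCM WAV file (silence) in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


def header_is_consistent(wav_bytes: bytes) -> bool:
    """Return True if the declared RIFF and data sizes match the byte count.

    Assumption (not from the paper): the data chunk is the final chunk.
    """
    if len(wav_bytes) < 12:
        return False
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        return False
    # The RIFF size field counts every byte after the first 8.
    riff_size = struct.unpack_from("<I", wav_bytes, 4)[0]
    if riff_size != len(wav_bytes) - 8:
        return False
    # Walk the chunk list until the data chunk is found.
    pos = 12
    while pos + 8 <= len(wav_bytes):
        chunk_id = wav_bytes[pos:pos + 4]
        chunk_size = struct.unpack_from("<I", wav_bytes, pos + 4)[0]
        if chunk_id == b"data":
            return chunk_size == len(wav_bytes) - (pos + 8)
        # Chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size & 1)
    return False


intact = make_wav_bytes(800)
tail_cut = intact[:-200]                # bytes removed from the end
head_cut = intact[:44] + intact[244:]   # bytes removed right after the header
print(header_is_consistent(intact))     # True
print(header_is_consistent(tail_cut))   # False
print(header_is_consistent(head_cut))   # False
```

A check of this kind can only flag suspects: re-saving the edited audio rewrites the size fields consistently, so a clean header does not prove the recording is intact.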



Key words: audio blind forensics, deletion tampering, detection and localization, deep residual shrinkage network, wavelet packet reconstruction
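For readers unfamiliar with deep residual shrinkage networks, their defining operation is soft thresholding: activations with small magnitude are set to zero and the rest are shrunk toward zero. The sketch below shows only that operation in plain Python. It follows the commonly described formulation in which the threshold is a fraction `alpha` of the mean absolute activation; in an actual network `alpha` comes from a small learned sub-network, whereas here it is passed in as an ordinary argument (an assumption made for illustration). Nothing below reflects the specific architecture used in the paper.

```python
def soft_threshold(values, tau):
    """Shrink each value toward zero by tau; values with |v| <= tau become 0."""
    shrunk = []
    for v in values:
        magnitude = abs(v) - tau
        if magnitude > 0:
            shrunk.append(magnitude if v > 0 else -magnitude)
        else:
            shrunk.append(0.0)
    return shrunk


def channel_threshold(values, alpha):
    """Threshold tau = alpha * mean(|v|), with alpha strictly between 0 and 1.

    alpha stands in for the output of a learned sub-network (assumption).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not values:
        raise ValueError("values must not be empty")
    return alpha * sum(abs(v) for v in values) / len(values)


features = [3.0, -0.5, 0.25, -2.0]
print(soft_threshold(features, 1.0))  # [2.0, 0.0, 0.0, -1.0]
tau = channel_threshold(features, 0.5)
print(soft_threshold(features, tau))
```

Because the threshold scales with the mean magnitude of the feature map, noisier inputs are shrunk more aggressively, which is the usual motivation for using this kind of network on signals degraded by post-processing.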