• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学) ›› 2025, Vol. 47 ›› Issue (02): 265-275.

• Computer Network and Information Security •

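  • Supported by:
    National Natural Science Foundation of China (62462018); Open Fund Project of the Guangxi Key Laboratory of Digital Infrastructure (GXDIOP2024017); Innovation Project of Guangxi Graduate Education (JGY2024155)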

Privacy-preserving gene testing based on deep neural network

HUANG Ying1,2,3, TANG Min1,2,3

  (1. School of Mathematics & Computing Science, Guilin University of Electronic Technology, Guilin 541004;
    2. Center for Applied Mathematics of Guangxi, Guilin University of Electronic Technology, Guilin 541004;
    3. Guangxi Colleges and Universities Key Laboratory of Data Analysis and Computation, Guilin 541004, China)
  • Received: 2023-09-12 Revised: 2024-01-10 Accepted: 2025-02-25 Online: 2025-02-25 Published: 2025-02-24

Abstract: Deep neural networks (DNNs) are powerful and widely used for gene testing in biomedical fields. Building a reliable DNN model requires a large number of valid medical samples, but in practice biological data are highly private and usually stored in a decentralized manner. Existing solutions struggle to achieve both data security and high model accuracy on such distributed, large-scale, complex learning tasks. To address this problem, a privacy-preserving scheme based on DNNs is proposed, which combines data from multiple parties to quickly construct an accurate gene testing model. Firstly, a mask matrix is combined with functional encryption for inner product to eliminate the approximate substitution strategies required by schemes such as fully homomorphic encryption and secret sharing, so that privacy-preserving training achieves results consistent with centralized plaintext training. Secondly, a non-interactive DNN training mode is constructed to resist inference attacks caused by the leakage of global model parameters, ensuring data security. Experimental results on real medical datasets demonstrate the correctness, effectiveness, and high accuracy of the proposed scheme.


Key words: deep neural network, gene testing, privacy-preserving, mask matrix, functional encryption for inner product