• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学), 2025, Vol. 47, Issue 8: 1331-1342.

• High Performance Computing •

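Design of modular arithmetic acceleration for privacy computing

LIU Hongwei¹ (刘宏伟), ZHI Liang² (支梁), QIN Mengyuan¹ (秦梦远), CHEN Mingzhi¹ (陈铭志), DONG Wenkuo¹ (董文阔), HAO Qinfen¹ (郝沁汾)

  (1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100095, China; 2. Wuxi Institute of Interconnect Technology, Wuxi, Jiangsu 214100, China)
  • Received: 2024-10-08 Revised: 2024-11-01 Online: 2025-08-25 Published: 2025-08-26
  • Supported by:
    National Key R&D Program of China (2022YFB4401501); Key R&D Program of Jiangsu Province (BE20230064)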



Abstract: Privacy computing is an important means of ensuring data security in data centers. With the advancement of quantum computing, lattice-based post-quantum algorithms and fully homomorphic encryption algorithms have steadily gained prominence. In these algorithms, modular arithmetic is one of the widely used nonlinear operations, employed primarily to prevent overflow during computation. This paper addresses the modular arithmetic used extensively in privacy computing and cryptographic applications, and proposes a hardware-software co-designed acceleration framework implemented on an FPGA platform with a PCIe interface. The framework effectively hides communication latency and supports modular operations of up to 2048 bits, including modular multiplication and modular exponentiation, to serve data center scenarios with privacy computing requirements. Whereas existing research focuses only on the modular operations themselves, the co-designed framework is a complete acceleration solution: it contains not only the computational cores but also the data and hardware-software interfaces, and it reduces the impact of communication latency. Finally, a tailored acceleration application is implemented for a specific telecom operator scenario, and experiments verify the performance advantages of the design.

Key words: privacy computing, modular multiplication, modular exponentiation, hardware-software co-design, RSA
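
For readers unfamiliar with the two operations named in the abstract, the following is a minimal software sketch of modular multiplication and modular exponentiation using the textbook square-and-multiply method. It is not the paper's FPGA design, which the abstract does not detail; the function names `mod_mul` and `mod_exp` are hypothetical, and Python's arbitrary-precision integers stand in for the 2048-bit datapath.

```python
def mod_mul(a: int, b: int, n: int) -> int:
    """Modular multiplication: (a * b) mod n."""
    return (a * b) % n


def mod_exp(base: int, exponent: int, n: int) -> int:
    """Modular exponentiation base**exponent mod n.

    Left-to-right square-and-multiply: one modular squaring per
    exponent bit, plus one modular multiplication per set bit.
    Illustrative only; not a constant-time implementation.
    """
    if n <= 0:
        raise ValueError("modulus must be positive")
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % n            # yields 0 when n == 1
    base %= n
    for bit in bin(exponent)[2:]:      # most significant bit first
        result = mod_mul(result, result, n)    # square
        if bit == "1":
            result = mod_mul(result, base, n)  # multiply
    return result
```

The sketch makes the cost structure visible: a 2048-bit exponent requires about 2048 modular squarings and up to as many modular multiplications, which is why RSA-style workloads are dominated by modular multiplication. The result can be checked against Python's built-in three-argument `pow(base, exponent, n)`.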