• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (10): 1726-1736.

• High Performance Computing •

Research on state storage and strategic mapping techniques for FPGA-accelerated simulation

RONG Peitao, ZENG Kun, LI Kai, ZHANG Tian, WANG Yongwen

  1. (1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073;
    2. Key Laboratory of Advanced Microprocessor Chips and Systems, Changsha 410073, China)
  • Received: 2024-11-06 Revised: 2024-12-06 Online: 2025-10-25 Published: 2025-10-28
  • Supported by:
    High-Level Science and Technology Innovation Talent Project (22-TDRCJH-02-006)

Abstract: As processor designs grow in scale, cycle-accurate simulation faces challenges: traditional software simulators are usually slow, while hardware emulation platforms are often expensive, which puts them out of reach for most academic and industrial research teams. Using FPGAs to accelerate cycle-accurate simulation is regarded as a highly promising approach. FireSim, a recent open-source platform for FPGA-accelerated simulation, not only integrates earlier research results in this field but also overcomes a series of key obstacles. However, it still underutilizes FPGA resources; in particular, the mapped model occupies too much BRAM, which limits further growth in simulation scale. To address this problem, new resource management and optimization techniques for FPGA-accelerated simulation platforms are proposed, comprising an automated flow for identifying BRAM usage and two mapping strategies: migrating BRAM-resident components to URAM to relieve BRAM pressure, and balancing resource usage through distributed reconstruction and resource-sensitive mapping. These techniques increase the simulation scale on a single FPGA from 16 cores to 32 cores, and in theory to 64 cores, with almost no loss in the simulation speed of the original model. They effectively improve the scalability of the existing platform and are significant for promoting FPGA acceleration in large-scale full-system simulation.

Key words: cycle-accurate simulation, full-system simulation, simulator, FPGA acceleration, FPGA resource optimization