• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2025, Vol. 47, Issue (06): 976-987.

• High Performance Computing •

An automated physical compiler for multi-port register files

MING Tianbo (明天波)¹, LIU Biwei (刘必慰)¹,², HU Chunmei (胡春媚)¹,², WU Zhenyu (吴振宇)¹,², SONG Ruiqiang (宋睿强)¹,², SONG Fangfang (宋芳芳)¹

  1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
  2. Key Laboratory of Advanced Microprocessor Chips and Systems, Changsha 410073, China

  • Received: 2024-06-01 Revised: 2024-08-15 Online: 2025-06-25 Published: 2025-06-26

Abstract: When designing application-specific microprocessors, designers must iterate over different architectural parameters to find the configuration that best supports the target application. Multi-port register files are core components, yet they are still built by full-custom design or by traditional compiler-assisted design. Both methods struggle to balance high performance with design flexibility, which makes co-optimization with the architecture difficult. This paper proposes a physical compiler for multi-port register files that automatically and quickly generates register file circuits and layouts for a specified capacity and port count. It also proposes an optimized port structure to improve the parallel access performance of the register file, and a performance-driven heuristic algorithm to optimize placement and routing. Experimental results show that the proposed compiler generates a register file in roughly tens of hours, which meets the needs of co-optimization. Compared with full-custom designs, the proposed compiler achieves a 31.5% speed improvement and a 28.8% power reduction; compared with traditional compiler-assisted designs, it achieves a 20.7% speed improvement and a 33.9% power reduction.

Key words: multi-port register file, physical compiler, port optimization technique, heuristic algorithm, computer architecture