• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (02): 200-209.

• High Performance Computing •

Optimization of isoline and isosurface extraction algorithms based on a domestic heterogeneous many-core processor

张元胤,肖敏广,刘志勇,翁灵玲,陈志广,卢宇彤   

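  1. (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 511400, Guangdong, China)
  • Received: 2023-09-25 Revised: 2023-11-20 Accepted: 2025-02-25 Online: 2025-02-25 Published: 2025-02-21
  • Supported by:
    National Key Research and Development Program of China (2021YFB0300103); National Natural Science Foundation of China (62272499); Guangdong Special Support Program (2021TQ06X160)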

Optimization of isoline and isosurface extraction algorithm based on domestic heterogeneous many-core processors

ZHANG Yuanyin, XIAO Minguang, LIU Zhiyong, WENG Lingling, CHEN Zhiguang, LU Yutong

  1. (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 511400, China)
  • Received: 2023-09-25 Revised: 2023-11-20 Accepted: 2025-02-25 Online: 2025-02-25 Published: 2025-02-21

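Abstract: MT-3000 is a domestic heterogeneous many-core processor designed by the National University of Defense Technology for next-generation supercomputers. It offers strong computing power and can effectively accelerate visualization data processing. Isoline and isosurface extraction are the most commonly used geometric visualization methods for scalar field data, but existing extraction algorithms usually target only general-purpose CPUs or GPUs. On the MT-3000, problems such as the limited on-chip cache space and the restricted memory access bandwidth of the slave cores (从核) lead to low computational efficiency. In addition, because of the processor's special programming model, existing software and methods cannot run on the MT-3000 directly. To fully exploit the computing capability of domestic supercomputing systems in the visualization field, new parallel algorithms are proposed, based on the MT-3000 microarchitecture, for the grid sequence (grid scan) algorithm for isolines and the marching cubes algorithm for isosurfaces. The new methods use vector instructions and pipelining that overlaps memory access with computation, which makes them better suited to the heterogeneous many-core architecture and accelerates execution. Experimental results show that both algorithms achieve a speedup of more than 4 and that execution time decreases almost linearly as the number of slave cores increases, which demonstrates that the proposed algorithms scale well.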

Keywords: data filtering, isoline, isosurface, parallel computing, heterogeneous, many-core, domestic supercomputing system

Abstract: The MT-3000 is a domestic heterogeneous many-core processor designed by the National University of Defense Technology for the next generation of supercomputers. It has strong computing power and can effectively accelerate data processing for visualization. Isoline and isosurface extraction are the most common geometric visualization methods for scalar field data. However, existing extraction algorithms typically target general-purpose CPUs or GPUs. On the MT-3000, computing efficiency is low because of the limited on-chip cache space and the restricted memory access bandwidth of the slave cores, among other factors. In addition, because of the processor's distinctive programming model, existing software and methods cannot run on the MT-3000 directly. To fully exploit domestic supercomputing systems in the field of visualization, this paper proposes new parallel versions of the grid scan algorithm for isoline extraction and the marching cubes algorithm for isosurface extraction, based on the hardware characteristics of the MT-3000. Techniques such as vector instructions and pipelining that overlaps memory access with computation are used to better fit the heterogeneous many-core architecture and thus improve performance. The experimental results show that both algorithms achieve a speedup of more than 4 and that their execution time decreases nearly linearly as the number of slave cores increases, which demonstrates the scalability of the algorithms.

Key words: data filtering, isoline, isosurface, parallel computing, heterogeneous, many-core, domestic supercomputing system
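
For readers unfamiliar with the grid scan approach to isoline extraction named in the abstract, the following is a minimal serial sketch of the textbook per-cell procedure (often called marching squares). It is not the MT-3000 implementation described in the paper. The function name `extract_isolines`, the lookup table `CASES`, and the fixed pairing chosen for the two ambiguous saddle cases are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

# Corner k of a cell as (dx, dy) offsets from its lower-left grid node,
# listed counter-clockwise.
CORNERS = ((0, 0), (1, 0), (1, 1), (0, 1))
# Edge e joins corner EDGES[e][0] to corner EDGES[e][1].
EDGES = ((0, 1), (1, 2), (2, 3), (3, 0))
# Case index (bit k set when corner k >= isovalue) -> pairs of crossed edges.
# Cases 5 and 10 are ambiguous saddles; one fixed pairing is assumed here.
CASES = {
    0: (), 15: (),
    1: ((3, 0),), 14: ((3, 0),),
    2: ((0, 1),), 13: ((0, 1),),
    4: ((1, 2),), 11: ((1, 2),),
    8: ((2, 3),), 7: ((2, 3),),
    3: ((3, 1),), 12: ((3, 1),),
    6: ((0, 2),), 9: ((0, 2),),
    5: ((3, 0), (1, 2)),
    10: ((0, 1), (2, 3)),
}


def _crossing(edge: int, values: Sequence[float], col: int, row: int,
              iso: float) -> Point:
    """Linearly interpolate where the isovalue crosses one cell edge."""
    a, b = EDGES[edge]
    # The edge is crossed, so values[a] != values[b] and the division is safe.
    t = (iso - values[a]) / (values[b] - values[a])
    (ax, ay), (bx, by) = CORNERS[a], CORNERS[b]
    return (col + ax + t * (bx - ax), row + ay + t * (by - ay))


def extract_isolines(grid: Sequence[Sequence[float]], iso: float) -> List[Segment]:
    """Scan every cell of a regular 2D grid and emit isoline segments.

    grid[row][col] holds the scalar value at x = col, y = row.
    """
    segments: List[Segment] = []
    for row in range(len(grid) - 1):
        for col in range(len(grid[0]) - 1):
            values = [grid[row + dy][col + dx] for dx, dy in CORNERS]
            case = sum(1 << k for k, v in enumerate(values) if v >= iso)
            for e0, e1 in CASES[case]:
                segments.append((_crossing(e0, values, col, row, iso),
                                 _crossing(e1, values, col, row, iso)))
    return segments


if __name__ == "__main__":
    # One cell whose upper-right corner lies above the isovalue 0.5.
    print(extract_isolines([[0, 0], [0, 1]], 0.5))
    # prints [((1.0, 0.5), (0.5, 1.0))]
```

Each cell is processed using only its own four corner values, so the cells can be handled independently. That independence is what allows this kind of scan to be split across many cores or vector lanes.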