• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (09): 1521-1531.

• High Performance Computing •

ParM: A heterogeneous programming model for domestic processors

ZHU Wen-long (朱文龙), JIANG Jia-zhi (江嘉治), HUANG Dan (黄聃), XIAO Nong (肖侬)

  1. (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China)
  • Received: 2023-02-14 Revised: 2023-04-17 Accepted: 2023-09-25 Online: 2023-09-25 Published: 2023-09-12
  • Supported by:

    National Key Research and Development Program of China (2021YFB0301300); National Natural Science Foundation of China (U1811461); Guangdong Basic and Applied Basic Research Foundation (2019B030302002); Guangdong Introducing Innovative and Entrepreneurial Teams Program (2016ZT06D211); Key-Area Research and Development Program of Guangdong Province (2021B0101190003); Zhejiang Lab project (2021KC0AB04)

Abstract: With the growing demand for computing power, a variety of domestic heterogeneous computing devices have emerged. Each device has its own dedicated programming model, and developers must write code against that model according to the device's architectural characteristics, so the resulting code is not portable across devices. In recent years, unified heterogeneous parallel programming models that support multiple computing devices have appeared abroad, but research on and implementations of heterogeneous programming models for domestic devices remain scarce. To address this problem, a performance-portable heterogeneous programming model called ParM is developed. ParM is provided as a C++ library and hides many low-level implementation details, reducing the difficulty of parallel programming. The backends currently supported are x86 CPUs, NVIDIA GPUs, Huawei Kunpeng processors, and Huawei Ascend AI processors, and performance optimizations have been carried out for each of them. Performance tests on these devices show that ParM achieves more than 90% of the performance of native code.

Key words: performance portability, parallel programming model, high performance computing, heterogeneous computing, domestic processor
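
The abstract does not show ParM's interface, so the following is only a minimal sketch of the general idea it describes: a C++ library in which user code is written once against a loop abstraction and a backend is selected separately. Every name here (`SerialBackend`, `ThreadsBackend`, `parallel_for`, `axpy`) is hypothetical and is not taken from ParM; the two host-only backends merely stand in for the x86, NVIDIA, Kunpeng, and Ascend backends named above.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical backend tags (assumption: not ParM's actual types).
struct SerialBackend {};
struct ThreadsBackend { unsigned num_threads; };

// Serial reference implementation of the loop abstraction.
template <class Body>
void parallel_for(SerialBackend, std::size_t n, Body body) {
    for (std::size_t i = 0; i < n; ++i) body(i);
}

// Multithreaded host implementation: splits [0, n) into contiguous chunks,
// one chunk per worker thread, then waits for all workers to finish.
template <class Body>
void parallel_for(ThreadsBackend backend, std::size_t n, Body body) {
    const std::size_t workers = std::max<std::size_t>(1, backend.num_threads);
    const std::size_t chunk = (n + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = w * chunk;
        const std::size_t end = std::min(n, begin + chunk);
        if (begin >= end) break;  // no work left for the remaining workers
        pool.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();
}

// Backend-independent user code: y = a * x + y.
// Assumes x and y have the same length.
template <class Backend>
void axpy(Backend backend, double a,
          const std::vector<double>& x, std::vector<double>& y) {
    const double* xp = x.data();
    double* yp = y.data();
    parallel_for(backend, x.size(),
                 [=](std::size_t i) { yp[i] = a * xp[i] + yp[i]; });
}
```

In this sketch, calling `axpy(SerialBackend{}, 3.0, x, y)` and `axpy(ThreadsBackend{8}, 3.0, x, y)` runs the same kernel body on two different backends. Performance portability in the abstract's sense would additionally require per-device tuning inside each backend, which this sketch does not attempt.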