• Official journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2021, Vol. 43, Issue 12: 2105-2114.

• High Performance Computing •

Evaluation of OpenCL computing software stack

ZHU Hao (1, 2), ZHOU Bo-yang (2), LU Xue-shan (3), DU Yi-mo (4)

  1. Defense Innovation Institute, Academy of Military Sciences, Beijing 100000, China
  2. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
  3. PLA Air Force Logistics Department, Beijing 100000, China
  4. Troop 31008, Beijing 100091, China

  • Received: 2020-11-20  Revised: 2021-03-02  Accepted: 2021-12-25  Online: 2021-12-25  Published: 2021-12-21
  • Supported by:
    National Natural Science Foundation of China (61802416, 61972408, 61902407)

Abstract: With the development of intelligent computing and big data applications, the demand for accelerators such as GPUs keeps growing. Computing software stacks, such as the CUDA and OpenCL software stacks, are the key to fully exploiting GPU hardware performance. Considering the future portability and adaptation of computing software stacks to domestic fundamental software and hardware platforms (such as the Phytium CPU and the Kylin operating system), this paper focuses on open-source OpenCL computing software stacks. It tests and analyzes the performance of OpenCL applications on different platforms, evaluates the GPU computing performance of applications on different OpenCL software stacks (such as Mesa and ROCm), and evaluates the impact of the drivers and kernels in the software stack on GPU computing. The tests cover the time overhead of every stage of OpenCL computing, including compilation, data transfer, and kernel execution. The evaluation finds that domestic platforms have a more urgent need for GPU-accelerated computing and are also better suited to it, and that ROCm is a comparatively ideal open-source OpenCL software stack, with good performance and stability, although it still has some room for optimization compared with closed-source software stacks.

Key words: OpenCL, computing software stack, GPU computing, domestic fundamental software and hardware platform