• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (07): 1149-1158.

• High Performance Computing •

A deep learning programming framework for FT-Matrix DSP+MatrixZone heterogeneous systems

KANG Yu-han¹, SHI Yang², CHEN Zhao-yun², WEN Mei²

  (1. School of Information Science and Engineering, Hunan Normal University, Changsha 410081;
   2. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
  • Received: 2023-01-01 · Revised: 2023-03-27 · Accepted: 2023-07-25 · Online: 2023-07-25 · Published: 2023-07-11

Abstract: To keep pace with the rapid iteration and high computing-power demands of deep learning models, mainstream hardware vendors increasingly favor heterogeneous systems that combine general-purpose processors with AI-specific accelerator cores. However, AI-specific accelerator cores support only certain core operators and lack general programming capability, so how to deploy deep learning tasks efficiently on such heterogeneous architectures merits further study. Targeting the domestically developed FT-Matrix DSP+MatrixZone heterogeneous platform, this paper designs and implements a deep learning programming framework called KaiSa. KaiSa parses the input parameters of a deep learning model, identifies each operator's type, and assigns the operator to the corresponding computing core. For complex operators, KaiSa automatically searches for the optimal block size using a performance model, which improves the performance of dual-core parallel computing. KaiSa also hides all low-level hardware details, giving users a friendly programming environment for efficient program development. Experimental results show that KaiSa achieves performance improvements of up to 39.0%.
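
The two mechanisms named in the abstract, operator-to-core assignment and a performance-model-driven block-size search, can be illustrated with a minimal sketch. It is not KaiSa's implementation: the operator set `ACCEL_OPS`, the function names, the throughput parameters, and the cost model (run time taken as the slower of the two cores when work is split between them) are all assumptions introduced here for illustration only.

```python
# Illustrative sketch only; none of these names come from KaiSa.

# Hypothetical set of operators the accelerator core supports.
# The abstract does not list the actual operators.
ACCEL_OPS = {"conv2d", "matmul"}


def assign_core(op_type: str) -> str:
    """Route an operator to a core by its type (assumed rule)."""
    return "MatrixZone" if op_type in ACCEL_OPS else "DSP"


def estimated_time(rows: int, block: int,
                   accel_rate: float, dsp_rate: float) -> float:
    """Toy performance model (an assumption, not the paper's model).

    `block` rows go to the accelerator and the remaining rows to the
    DSP. Both cores run in parallel, so the estimated time is the
    slower of the two. Rates are in rows per unit time.
    """
    accel_time = block / accel_rate
    dsp_time = (rows - block) / dsp_rate
    return max(accel_time, dsp_time)


def best_block(rows: int, candidates, accel_rate: float,
               dsp_rate: float) -> int:
    """Exhaustively pick the candidate block size with the lowest
    estimated time under the toy model above."""
    return min(
        candidates,
        key=lambda b: estimated_time(rows, b, accel_rate, dsp_rate),
    )


if __name__ == "__main__":
    print(assign_core("matmul"))
    print(assign_core("softmax"))
    # 100 rows, accelerator assumed 3x faster than the DSP.
    print(best_block(100, range(0, 101, 10), accel_rate=3.0, dsp_rate=1.0))
```

Under this toy model the search balances the load so that neither core sits idle for long; a real performance model would also have to account for factors such as memory capacity and data-transfer cost, which the abstract does not detail.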

Key words: deep learning; FT-Matrix; MatrixZone; heterogeneous system; performance optimization