• Official journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


3D-MMA: A matrix multiplication accelerator
architecture based on 3D integrated circuits

WANG Ji-jun, HAO Zi-yu, LI Hong-liang

  1. (Jiangnan Institute of Computing Technology, Wuxi 214083, China)
     
  • Received: 2019-07-06 Revised: 2019-09-16 Online: 2019-12-25 Published: 2019-12-25

Abstract:

With their regular dataflow and high throughput, systolic arrays are widely used to design high-performance convolution and matrix multiplication accelerators. In deep submicron processes, enlarging the processing array improves a chip's computational performance, but it also lowers the clock frequency and sharply increases power consumption. We therefore propose 3D-MMA, a double-precision floating-point matrix multiplication accelerator that uses 3D integrated circuit technology to map planar systolic arrays onto a 3D integrated circuit. First, we propose an efficient matrix multiplication scheduling algorithm for 3D-MMA. Second, we present an acceleration system based on 3D-MMA and build an analytical performance model to explore the design space quantitatively. Finally, we evaluate the implementation cost of 3D-MMA and compare it with other state-of-the-art accelerators. The experimental results show that a 3D integrated circuit with a 4-layer 16×16 systolic array reaches up to 3 TFLOPS with an efficiency of up to 99%, and that its implementation cost is lower than that of the planar solution. Under the same process, 3D-MMA achieves 1.36 times the performance of a linear array accelerator and 1.92 times that of a K40 GPU, while occupying a much smaller area. This paper explores the advantages of 3D integrated circuits for designing high-performance matrix multiplication accelerators and offers a reference for further improving the performance of future high-performance platforms.

 

Key words: 3D integrated circuits, matrix multiplication accelerator, blocking algorithm, performance model
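
To make the ideas of blocking and stacked arrays in the abstract concrete, the following is a minimal sketch and not the scheduling algorithm or performance model of the paper. It assumes each output tile matches one planar 16×16 array and that the inner dimension is split into one slice per layer, with the partial sums added together. It also assumes each processing element completes one fused multiply-add (2 FLOPs) per cycle. The names `blocked_matmul` and `peak_flops`, and the 1.5 GHz clock used in the usage note, are hypothetical and do not come from the abstract.

```python
def blocked_matmul(a, b, tile=16, layers=4):
    """Multiply a (m x k) by b (k x n), both non-empty lists of rows.

    Hypothetical mapping, not the paper's scheduler: every output tile is
    tile x tile (one planar array), and the k dimension is cut into
    `layers` contiguous slices whose partial sums are accumulated.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    slice_len = -(-k // layers)  # ceiling division: length of one k slice
    for i0 in range(0, m, tile):            # tile row of the output
        for j0 in range(0, n, tile):        # tile column of the output
            for layer in range(layers):     # one k slice per assumed layer
                k_lo = layer * slice_len
                k_hi = min(k_lo + slice_len, k)
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = 0.0
                        for p in range(k_lo, k_hi):
                            acc += a[i][p] * b[p][j]
                        c[i][j] += acc      # add this layer's partial sum
    return c


def peak_flops(layers, rows, cols, freq_hz, flops_per_pe_per_cycle=2):
    """Peak throughput, assuming every PE is busy on every cycle."""
    return layers * rows * cols * flops_per_pe_per_cycle * freq_hz
```

As a usage example, `blocked_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` gives the same product as a plain triple loop. Under the assumptions above, `peak_flops(4, 16, 16, 1.5e9)` evaluates to about 3.07e12, which is the same order as the 3 TFLOPS reported in the abstract. The clock value is chosen only for illustration.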