• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

Performance optimization of stencil computation
on ARM64 multi-core microprocessor

FENG Lu-xia, LI Chun-jiang, HUANG Ya-bin

  1. (College of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2017-01-07 Revised: 2017-03-20 Online: 2017-05-25 Published: 2017-05-25

Abstract:

Stencil computations are an important class of computational kernels, widely used in fields ranging from image and video processing to large-scale scientific and engineering simulation. However, evaluations of stencil computation on high-performance ARM64 processors are rare. Based on the features of the AMCC XGENE2 and Phytium FT1500A, we design an optimization method built on two-dimensional binding: thread-to-CPU binding and thread-to-data-block binding together reduce the parallel overhead of thread scheduling and increase the cache hit rate. Experimental results show that this method improves the performance of stencil computation on the ARM64 architecture, and that our kernel scales well on both ARM64 multi-core microprocessor platforms.
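To make the two kinds of binding concrete, the following is a minimal sketch in C, not the authors' implementation. It assumes a 5-point Jacobi stencil on a square grid and substitutes standard OpenMP clauses for whatever mechanism the paper uses: `schedule(static)` gives each thread the same contiguous block of rows on every sweep (thread-to-data-block binding), and `proc_bind(close)` asks the runtime to keep threads on fixed places (thread-to-CPU binding). The function names `jacobi_sweep` and `jacobi_run` are hypothetical.

```c
#include <assert.h>

/* One sweep of a 5-point Jacobi stencil over the interior of an n x n grid
 * stored in row-major order. Boundary cells of `out` are left untouched, so
 * the caller must initialize the boundary in both buffers. */
void jacobi_sweep(const double *in, double *out, int n)
{
    assert(n >= 3);
    /* Assumption: OpenMP stands in for the paper's binding mechanism.
     * schedule(static) assigns each thread one fixed contiguous row block;
     * proc_bind(close) requests that threads stay pinned to nearby places. */
    #pragma omp parallel for schedule(static) proc_bind(close)
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            out[i * n + j] = 0.25 * (in[(i - 1) * n + j] + in[(i + 1) * n + j] +
                                     in[i * n + j - 1] + in[i * n + j + 1]);
        }
    }
}

/* Run `steps` sweeps, swapping the two buffers after each one.
 * Returns a pointer to the buffer that holds the latest result. */
double *jacobi_run(double *a, double *b, int n, int steps)
{
    for (int s = 0; s < steps; s++) {
        jacobi_sweep(a, b, n);
        double *tmp = a;
        a = b;
        b = tmp;
    }
    return a;
}
```

When compiled with OpenMP enabled (for example `-fopenmp` with GCC), setting the environment variable `OMP_PLACES=cores` defines the places that threads are bound to; without OpenMP the pragma is ignored and the code runs serially. Because the row-to-thread mapping is identical on every sweep, each core tends to revisit rows it touched on the previous sweep, which is the cache-reuse effect the abstract attributes to binding.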
 

Key words: stencil computation; ARM64; AMCC XGENE2; FT1500A; parallelism; thread binding