J4 ›› 2011, Vol. 33 ›› Issue (3): 41-45. doi: 10.3969/j.issn.1007130X.2011.
• Articles •
FANG Xudong, TANG Yuhua, WANG Guibin, TANG Tao
Abstract:
With the rapid development of GPUs, using them to accelerate scientific computing applications is becoming a clear trend. In this paper, we port two typical subroutines, Rprj3 and Interp, from Mgrid, a SPEC2000 benchmark rich in stencil operations, to run on an AMD GPU using Brook+. Using the thread granularity tuning mechanism provided by Brook+, we implement several versions of the ported program and analyze their performance. We also summarize how thread granularity tuning can be used to optimize the porting of stencil programs. Our experimental results show that, at the largest problem size, Rprj3 achieves a speedup of 5.37 over its CPU version, while Interp achieves a speedup of 12.8 over its CPU version.
Key words: GPU; optimization; stencil
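The thread granularity idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Brook+ code and not the actual Mgrid source: it is a simplified 1D analogue of a restriction stencil, written in plain Python, in which each simulated "thread" computes `granularity` consecutive output points. The function names, the 3-point weights (0.25, 0.5, 0.25), and the boundary handling are assumptions chosen for illustration only.

```python
def restrict_point(fine, j):
    """Assumed 3-point full-weighting stencil centered on fine[2*j]."""
    i = 2 * j
    return 0.25 * fine[i - 1] + 0.5 * fine[i] + 0.25 * fine[i + 1]


def restrict(fine, granularity=1):
    """Restrict a fine 1D grid (odd length assumed) to a coarse grid.

    Each simulated thread handles `granularity` consecutive coarse points.
    On a GPU the outer loop iterations would run in parallel; here they
    run sequentially, so only the work partitioning is illustrated.
    """
    n_coarse = (len(fine) - 1) // 2 + 1
    coarse = [0.0] * n_coarse
    coarse[0], coarse[-1] = fine[0], fine[-1]  # boundaries copied (assumption)
    interior = n_coarse - 2
    n_threads = (interior + granularity - 1) // granularity  # ceiling division
    for t in range(n_threads):
        start = 1 + t * granularity
        stop = min(start + granularity, n_coarse - 1)
        for j in range(start, stop):
            coarse[j] = restrict_point(fine, j)
    return coarse


fine = [float(i * i) for i in range(17)]
# The result does not depend on how the work is split across threads.
assert restrict(fine, 1) == restrict(fine, 4)
```

A coarser granularity means fewer threads, each doing more work and potentially reusing neighboring inputs. Which setting performs best on a real GPU depends on the hardware and the kernel, and is the kind of question the paper's experiments address.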
FANG Xudong, TANG Yuhua, WANG Guibin, TANG Tao. Implementation and Optimization of Stencil Applications on GPUs[J]. J4, 2011, 33(3): 41-45.
URL: http://joces.nudt.edu.cn/EN/10.3969/j.issn.1007130X.2011.
http://joces.nudt.edu.cn/EN/Y2011/V33/I3/41