• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (11): 160-167.

• Articles •

Implementation and optimization of three-dimensional UPML-FDTD algorithm on GPU cluster

XU Lei, XU Ying, JIANG Rong-lin, ZHANG Dan-dan

  1. (Shanghai Supercomputer Center, Shanghai 201203, China)
  • Received: 2013-08-05 Revised: 2013-10-10 Online: 2013-11-25 Published: 2013-11-25
  • Supported by:

    National 863 Program of China (2012AA01A308)

Abstract:

Coprocessors with powerful floating-point performance have developed rapidly in recent years and have drawn considerable attention in the High Performance Computing (HPC) community. Using coprocessors such as GPUs to accelerate the Finite Difference Time Domain (FDTD) method has become a hot topic in computational electromagnetics. This paper presents the implementation and optimization of a three-dimensional UPML-FDTD algorithm on GPU clusters. With an electric dipole as the excitation source, the simulated fields are validated against the analytical solution, and the results show that the algorithm achieves high numerical accuracy. The performance of the parallel FDTD algorithm is tested on NVIDIA Tesla M2070 and K20m GPU clusters: results before and after optimization are compared, GPU performance is compared with CPU performance, and scalability is tested on up to 80 NVIDIA Tesla K20m GPUs. The results show that the optimized FDTD algorithm performs substantially better than the unoptimized version and achieves good parallel efficiency on GPU clusters.

Key words: FDTD; UPML; GPU cluster; MPI