• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (9): 15-19.

• Articles •

Research on a parallel algorithm for PAML on multicore platforms

YANG Ju, WU Zhuofeng, WANG Gang, LIU Xiaoguang

  1. (College of Computer and Control Engineering, Nankai University, Tianjin 300071)
  • Received: 2013-03-13 Revised: 2013-07-04 Online: 2013-09-25 Published: 2013-09-25

Parallel algorithm for PAML on multicore platform        

YANG Ju,WU Zhuofeng,WANG Gang,LIU Xiaoguang   

  1. (College of Computer and Control Engineering,Nankai University,Tianjin 300071,China)
  • Received:2013-03-13 Revised:2013-07-04 Online:2013-09-25 Published:2013-09-25

Abstract:

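PAML is a widely used software package for phylogenetic analysis by maximum likelihood. However, because its models are complex and have many parameters, PAML's computations are very time-consuming. This paper studies parallel algorithms for codeml, the most important program in PAML, and identifies the program's bottlenecks through algorithm analysis and profiling. On this basis, the bottlenecks are optimized using the multicore parallelism and SIMD capabilities of modern CPUs, which improves the overall running speed of the program. Experiments on real and synthetic datasets show that the parallel algorithm effectively speeds up codeml, with a maximum speedup of 7.94×.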

Key words: parallel algorithm, PAML, multicore CPU, single instruction multiple data (SIMD)

Abstract:

PAML is a widely used package of programs for maximum likelihood phylogenetic analysis of protein and DNA sequences. However, its computations are time-consuming because its models are complex and involve many parameters. This paper focuses on codeml, the most important program in PAML. After analyzing the algorithm and identifying the program's bottlenecks by profiling, we parallelize those bottlenecks using the multicore and SIMD capabilities of modern CPUs in order to reduce the overall running time. Experimental results on both real and simulated data show that the parallel algorithm improves the computation speed of codeml, with a maximum speedup of 7.94×.

Key words: parallel algorithm;PAML;multicore CPU;SIMD
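
The abstract does not say which routine the profiling identified as the bottleneck, so the following is only a minimal sketch of how multicore and SIMD parallelism are commonly combined in maximum-likelihood phylogenetics, not the authors' implementation. It assumes the hot loop is a conditional-likelihood update at a tree node with two children, where each child contributes a transition-matrix times vector product for every site pattern. The function name `update_parent`, its parameters, and the row-major array layout are hypothetical and are not taken from PAML. OpenMP pragmas are used as one possible way to express the two levels of parallelism.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from PAML): combine the conditional likelihoods
 * of two child nodes into their parent, for every site pattern.
 *
 *   p_left, p_right      : n_states x n_states transition matrices, row-major
 *   cl_left, cl_right    : n_patterns x n_states child likelihoods, row-major
 *   cl_parent            : n_patterns x n_states output, row-major
 */
void update_parent(size_t n_patterns, size_t n_states,
                   const double *p_left, const double *p_right,
                   const double *cl_left, const double *cl_right,
                   double *cl_parent)
{
    /* Site patterns are independent of each other, so the outer loop
       can be divided among CPU cores. */
    #pragma omp parallel for schedule(static)
    for (long h = 0; h < (long)n_patterns; ++h) {
        const double *xl = cl_left + (size_t)h * n_states;
        const double *xr = cl_right + (size_t)h * n_states;
        double *out = cl_parent + (size_t)h * n_states;

        for (size_t i = 0; i < n_states; ++i) {
            const double *row_l = p_left + i * n_states;
            const double *row_r = p_right + i * n_states;
            double sum_l = 0.0;
            double sum_r = 0.0;

            /* Both dot products read contiguous memory, so the compiler
               can map this loop onto SIMD instructions. */
            #pragma omp simd reduction(+:sum_l, sum_r)
            for (size_t j = 0; j < n_states; ++j) {
                sum_l += row_l[j] * xl[j];
                sum_r += row_r[j] * xr[j];
            }

            /* Parent likelihood for state i is the product of the two
               child contributions. */
            out[i] = sum_l * sum_r;
        }
    }
}
```

With GCC or Clang the pragmas take effect when the file is compiled with `-fopenmp`; without that flag they are ignored and the function runs serially with the same results. Parallelizing over site patterns rather than over states keeps each thread writing to its own output rows, which avoids synchronization inside the loop.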