• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (11): 27-33.

• Articles •

Clusteringbased largescale haplotype phasing algorithm 

PAN Weihua,CHEN Bo,XU Yun   

  1. (1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027;
    2. Key Laboratory of High Performance Computing, Hefei 230027, China)
  • Received: 2013-08-12 Revised: 2013-10-11 Online: 2013-11-25 Published: 2013-11-25

Abstract:

Largescale haplotype phasing is an important fundamental problem in genetic analysis. To overcome the weakness of existing algorithms, we introduce the concept of clustering into original WinHAP algorithm and propose the Clutering based WinHAP algorithm. This algorithm improves original WinHAP in computing speed and memory without decreasing the precision, and its memory has nothing to do with the number of sequences. Thus, it is suited to very large datasets. The algorithm is parallelized under SIMD shared memory model and greedy task designing strategy is devised. The experiment reveals a nearlinear speedup with respect to the sequential algorithm.

Key words: haplotype phasing; clustering; large-scale computing; parallel computing; bioinformatics