• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A novel sequence alignment method for third-generation sequencing based on low frequency seeds

SONG Si-yi (1,2), CHEN Hao-yu (1,2), XU Yun (1,2)

  1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China
  2. Key Laboratory of High Performance Computing of Anhui Province, Hefei 230027, China
     
  • Received: 2018-11-27 Revised: 2019-02-27 Online: 2019-09-25 Published: 2019-09-25

Abstract:

With the development of sequencing technology, third-generation sequencing has been widely used in genetic research. It generates longer sequences but has a higher error rate, which makes it difficult to align sequences to the reference genome quickly and accurately. Existing methods use seeds, which are subsequences selected from the test sequences, to speed up the alignment process. However, they do not fully consider seed frequency, so the stage of finding candidate regions consumes a large amount of time. We therefore propose a sequence alignment method for third-generation sequencing based on low-frequency seeds. Its key idea is a modified seed-voting strategy that adopts low-frequency seeds for voting, reducing the time spent counting votes. Moreover, the method re-filters the candidate regions based on the position and number of votes, further increasing alignment speed. Experimental results show that the method is about 3 times faster than existing methods while maintaining sensitivity and accuracy.
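
As a rough illustration of seed voting restricted to low-frequency seeds, the following Python sketch may help. It is not the authors' implementation: the function names, the fixed-length k-mer seeds, the parameters `max_freq`, `bin_size` and `min_votes`, and the simple vote threshold standing in for the re-filtering step are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map each k-mer to the list of positions where it occurs in the reference."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def vote_candidates(read, index, k, max_freq, bin_size, min_votes):
    """Return (region_start, votes) pairs, voting only with low-frequency seeds.

    A seed is a k-mer of the read. Seeds that occur more than max_freq times
    in the reference are skipped (assumed definition of "low frequency").
    """
    votes = Counter()
    for j in range(len(read) - k + 1):
        hits = index.get(read[j:j + k], ())
        if not hits or len(hits) > max_freq:
            continue  # absent seed, or high-frequency (repetitive) seed
        for pos in hits:
            # A hit at reference position pos suggests the read starts near pos - j.
            votes[(pos - j) // bin_size] += 1
    # Assumed filter: keep bins with enough votes, strongest first.
    return sorted(
        ((b * bin_size, v) for b, v in votes.items() if v >= min_votes),
        key=lambda item: -item[1],
    )


if __name__ == "__main__":
    reference = "A" * 10 + "GATTACAGCT" + "A" * 10
    index = build_index(reference, 4)
    print(vote_candidates("GATTACAGCT", index, 4, 2, 5, 3))
```

In this toy reference the seed `AAAA` occurs 14 times, so it never votes, and the cost of counting votes depends only on the rarer seeds. A real aligner would also need to handle reverse-complement strands and sequencing errors, which this sketch ignores.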

Key words: third-generation sequencing, single molecule real-time sequencing, sequence alignment, seed-and-extend method