• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (04): 707-712.

• Artificial Intelligence and Data Mining •

Comparison and analysis of the ED algorithm and the SNP-index algorithm in calculating SNP sites: taking Arabidopsis thaliana as an example

GAN Qiu-yun   

  1. (School of Applied Science and Engineering, Fuzhou Institute of Technology, Fuzhou 350014, China)
  • Received: 2020-05-15 Revised: 2020-12-10 Accepted: 2022-04-25 Online: 2022-04-25 Published: 2022-04-20

Abstract: A single nucleotide polymorphism (SNP) is the most common form of heritable genetic variation and arises from a difference in a single nucleotide base in a DNA sequence. The ED algorithm and the SNP-index algorithm are two commonly used methods for calculating SNP sites. Whole-genome sequencing data from the F2 generation of Arabidopsis thaliana were obtained by high-throughput sequencing, then filtered, screened, and aligned on a Linux platform. The number of SNP sites and the proportion of SNP genotypes detected by each algorithm were then compared. The experimental results show that the ED algorithm yields more SNP sites than the SNP-index algorithm, with a wider distribution and a higher relative distribution density, but that the number of SNP sites and the proportions of SNP genotypes obtained by the two algorithms are similar.
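
The abstract names the two statistics without defining them. As a rough sketch only, assuming the definitions commonly used in bulked segregant analysis (SNP-index as the fraction of reads in a pool that carry the non-reference allele, and ED as the Euclidean distance between the base-frequency vectors of two pools), the per-site calculations might look like the following. The function names and read counts are hypothetical illustrations and are not taken from the paper.

```python
import math

BASES = ("A", "C", "G", "T")


def snp_index(alt_depth, total_depth):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    return alt_depth / total_depth


def delta_snp_index(index_pool_1, index_pool_2):
    """Difference between the SNP-index values of two pools at one site."""
    return index_pool_1 - index_pool_2


def euclidean_distance(counts_pool_1, counts_pool_2):
    """Euclidean distance between the base-frequency vectors of two pools.

    Each argument maps a base ("A", "C", "G", "T") to its read count at
    the site; missing bases are treated as zero reads.
    """
    total_1 = sum(counts_pool_1.get(b, 0) for b in BASES)
    total_2 = sum(counts_pool_2.get(b, 0) for b in BASES)
    if total_1 <= 0 or total_2 <= 0:
        raise ValueError("each pool needs at least one read at the site")
    squared = 0.0
    for b in BASES:
        freq_1 = counts_pool_1.get(b, 0) / total_1
        freq_2 = counts_pool_2.get(b, 0) / total_2
        squared += (freq_1 - freq_2) ** 2
    return math.sqrt(squared)


# Hypothetical read counts at one site; reference base A, alternate base G.
pool_1 = {"A": 18, "G": 2}
pool_2 = {"A": 10, "G": 10}

delta = delta_snp_index(snp_index(2, 20), snp_index(10, 20))
ed = euclidean_distance(pool_1, pool_2)
```

Under these assumed definitions, a strictly biallelic site gives an ED equal to the absolute delta SNP-index multiplied by the square root of 2, so the two statistics differ mainly in how multi-allelic sites and signs are handled, and in whatever filtering thresholds are applied afterwards.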

Key words: single nucleotide polymorphism (SNP), bioinformatics, ED algorithm, SNP-index algorithm