• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

A parallel dynamic bit vector based frequent
closed sequence pattern mining algorithm

CHEN Qian, LIU Yun, GAO Yuying

  1. (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

  • Received: 2017-11-22 Revised: 2018-01-24 Online: 2018-10-25 Published: 2018-10-25

Abstract:

Mining long sequence databases is costly in both time and space, so mining models that are more efficient and compact yet still extract complete information are an active research topic. We propose a parallel dynamic bit vector based frequent closed sequence pattern mining algorithm (PDBV-FCSP), which combines a multi-core processor architecture with the dynamic bit vector (DBV) data structure to speed up processing of the sequence database. The search space is partitioned, and the closure check on preprocessed sequences is performed as early as possible. This reduces the storage space and execution time required to mine frequent closed sequence patterns, and avoids the communication overhead, synchronization, and data replication problems of existing parallel mining algorithms. A dynamic load balancing mechanism that redistributes jobs balances the workload among processors, thereby minimizing idle CPU time. Simulation results show that, compared with the DBVVDF algorithm, the PDBV-FCSP algorithm performs better in terms of running time, memory usage, and scalability, and that its performance improves further as the number of cores increases.
 

Keywords: data mining, closed sequence pattern, dynamic bit vector, multi-core processor, PDBV-FCSP algorithm
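
The abstract names the dynamic bit vector (DBV) but does not define it. As a rough illustration only, the sketch below assumes the layout commonly used for DBVs: the bit vector of sequence ids containing a pattern is stored without its leading and trailing zero bytes, together with the index of the first non-zero byte. The class name `DBV` and its methods `from_ids`, `support`, and `ids` are hypothetical and are not taken from the paper; the paper's actual structure and its parallel scheduling may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DBV:
    """Hypothetical dynamic bit vector: a trimmed bitmap of sequence ids."""
    start: int    # index of the first non-zero byte in the full bit vector
    data: bytes   # bytes from the first to the last non-zero byte

    @staticmethod
    def _trim(start, buf):
        # Drop leading and trailing zero bytes, adjusting the start index.
        lo, hi = 0, len(buf)
        while lo < hi and buf[lo] == 0:
            lo += 1
        while hi > lo and buf[hi - 1] == 0:
            hi -= 1
        if lo == hi:
            return DBV(0, b"")
        return DBV(start + lo, bytes(buf[lo:hi]))

    @staticmethod
    def from_ids(ids):
        # Build a DBV from non-negative sequence ids (bit i set = id i present).
        ids = list(ids)
        if not ids:
            return DBV(0, b"")
        full = bytearray(max(ids) // 8 + 1)
        for i in ids:
            full[i // 8] |= 1 << (i % 8)
        return DBV._trim(0, full)

    def __and__(self, other):
        # Intersect only the byte range where the two vectors overlap.
        lo = max(self.start, other.start)
        hi = min(self.start + len(self.data), other.start + len(other.data))
        if lo >= hi:
            return DBV(0, b"")
        out = bytearray(
            self.data[k - self.start] & other.data[k - other.start]
            for k in range(lo, hi)
        )
        return DBV._trim(lo, out)

    def support(self):
        # Number of sequences containing the pattern = number of set bits.
        return sum(bin(b).count("1") for b in self.data)

    def ids(self):
        # Recover the sorted list of sequence ids.
        return [
            (self.start + j) * 8 + bit
            for j, b in enumerate(self.data)
            for bit in range(8)
            if (b >> bit) & 1
        ]


a = DBV.from_ids([3, 40, 41, 90])
b = DBV.from_ids([40, 90, 200])
common = a & b   # sequences that contain both patterns
```

Under this assumed layout, the intersection touches only the overlapping bytes, which is the kind of saving in time and memory that a trimmed bit vector is meant to provide on long, sparse databases.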