• Journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A top-k high utility itemset mining algorithm based on R-list

HE Dengping1,2,3,HE Zonghao1,2   

  (1. School of Telecommunications and Information Engineering,
    Chongqing University of Posts and Telecommunications, Chongqing 400065;
    2. Research Center of New Telecommunication Technology Applications,
    Chongqing University of Posts and Telecommunications, Chongqing 400065;
    3. Chongqing Information Technology Designing Company Limited, Chongqing 401121, China)
  • Received: 2018-10-18  Revised: 2018-11-30  Online: 2019-07-25  Published: 2019-07-25

Abstract:

Existing one-phase top-k high utility itemset mining algorithms raise the minimum utility threshold slowly and generate a large number of candidate itemsets, which consumes excessive memory during iteration. To address this problem, we propose RHUM, a top-k high utility itemset mining algorithm based on a reused list (R-list). The algorithm uses a new data structure, the R-list, to store and quickly access itemset information, so the database does not have to be scanned a second time during mining. It reuses memory to hold candidate itemset information and preprocesses the data in combination with an improved RSD threshold-raising strategy. During the recursive search, stricter pruning parameters are used to evaluate multiple itemsets simultaneously, which narrows the search space. Experimental results on different types of datasets show that RHUM is more memory-efficient than other one-phase algorithms and remains stable as the value of k changes.

Key words: high utility itemset, one-phase mining, R-list, data mining, top-k
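
The abstract does not give the details of the R-list or of the improved RSD strategy, so the following is not RHUM. It is only a minimal, generic sketch of the underlying idea of one-phase top-k mining: a depth-first search that keeps the k best itemsets in a min-heap, uses the smallest utility in the heap as an internally raised threshold, and prunes a branch when an upper bound (utility plus remaining utility) cannot beat that threshold. The function name `top_k_high_utility`, the toy database `db`, and the assumption that every transaction maps items to positive utilities are all hypothetical choices made for illustration.

```python
import heapq


def top_k_high_utility(transactions, k):
    """Return the k itemsets with the highest utility, best first.

    transactions: list of dicts mapping item -> utility of that item in
    the transaction (assumed positive). Generic sketch, not RHUM.
    """
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap of (utility, itemset), holds at most k entries

    def threshold():
        # The threshold is raised only once k candidates are known.
        return heap[0][0] if len(heap) == k else 0

    def offer(utility, itemset):
        if len(heap) < k:
            heapq.heappush(heap, (utility, itemset))
        elif utility > heap[0][0]:
            heapq.heapreplace(heap, (utility, itemset))

    def search(prefix, entries, start):
        # entries maps transaction id -> utility of prefix in that transaction.
        for idx in range(start, len(items)):
            item = items[idx]
            extended = {}
            remaining = 0  # utility of items ordered after `item`
            for tid, util in entries.items():
                t = transactions[tid]
                if item in t:
                    extended[tid] = util + t[item]
                    remaining += sum(u for j, u in t.items() if j > item)
            if not extended:
                continue
            total = sum(extended.values())
            itemset = prefix + (item,)
            offer(total, itemset)
            # No extension of `itemset` can exceed total + remaining.
            if total + remaining > threshold():
                search(itemset, extended, idx + 1)

    search((), {tid: 0 for tid in range(len(transactions))}, 0)
    return sorted(heap, reverse=True)


# Toy database: each dict is one transaction (item -> utility).
db = [
    {"a": 5, "b": 2, "c": 1},
    {"b": 4, "c": 3},
    {"a": 5, "c": 1, "d": 8},
    {"b": 2, "d": 7},
]
print(top_k_high_utility(db, 3))
```

In this sketch the threshold starts at 0 and rises each time a better itemset displaces the weakest heap entry. The slow early rise of such a threshold is the kind of cost the abstract says RHUM targets with its preprocessing and threshold-raising strategy.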