• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2007, Vol. 29 ›› Issue (10): 70-72.

• Papers •

An Improved FP-Tree-Based Algorithm for Mining Maximal Target Frequent Itemsets

梁碧珍[1,2] 陆月然[1,3] 秦亮曦[1]   

  • Publication date: 2007-10-01 Release date: 2010-06-02

Abstract:

A current problem with FP-tree-based maximal frequent itemset mining algorithms is that the FP-tree grows too large, so traversing it takes a long time; moreover, many of the mined frequent itemsets are of no interest to the user, and these useless frequent patterns reduce mining efficiency. This paper proposes a sorted, compressed, non-redundant target frequent pattern tree (STFP-tree) and, based on it, a maximal target frequent itemset mining algorithm, STFP-MAX. While still meeting the user's requirements, the algorithm effectively reduces the size of the FP-tree and speeds up the search, thereby improving mining efficiency.

Key words: association rule, frequent itemset, maximal target frequent itemset, FP-tree
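
To make the notion of a maximal target frequent itemset concrete, here is a minimal brute-force sketch in Python. It is not the STFP-MAX algorithm and builds no tree. It also assumes, since the abstract does not define the term, that "target" means the itemset is drawn from a user-specified set of items of interest. The function name and parameters are hypothetical.

```python
from itertools import combinations


def maximal_target_frequent_itemsets(transactions, targets, min_support):
    """Brute-force reference, NOT the paper's STFP-MAX.

    Assumption: an itemset is a "target" itemset when all of its items
    belong to the user-supplied `targets` set.
    """
    targets = set(targets)
    # Drop every non-target item; this is the size reduction that a
    # target-restricted tree would also benefit from.
    projected = [frozenset(t) & targets for t in transactions]
    items = sorted(targets)

    maximal = []
    # Enumerate candidates from largest to smallest, so any frequent
    # superset is recorded before its subsets are examined.
    for size in range(len(items), 0, -1):
        for cand in combinations(items, size):
            cand = frozenset(cand)
            # A subset of a known maximal set cannot itself be maximal.
            if any(cand <= m for m in maximal):
                continue
            support = sum(1 for t in projected if cand <= t)
            if support >= min_support:
                maximal.append(cand)
    return maximal


if __name__ == "__main__":
    db = [
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "c", "d"},
        {"b", "c"},
        {"a", "b", "c", "d"},
    ]
    for itemset in maximal_target_frequent_itemsets(db, {"a", "b", "c"}, 3):
        print(sorted(itemset))
```

The enumeration is exponential in the number of target items, so it only serves as a correctness reference on small data; avoiding this blow-up is what tree-based methods such as the one proposed here are for.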