• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2005, Vol. 27 ›› Issue (9): 62-63.

• Articles •

An Improved Algorithm for Mining the Collection of Frequent Rule-Free Sets

赵栋 卢炎生   

  • Online: 2005-09-01 Published: 2010-07-03

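Abstract (translated from the Chinese):

A basic task of data mining is to mine frequent itemsets from databases that hold massive amounts of data. This paper proposes a method that, instead of mining the complete collection of frequent itemsets, mines a condensed representation of it called the collection of frequent rule-free sets. From the frequent rule-free sets we can restore the complete collection of frequent itemsets together with their exact supports without reading the database. As will be seen, mining the frequent rule-free sets is efficient. We give an algorithm, HOPE-Ⅲ, for mining the frequent rule-free sets and compare it with the algorithm A-Close. The experimental results show that HOPE-Ⅲ outperforms A-Close in every case.

Key words: data mining; condensed representation; frequent itemset; rule-free set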

Abstract:

Given a large collection of transactions containing items, a basic and common problem is to extract the so-called frequent itemsets. The idea presented in this paper is to extract a condensed representation of the frequent itemsets, called rule-free sets, instead of extracting the whole collection of frequent itemsets. We show that this condensed representation can be used to regenerate all frequent patterns and their exact frequencies without any access to the original data. An algorithm named HOPE-Ⅲ is given to extract the frequent rule-free sets. We compare it with A-Close, an algorithm that extracts another condensed representation of frequent itemsets previously investigated in the literature, called frequent closed sets. The experiments show that in all cases HOPE-Ⅲ is much more efficient than A-Close.

Key words: data mining; condensed representation; frequent itemset; rule-free set
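
The abstract does not define rule-free sets or describe HOPE-Ⅲ, so the following is only an illustrative sketch of the general idea of a condensed representation. It assumes, without confirmation from the abstract, that a rule-free set behaves like a "free" itemset: one whose support is strictly smaller than the support of every proper subset, so that no exact rule holds among its items. The function names, the brute-force enumeration, and the toy data are all hypothetical, and the regeneration step only recovers supports of itemsets already known to be frequent.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all frequent itemsets (including the
    empty set) with their supports. Exponential; for illustration only."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result

def free_sets(frequent):
    """Assumed stand-in for rule-free sets: keep an itemset only if dropping
    any single item strictly increases the support. Checking one-item
    removals suffices because support never grows when items are added."""
    return {
        x: s
        for x, s in frequent.items()
        if all(frequent[x - {item}] > s for item in x)
    }

def derive_support(itemset, free):
    """Support of a frequent itemset, recovered without the transactions:
    the smallest support among its free subsets."""
    return min(s for y, s in free.items() if y <= itemset)

# Toy data: item "a" occurs in every transaction, so it adds no information.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("abc"),
                frozenset("ac"), frozenset("a")]
frequent = frequent_itemsets(transactions, min_support=2)
condensed = free_sets(frequent)
print(len(frequent), "frequent itemsets,", len(condensed), "free sets")
```

On this toy data the condensed collection is smaller than the full collection, and `derive_support` reproduces the support of every frequent itemset from the condensed collection alone. Deciding whether an arbitrary itemset is frequent at all would need extra border information that this sketch leaves out.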