• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2014, Vol. 36 ›› Issue (02): 354-358.

• Articles •

A vector-based probability-weighted association rule mining algorithm

赵志刚,万军,王芳   

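  1. (College of Computer and Electronics Information, Guangxi University, Nanning, Guangxi 530004, China)
  • Received: 2012-06-28 Revised: 2012-11-29 Published: 2014-02-25 Released: 2014-02-25
  • Supported by:

    National Natural Science Foundation of China (60973074)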

A probability-weighted association rule mining algorithm based on vectors

ZHAO Zhigang,WAN Jun,WANG Fang   

  1. (College of Computer and Electronics Information,Guangxi University,Nanning 530004,China)
  • Received:2012-06-28 Revised:2012-11-29 Online:2014-02-25 Published:2014-02-25

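Abstract (translated from Chinese):

Association rule mining is one of the most active branches of data mining. Many existing association rule mining algorithms must scan the database multiple times and generate large numbers of candidate itemsets, which reduces mining efficiency. To address the performance cost of repeated database scans in weighted association rule mining, this paper optimizes the approach by trading space for time and proposes a vector-based probability-weighted association rule mining algorithm. Item weights are set by computing probabilities, and transaction records are stored in a matrix-vector structure, so the database needs to be scanned only once. The algorithm also adopts different pruning strategies and different ways of computing weighted support and confidence. Simulation experiments on a sample data set show that the algorithm markedly improves mining efficiency.

Keywords: data mining, probability, vector, weighted association rules, pruning strategy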

Abstract:

Association rule mining is one of the most active branches of data mining. Many association rule mining algorithms need to scan the database many times and produce a large number of candidate itemsets. To address the cost of scanning the database repeatedly, a probability-weighted association rule mining algorithm based on vectors is proposed. It sets the weight of each item by computing a probability and stores the transaction records in a matrix-vector structure, so the database is scanned only once. In addition, it employs different pruning strategies and different ways of computing weighted support and confidence. Experimental results show that the new algorithm markedly improves mining efficiency.

Key words: data mining; probability; vector; weighted association rules; pruning strategies
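
The single-scan, vector-based idea summarized in the abstract can be illustrated with a small Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: each item's weight is taken to be its occurrence probability, weighted support is taken to be the mean item weight times the ordinary support, and all function names are hypothetical. The paper's pruning strategies are not reproduced; the sketch simply enumerates small itemsets.

```python
from itertools import combinations


def build_item_vectors(transactions):
    """Scan the transactions once and return {item: bitmask}.

    Bit i of an item's mask is 1 when transaction i contains the item,
    so each mask is one column of the boolean transaction matrix.
    """
    vectors = {}
    for i, transaction in enumerate(transactions):
        for item in set(transaction):
            vectors[item] = vectors.get(item, 0) | (1 << i)
    return vectors


def support_count(itemset, vectors):
    """Count transactions containing every item, by AND-ing the vectors."""
    items = iter(itemset)
    mask = vectors[next(items)]
    for item in items:
        mask &= vectors[item]
    return bin(mask).count("1")


def item_weights(vectors, n):
    """ASSUMPTION: an item's weight is its occurrence probability."""
    return {item: bin(mask).count("1") / n for item, mask in vectors.items()}


def weighted_support(itemset, vectors, weights, n):
    """ASSUMPTION: weighted support = mean item weight * ordinary support."""
    mean_weight = sum(weights[item] for item in itemset) / len(itemset)
    return mean_weight * support_count(itemset, vectors) / n


def frequent_itemsets(transactions, min_wsup, max_size=3):
    """Enumerate itemsets up to max_size whose weighted support >= min_wsup.

    Expects a non-empty transaction list. No pruning is applied, because
    the weighted support assumed above is not guaranteed to be
    downward closed.
    """
    n = len(transactions)
    vectors = build_item_vectors(transactions)  # the only pass over the data
    weights = item_weights(vectors, n)
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(sorted(vectors), size):
            wsup = weighted_support(itemset, vectors, weights, n)
            if wsup >= min_wsup:
                result[itemset] = wsup
    return result
```

After `build_item_vectors` runs, the original transactions are never read again: every support count is a bitwise AND over stored vectors, which is the space-for-time trade the abstract describes. A call such as `frequent_itemsets([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]], min_wsup=0.3)` returns a dictionary mapping item tuples to their weighted support under the assumed definitions.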