Research and Application of Improved Multidimensional Association Rule Mining Algorithms

J4 ›› 2012, Vol. 34 ›› Issue (9): 174-179.

ZHANG Suqi1, LIANG Zhigang2, HU Lijuan2, DONG Yongfeng2

(1. Tianjin University, Tianjin 300072;
2. School of Computer Science and Software, Hebei University of Technology, Tianjin 300130, China)

Received: 2011-07-19 Revised: 2011-10-28 Online: 2012-09-25 Published: 2012-09-25
Supported by:
Natural Science Foundation of Tianjin (10JCZDJC16000)

Abstract:

Association rule mining is one of the most important and active areas of data mining research. Building on the Apriori algorithm and borrowing the transaction compression idea of the AprioriTid algorithm, we reduce the time spent repeatedly scanning the database. We propose a transaction identifier list for each candidate itemset; the length of the list is the support count of that itemset, so computing a support count only requires reading the list length, which shortens the comparison time during counting. In addition, we introduce an address index mechanism when generating frequent itemsets: during pruning, the first element of a candidate itemset is used to locate entries quickly in the address index table, which avoids repeated scans of the transaction database and reduces both counting time and memory usage. We apply the improved algorithm to association analysis of data from a scientific research management system, extracting hidden, valuable information to support the next phase of scientific research management. Experimental performance comparisons show that the improved algorithm is more efficient.

Key words: association rule; data mining; Apriori algorithm; address index
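The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which this page does not show. It assumes that each itemset keeps a set of transaction IDs (so the support count is the set's length) and that the "address index" is a table from an itemset's first item to the frequent itemsets beginning with that item. The function name `mine_frequent_itemsets`, the `index` layout, and the sample data are all hypothetical.

```python
from itertools import combinations


def mine_frequent_itemsets(transactions, min_count):
    """Return {itemset as sorted tuple: support count} for all frequent itemsets."""
    # Single scan of the database: build a TID list for every item.
    tid_lists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tid_lists.setdefault((item,), set()).add(tid)

    # The support count of an itemset is the length of its TID list.
    level = {s: t for s, t in tid_lists.items() if len(t) >= min_count}
    frequent = {s: len(t) for s, t in level.items()}

    while level:
        # Assumed address index: first item -> frequent itemsets starting with it.
        index = {}
        for itemset in level:
            index.setdefault(itemset[0], set()).add(itemset)

        next_level = {}
        for a, b in combinations(sorted(level), 2):
            if a[:-1] != b[:-1]:
                continue  # join only itemsets that agree on all but the last item
            candidate = a + (b[-1],)
            # Prune: every subset one item shorter must be frequent. Each lookup
            # goes only to the bucket selected by the subset's first item.
            if any(sub not in index.get(sub[0], ())
                   for sub in combinations(candidate, len(a))):
                continue
            # Intersect TID lists instead of rescanning the database.
            tids = level[a] & level[b]
            if len(tids) >= min_count:
                next_level[candidate] = tids

        frequent.update((s, len(t)) for s, t in next_level.items())
        level = next_level
    return frequent


# Hypothetical sample data: five transactions over items A, B, C.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
result = mine_frequent_itemsets(transactions, min_count=3)
print(result[("A", "B")])  # 3
```

In this sketch the database is read once; every later support count comes from intersecting two TID sets and taking the length of the result.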

ZHANG Suqi, LIANG Zhigang, HU Lijuan, DONG Yongfeng. Research and Application of Improved Multidimensional Association Rule Mining Algorithms[J]. J4, 2012, 34(9): 174-179.

