Removal of redundant association rules of business data based on first-order predicate formula

Computer Engineering & Science

GUO Rui, QIAN Xiao-dong

(School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

Received: 2015-09-28 Revised: 2015-12-22 Online: 2017-03-25 Published: 2017-03-25
Funding:
National Natural Science Foundation of China (71461017)

Abstract

With the rapid growth of modern network data, existing algorithms for generating association rules produce redundant rules in numbers far exceeding the rules that are actually valuable. Redundant rules not only hinder user analysis but also greatly reduce the utilization of association rules. To address this redundancy problem, we propose a method for removing redundant association rules of business data based on first-order predicate formulas. Association rules are represented as first-order predicate formulas and transformed through equivalent formulas; the predicate formulas are then converted into an adjacency matrix by means of an algorithm and matrix equivalence, and a redundant-rule algorithm deletes the redundant rules. The raw experimental data come from UCI data sets, and the association rules are generated with Weka. Finally, the removal of redundant rules is implemented in Matlab and Java.

Key words: association rules, first-order predicate formula, incidence matrix, adjacency matrix
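
The abstract does not spell out the conversion or deletion algorithms, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each rule has a single-item antecedent and consequent, reads a rule A → B as the implication ∀x(A(x) → B(x)), stores the rules in an adjacency matrix, and treats a rule as redundant when it already follows from a chain of other rules (hypothetical syllogism). It also assumes the rule graph is acyclic and ignores support and confidence. All function names are hypothetical.

```python
def build_adjacency(rules):
    """Map items to indices and build a boolean adjacency matrix of the rules."""
    items = sorted({item for rule in rules for item in rule})
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    adj = [[False] * n for _ in range(n)]
    for antecedent, consequent in rules:
        adj[index[antecedent]][index[consequent]] = True
    return index, adj


def transitive_closure(adj):
    """Warshall's algorithm: reach[i][j] is True if a path i -> ... -> j exists."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def remove_redundant(rules):
    """Drop every rule A -> B that is implied by a chain A -> K -> ... -> B.

    Assumption (not from the paper): the rule graph has no cycles, so
    removing all such edges at once yields the transitive reduction.
    """
    index, adj = build_adjacency(rules)
    reach = transitive_closure(adj)
    n = len(adj)
    kept = []
    for antecedent, consequent in rules:
        i, j = index[antecedent], index[consequent]
        implied = any(adj[i][k] and reach[k][j] for k in range(n) if k != j)
        if not implied:
            kept.append((antecedent, consequent))
    return kept


if __name__ == "__main__":
    example = [("bread", "milk"), ("milk", "eggs"), ("bread", "eggs")]
    print(remove_redundant(example))
```

In this toy input, the rule bread → eggs is dropped because it follows from bread → milk and milk → eggs. Real rule sets with multi-item antecedents, cycles, or confidence thresholds would need a richer redundancy criterion than this sketch provides.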

GUO Rui, QIAN Xiao-dong. Removal of redundant association rules of business data based on first-order predicate formula[J]. Computer Engineering & Science.
