Research on the Target Frequent Patterns Mining Algorithms
Received date: 2010-03-17
Revised date: 2010-06-19
Online published: 2010-09-29
Funding
Guangxi Department of Education Project (200708MS); Baise University Key Project (2007KA03)
LIANG Bizhen1, LU Yueran1, GENG Lizhong2, QIN Liangxi3. Research on the Target Frequent Patterns Mining Algorithms[J]. Computer Engineering & Science, 2010, 32(10): 108-111. DOI: 10.3969/j.issn.1007130X.2010.
General-purpose frequent pattern mining algorithms usually produce very large sets of frequent patterns, many of which are non-target patterns that the user is not interested in. To exclude these non-target patterns, the user has to perform a "second mining" pass. Although TFPgrowth generates all maximal target frequent patterns, a second mining pass is still needed to obtain the target frequent patterns from them. Restricting the generation of non-target frequent patterns early in the mining process can be expected to improve efficiency. Building on TFPgrowth and SFPgrowth, this paper proposes a target frequent pattern mining algorithm named STFPgrowth. It improves efficiency by sorting the TFP tree and by choosing the subtree construction method and the target frequent pattern filtering method according to the case of the tree's root node. STFPgrowth outputs exactly the target frequent patterns that satisfy the user's requirements, so no second mining pass is needed. Experiments show that STFPgrowth is more efficient than TFPgrowth and clearly outperforms Apriori and Eclat.
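To make the contrast between "second mining" and early restriction concrete, here is a minimal Python sketch. It is not the STFPgrowth algorithm and uses no TFP tree. It assumes, purely for illustration, that a target frequent pattern is a frequent itemset containing at least one item from a user-specified target set; the abstract does not give the paper's exact definition. All function names are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) enumeration of all frequent itemsets.

    Returns a dict mapping frozenset -> absolute support count.
    """
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(level)
    k = 2
    while level:
        items = sorted(set().union(*level))
        # Keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = [
            frozenset(c)
            for c in combinations(items, k)
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                level[cand] = support
        result.update(level)
        k += 1
    return result


def mine_then_filter(transactions, min_support, targets):
    """Baseline: mine everything, then filter ("second mining")."""
    all_patterns = frequent_itemsets(transactions, min_support)
    return {p: s for p, s in all_patterns.items() if p & targets}


def mine_with_early_constraint(transactions, min_support, targets):
    """Assumed alternative: project the database on each target item first,
    so patterns without a target item are never generated."""
    result = {}
    for target in targets:
        projected = [set(t) - {target} for t in transactions if target in t]
        if len(projected) < min_support:
            continue  # the target item itself is infrequent
        result[frozenset([target])] = len(projected)
        for pattern, support in frequent_itemsets(projected, min_support).items():
            result[pattern | {target}] = support
    return result


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    print(mine_then_filter(db, 3, {"a"}))
    print(mine_with_early_constraint(db, 3, {"a"}))
```

Both functions return the same patterns under the assumed definition. The second one never enumerates itemsets that lack a target item, which is the general idea of pushing the user's constraint into the mining step instead of filtering afterwards.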