• Official journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2007, Vol. 29 ›› Issue (5): 86-89.

• Paper •

A Sequential Pattern Mining Algorithm under Partial Order Constraints

任家东 宗俊省 李志国   

  • Online:2007-05-01 Published:2010-06-02

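Summary:

Constraints are very important in sequential pattern mining applications. This paper proposes a new constraint, the partial order constraint, which allows the interval between transactions to be infinite. In an interval constraint, by contrast, the interval between transactions can only be an integer, so the partial order constraint can be regarded as an extension of the interval constraint. To handle the partial order constraint, a novel algorithm, SPM (Sequential Pattern Maintenance), is proposed. It uses an implicit segmentation technique to split off the data sequences that do not satisfy the partial order constraint, and it makes full use of previously mined information to handle the complication that support counts fall as the number of data sequences shrinks. Experiments show that the SPM algorithm mines all frequent sequential patterns that satisfy the constraint quickly and scalably.

Keywords: data mining, constrained sequential pattern mining, partial order constraint, implicit segmentation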

Abstract:

Constraints are essential for many sequential pattern mining applications. This paper presents a new constraint, called the partial order constraint, which allows the time duration between transactions to be infinite. Under the duration constraint, by contrast, the duration can only be an integer, so the partial order constraint can be regarded as an extension of the duration constraint. An original algorithm called SPM (Sequential Pattern Maintenance) is proposed. The SPM algorithm adopts an implicit segmentation technique that separates the data sequences that do not satisfy the constraint from the existing ones, and it makes full use of the information obtained from previous mining processes to handle the case in which the support count in the database (DB) drops because the number of data sequences is reduced. The experimental results show that the approach is fast and scalable.

Key words: data mining, constrained sequential pattern mining, partial-order constraint, implicit segmentation
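
As a rough illustration of what it means for the allowed duration between transactions to be either an integer or infinite, the Python sketch below counts the support of a pattern under a maximum-gap bound. It is not the SPM algorithm, whose details the abstract does not give. The timestamped data layout and the names `contains`, `support`, and `max_gap` are assumptions made only for this example.

```python
import math
from typing import FrozenSet, List, Optional, Tuple

# Assumed layout: a data sequence is a time-ordered list of
# (timestamp, itemset) transactions.
Transaction = Tuple[int, FrozenSet[str]]
DataSequence = List[Transaction]
Pattern = List[FrozenSet[str]]


def contains(seq: DataSequence, pattern: Pattern, max_gap: float = math.inf) -> bool:
    """Return True if `pattern` occurs in `seq` with at most `max_gap`
    time units between consecutive matched transactions.
    max_gap = math.inf means the gap is unbounded."""

    def match(p_idx: int, start: int, last_time: Optional[int]) -> bool:
        if p_idx == len(pattern):
            return True
        for i in range(start, len(seq)):
            time, items = seq[i]
            # Transactions are time-ordered, so once the gap is exceeded
            # no later transaction can satisfy it either.
            if last_time is not None and time - last_time > max_gap:
                break
            # Backtracking: try this transaction, fall through to later ones.
            if pattern[p_idx] <= items and match(p_idx + 1, i + 1, time):
                return True
        return False

    return match(0, 0, None)


def support(db: List[DataSequence], pattern: Pattern, max_gap: float = math.inf) -> int:
    """Count the data sequences in `db` that contain `pattern` under `max_gap`."""
    return sum(contains(seq, pattern, max_gap) for seq in db)


# Toy database, invented for illustration only.
db = [
    [(1, frozenset("a")), (2, frozenset("b")), (9, frozenset("c"))],
    [(1, frozenset("a")), (3, frozenset("c"))],
    [(1, frozenset("b")), (2, frozenset("a")), (20, frozenset("c"))],
]
pattern = [frozenset("a"), frozenset("c")]

print(support(db, pattern))             # unbounded gap
print(support(db, pattern, max_gap=8))  # integer gap bound
print(support(db, pattern, max_gap=5))  # tighter integer gap bound
```

With an unbounded gap every sequence in this toy database supports the pattern. Tightening the bound excludes sequences, so the support count falls, which is the kind of situation the abstract says SPM addresses by reusing previously mined information.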