• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2006, Vol. 28 ›› Issue (7): 70-72.

• Paper

A Two-Stage Constrained Hierarchical Clustering Algorithm

何振峰   

  • Publication date: 2006-07-01 Release date: 2010-05-20

Abstract:

A class-level constraint coefficient (CCC) is defined on the basis of the instance-level constraints between data objects, and a two-stage constrained hierarchical clustering algorithm, TCCL, is proposed. In the first stage, data objects are merged into many small clusters according to their natural distribution, using the distances between objects. In the second stage, these small clusters are merged further according to background knowledge, using the CCC between clusters. Experiments on several real-world datasets show that TCCL can make fairly effective use of the given constraints to improve clustering quality.

Key words: cluster analysis, semi-supervised learning, hierarchical clustering
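
The two-stage idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the formula for the class-level constraint coefficient, so the sketch assumes must-link and cannot-link pairs as the instance-level constraints, single-linkage distance in the first stage, and a coefficient equal to the number of must-link pairs minus the number of cannot-link pairs crossing two clusters, divided by the product of the cluster sizes. The names `tccl_sketch`, `n_small`, and `n_final` are hypothetical.

```python
from itertools import combinations
from math import dist


def tccl_sketch(points, must_link, cannot_link, n_small, n_final):
    """Two-stage agglomerative clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    # Stage 1: merge by distance only (single linkage, an assumption)
    # until n_small small clusters remain.
    def single_link(a, b):
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_small:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: single_link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] += clusters[b]
        del clusters[b]

    # Stage 2: merge small clusters by an ASSUMED class-level coefficient:
    # (crossing must-links - crossing cannot-links) / (|A| * |B|).
    def coefficient(a, b):
        sa, sb = set(a), set(b)

        def crosses(i, j):
            return (i in sa and j in sb) or (i in sb and j in sa)

        score = sum(1 for i, j in must_link if crosses(i, j))
        score -= sum(1 for i, j in cannot_link if crosses(i, j))
        return score / (len(a) * len(b))

    while len(clusters) > n_final:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: coefficient(clusters[p[0]], clusters[p[1]]),
        )
        if coefficient(clusters[a], clusters[b]) <= 0:
            break  # no remaining pair is supported by the constraints
        clusters[a] += clusters[b]
        del clusters[b]

    return clusters


# Four tight groups of two points each. By distance alone, the first and
# third groups are closest, but the constraints link groups 1-2 and 3-4.
example_points = [(0, 0), (0, 1), (10, 0), (10, 1),
                  (0, 10), (0, 11), (10, 10), (10, 11)]
result = tccl_sketch(example_points,
                     must_link=[(0, 2), (4, 6)],
                     cannot_link=[(1, 4)],
                     n_small=4, n_final=2)
print(sorted(sorted(c) for c in result))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

In this toy example the first stage recovers the four natural groups, and the second stage merges them along the must-link pairs rather than by proximity, which is the kind of effect the abstract attributes to the constraints.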