• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (1): 155-159.

• Articles •

Clustering algorithm based on cumulative average density

HU Bolei, TAN Jianhao

  1. (College of Electrical and Information Engineering, Hunan University, Changsha 410082, China)
  • Received: 2012-01-10 Revised: 2012-03-14 Online: 2013-01-25 Published: 2013-01-25
  • About the author: HU Bolei (born 1986), male, from Xinyang, Henan Province, master's student; research interests include data mining, pattern recognition, and intelligent information processing.

Abstract:

DBSCAN has two known weaknesses: it is sensitive to its input parameters, and it cannot separate connected clusters of different densities. To address both, an improved algorithm based on DBSCAN is proposed. It introduces the notion of cumulative average density and uses it as the criterion for deciding whether clusters should be merged, which weakens the role of the density threshold Minpts. The object with the highest density is chosen as the initial cluster center, and clustering proceeds in order of decreasing density; this gives the procedure a degree of hierarchy and therefore lets it handle datasets of varying density. Clustering experiments on datasets show that the improved algorithm is robust to its parameters to some extent and achieves the desired result on connected clusters of different densities.

Key words: clustering algorithm; connected clusters; cumulative average density; accepting factor
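
The abstract names the ingredients of the method (density-ordered seeding, a cumulative average density, an accepting factor) but does not give their definitions. The following is therefore only a minimal sketch of one possible reading, not the authors' algorithm. It assumes that the density of an object is the number of objects within radius `eps`, that the cumulative average density of a cluster is the mean density of the objects it has absorbed so far, and that an object is accepted when its density is at least `accept` times that mean. The function name and the parameters `eps`, `accept`, and `min_pts` are hypothetical.

```python
from collections import deque
from math import dist


def density_ordered_clusters(points, eps, accept=0.5, min_pts=2):
    """Label each point with a cluster id; -1 marks noise.

    Every rule below is an illustrative assumption, not the paper's definition.
    """
    n = len(points)
    # Brute-force eps-neighbourhoods, O(n^2); density = neighbour count (self included).
    nbrs = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
            for i in range(n)]
    density = [len(nb) for nb in nbrs]

    labels = [-1] * n
    cluster_id = 0
    # Visit candidate seeds from the densest object to the sparsest.
    for seed in sorted(range(n), key=lambda i: -density[i]):
        if labels[seed] != -1 or density[seed] < min_pts:
            continue
        labels[seed] = cluster_id
        total, count = density[seed], 1  # running sum and size -> cumulative average
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            for q in nbrs[p]:
                if labels[q] != -1:
                    continue
                # Assumed acceptance rule: q joins only if its density is not
                # too far below the cluster's cumulative average density.
                if density[q] >= accept * (total / count):
                    labels[q] = cluster_id
                    total += density[q]
                    count += 1
                    queue.append(q)
        cluster_id += 1
    return labels


# A dense run of 10 points (spacing 1) touching a sparse run of 5 points (spacing 5).
points = [(x, 0) for x in range(10)] + [(x, 0) for x in (14, 19, 24, 29, 34)]
print(density_ordered_clusters(points, eps=5.5))
```

On this toy input the dense run and the sparse run receive different labels, even though they are connected under `eps`. Setting `accept=0.0` disables the density check, and the two runs then collapse into a single cluster, which is the behaviour the abstract attributes to plain DBSCAN on connected clusters of different densities.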