• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Artificial Intelligence and Data Mining

Research on optimization of the density peak fast clustering algorithm

王鹏飞¹, 杨余旺¹, 柯亚琪²

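  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094;
  2. College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095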
  • Received: 2017-03-30 Revised: 2017-05-11 Online: 2018-08-25 Published: 2018-08-25
  • Funding:

    National Natural Science Foundation of China (61640020); Jiangsu Provincial Key Research and Development Program (BE20163681)

Optimization of clustering by fast search and find of density peaks

WANG Pengfei1,YANG Yuwang1,KE Yaqi2   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China;
  2. College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
     
  • Received: 2017-03-30 Revised: 2017-05-11 Online: 2018-08-25 Published: 2018-08-25

Abstract:

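The clustering by fast search and find of density peaks (CFSFDP) algorithm requires its cluster centers to be picked by hand from the decision graph, and its final division into cluster core and cluster halo assigns some edge regions of a cluster to the halo, so the division is not fully reasonable. To address these problems, a clustering algorithm is proposed that selects cluster centers automatically and optimizes the split between cluster core and cluster halo. Borrowing the idea of anomaly detection, it searches for outliers among the cluster center weights and takes these outliers as the centers of the individual clusters; it then introduces an intra-cluster local density to split cluster core and cluster halo more reasonably. Comparative experiments show that the proposed algorithm is more automated than CFSFDP and produces more accurate clustering results.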
 
 

Keywords: clustering, density peaks, anomaly detection, cluster center points

Abstract:

In the clustering by fast search and find of density peaks (CFSFDP) algorithm, cluster centers must be chosen manually from the decision graph. In addition, when a cluster is divided into cluster core and cluster halo, some points on the edge of the cluster are assigned to the halo, which leads to unreasonable division results. To solve these problems, we propose a clustering algorithm that selects cluster centers automatically and optimizes the division into cluster core and cluster halo. We adopt the idea of anomaly detection to find the anomalous points of the cluster center weight and take these points as the cluster centers. A local density within each cluster is introduced to divide cluster core and cluster halo more reasonably. Comparative experiments show that the proposed algorithm is more automated than the CFSFDP algorithm and yields more accurate clustering results.
 

Key words: clustering, density peak, anomaly detection, cluster center
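
The automatic center selection summarized in the abstract can be illustrated with a small sketch. It computes the usual CFSFDP quantities (local density ρ, distance δ to the nearest point of higher density) and treats the product γ = ρ·δ as the cluster center weight, which is an assumption about what the abstract means by that term. The abstract does not say which anomaly detector the authors use, so the rule below (γ greater than mean plus k standard deviations) is only a stand-in, and the function name `density_peak_clustering` and the parameters `dc` and `k` are hypothetical. The core/halo refinement is not covered.

```python
import numpy as np


def density_peak_clustering(X, dc, k=3.0):
    """Sketch of CFSFDP with automatic center selection.

    X  : (n, d) array of points
    dc : cutoff distance used as the Gaussian kernel width
    k  : assumed outlier threshold, in standard deviations of gamma
    """
    n = len(X)
    # Pairwise Euclidean distances via broadcasting.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density with a Gaussian kernel; subtract the self term exp(0) = 1.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0

    # delta: distance to the nearest point of higher density.
    order = np.argsort(-rho)              # indices by decreasing density
    delta = np.empty(n)
    nearest_higher = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()  # densest point: largest distance
    for pos in range(1, n):
        i = order[pos]
        higher = order[:pos]
        j = higher[np.argmin(dist[i, higher])]
        nearest_higher[i] = j
        delta[i] = dist[i, j]

    # Assumed center weight and assumed anomaly rule (not from the paper).
    gamma = rho * delta
    centers = np.flatnonzero(gamma > gamma.mean() + k * gamma.std())

    # Each remaining point inherits the label of its nearest denser neighbor.
    labels = np.full(n, -1)
    labels[centers] = np.arange(len(centers))
    for i in order:
        if labels[i] == -1 and nearest_higher[i] != -1:
            labels[i] = labels[nearest_higher[i]]
    return rho, delta, gamma, centers, labels
```

On well-separated clusters the density peaks have γ values far above those of all other points, which is why even a simple threshold isolates them; with many clusters or strongly unequal densities a more robust detector would probably be needed.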