• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)


A rough K-means clustering algorithm based on a local density adaptive measure

马福民1,逯瑞强1,张腾飞2   

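  • (1. College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, Jiangsu; 2. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu)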
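  • Received: 2016-08-16  Revised: 2016-10-17  Publication date: 2018-01-25  Release date: 2018-01-25
  • Funding:

    National Natural Science Foundation of China (61403184, 61105082); Major Program of Natural Science Research of Jiangsu Higher Education Institutions (17KJA120001); Jiangsu Province "Qing Lan Project" Fund (QL2016); Scientific Research Project of Nanjing University of Posts and Telecommunications (NY215149); Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)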

Rough K-means clustering based on local density adaptive measure

MA Fu-min1, LU Rui-qiang1, ZHANG Teng-fei2

  • (1. College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023;
    2. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
  • Received: 2016-08-16  Revised: 2016-10-17  Online: 2018-01-25  Published: 2018-01-25

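Abstract (translated from Chinese):

By introducing the idea of upper and lower approximations, rough K-means has become an effective algorithm for handling clusters with vague boundaries. Derived algorithms such as rough fuzzy K-means and fuzzy rough K-means further refine the description of the uncertainty of objects on cluster boundaries and improve the clustering results. However, when iteratively computing the center means, these algorithms do not fully consider how clustering accuracy is affected by factors such as the distance between the data objects of each cluster and the mean center, and how densely the data are distributed in the neighboring range. To address this problem, a local density adaptive measure is proposed to describe the spatial characteristics of the data objects within a cluster, a rough K-means clustering algorithm based on the local density adaptive measure is given, and the effectiveness of the algorithm is verified through the computation and analysis of examples.

Keywords: rough clustering, K-means, local density measure, rough sets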

Abstract:

By introducing lower and upper approximations, rough K-means has become an effective algorithm for clustering data whose cluster boundaries are vague or overlapping. Derivative algorithms such as rough fuzzy K-means and fuzzy rough K-means describe the uncertainty of objects in the boundary regions in finer detail and thereby improve the clustering results. However, when iteratively computing the cluster means, these algorithms do not fully account for factors that affect clustering accuracy, such as the distance between each data object in a cluster and the cluster's mean center, and the density of the data distribution in the object's neighborhood. To address this problem, a local density adaptive measure is proposed to describe the spatial characteristics of the data objects within a cluster, and a rough K-means clustering algorithm based on this measure is given. Comparative experiments on real-world data from UCI demonstrate the validity of the proposed algorithm.

Key words: rough clustering, K-means, local density measure, rough sets
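
For orientation, the following is a minimal sketch of the baseline rough K-means iteration that the abstract refers to, in the commonly described style where each cluster has a lower approximation (objects that certainly belong) and a boundary region (objects close to several centers). It is not the paper's method: the abstract does not state the local density adaptive measure, so the sketch averages the objects in each region uniformly, which is precisely the step the paper proposes to refine. The function names and the parameters `init_centers`, `w_lower`, and `epsilon` are assumptions chosen for illustration.

```python
import numpy as np


def assign(X, centers, epsilon):
    """Return boolean (n_objects, k) masks for lower and upper approximations."""
    # Euclidean distance from every object to every center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # An object joins the upper approximation of every cluster whose center is
    # at most `epsilon` farther away than its nearest center (assumed rule).
    upper = (d - d.min(axis=1, keepdims=True)) <= epsilon
    # Objects with exactly one such cluster belong to that cluster's lower approximation.
    unique = upper.sum(axis=1) == 1
    lower = upper & unique[:, None]
    return lower, upper


def rough_kmeans(X, init_centers, w_lower=0.7, epsilon=1.0, n_iter=100):
    """Baseline rough K-means with uniform (not density-weighted) means."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(n_iter):
        lower, upper = assign(X, centers, epsilon)
        boundary = upper & ~lower
        new_centers = centers.copy()
        for j in range(len(centers)):
            lo, bd = X[lower[:, j]], X[boundary[:, j]]
            if len(lo) and len(bd):
                # Weighted mix of the lower-approximation mean and the boundary mean.
                new_centers[j] = w_lower * lo.mean(axis=0) + (1 - w_lower) * bd.mean(axis=0)
            elif len(lo):
                new_centers[j] = lo.mean(axis=0)
            elif len(bd):
                new_centers[j] = bd.mean(axis=0)
            # An empty cluster keeps its previous center.
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    # Recompute the approximations so they match the returned centers.
    lower, upper = assign(X, centers, epsilon)
    return centers, lower, upper
```

Calling `rough_kmeans(X, init_centers)` on a small array returns the final centers together with the two membership masks; rows of `upper` with more than one `True` entry are the boundary objects. A density-aware variant along the lines the abstract describes would presumably replace the uniform `mean` calls with weighted averages, but the weights themselves are not given in the abstract.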