• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学

• 人工智能与数据挖掘 • 上一篇    

一种改进的模糊连接点聚类算法

孙明珊,覃华,苏一丹   

  1. (广西大学计算机与电子信息学院,广西 南宁 530004)
  • 收稿日期:2017-02-13 修回日期:2017-03-31 出版日期:2018-06-25 发布日期:2018-06-25
  • 基金资助:

    国家自然科学基金(61363027)

An improved fuzzy joint points clustering algorithm

SUN Mingshan,QIN Hua,SU Yidan   

  1. (School of Computer and Electronics Information,Guangxi University,Nanning 530004,China)
  • Received:2017-02-13 Revised:2017-03-31 Online:2018-06-25 Published:2018-06-25

摘要:

传统的模糊连接点FJP聚类算法采用基于欧氏距离的最大最小合成运算法生成传递闭包,该方法所生成的传递闭包存在失真问题,即包含有较多错误的数据关联信息,最终造成算法聚类精度低且计算时间长。针对以上问题,提出一种改进的模糊连接点聚类算法:先用组合核函数计算数据集的模糊相似度矩阵,提高算法对数据非线性特征的辨识能力,并用大顶堆存储之;然后遍历传递闭包矩阵中的空元素,用堆顶的桥元素填充传递闭包的空元素,直至生成传递闭包。在测试数据集上的实验结果表明,本文算法的平均聚类精度较传统FJP算法有20%以上的提升,显著改善了传递闭包的失真问题;另外,在大型数据集上的计算效率亦优于传统FJP算法的,说明本文改进FJP算法的思路是有效的、可行的。
 
 

关键词: 模糊连接点聚类算法, 传递闭包;桥元素, 大顶堆

Abstract:

Conventional fuzzy joint points clustering method calculates the transitive closure by the maxmin composition based on Euclidean distance, which makes the transitive closure anamorphic and include a lot of false data association information, and  low clustering precision and high time complexity. Aiming at these problems, we propose an improved fuzzy joint points clustering algorithm. We first use the combined kernel function to obtain the fuzzy similarity matrix, which can increase the nonlinear recognizing ability. Then we use the maxheap to store fuzzy similarity matrix, traverse the empty elements of the transitive closure matrix, and populate the empty elements by the bridging element. Experimental results on the UCI data sets and artificial data sets show that the proposed approach has a shorter time and 20% higher clustering accuracy compared with the conventional FJP. In addition, the computational efficiency on large data sets is also more excellent than the traditional FJP clustering algorithm, which verifies that the idea of the improved FJP clustering algorithm is effective and feasible.
 

Key words: fuzzy joint points clustering algorithm, transitive closure, bridging element, maxheap