• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (07): 1338-1343.

• Articles •

A big data clustering algorithm based on local key nodes

CAO Yang, QIAN Xiaodong

  1. (School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China)
  • Received: 2015-07-01 Revised: 2015-09-11 Online: 2016-07-25 Published: 2016-07-25
  • Supported by:

    Research on Clustering and Association Applications of Business Big Data Based on Complex Networks (71461017)

Abstract:

To find a reasonable network structure in large data sets, we propose a community detection algorithm for big data that works from local key nodes. Two costs dominate the running time of local search: the uncertainty in choosing initial nodes and the computation of the fitness function. To reduce both, the algorithm introduces local key nodes and uses an improved fitness formula. We compare it with classical algorithms on small-scale and larger-scale data networks. On small data sets its efficiency differs little from that of the classical algorithms, whereas its execution efficiency improves markedly as the test data sets grow. The results indicate that the algorithm is feasible and effective and is suitable for discovering network structure in large-scale data.

Key words: big data; clustering; local; fitness function
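
The abstract does not state the improved fitness formula or the rule used to select local key nodes, so the following is only an illustrative sketch of the general idea of seed-based local community expansion, not the authors' method. It makes two assumptions: the key node is taken to be the highest-degree node not yet assigned to a community, and the fitness is the widely used local form k_in / (k_in + k_out)^alpha. All function names (`build_graph`, `fitness`, `expand_from`, `detect_communities`) are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def fitness(adj, community, alpha=1.0):
    """Assumed stand-in fitness: k_in / (k_in + k_out) ** alpha.

    k_in is the sum of internal degrees (each internal edge counted twice),
    k_out is the number of edges leaving the community.
    """
    k_in = k_out = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                k_in += 1
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def expand_from(adj, seed, alpha=1.0):
    """Greedily add the neighbour that most increases the fitness."""
    community = {seed}
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        best, best_fit = None, fitness(adj, community, alpha)
        for v in sorted(frontier):  # sorted for deterministic tie-breaking
            f = fitness(adj, community | {v}, alpha)
            if f > best_fit:
                best, best_fit = v, f
        if best is None:  # no neighbour improves the fitness: stop
            return community
        community.add(best)


def detect_communities(adj, alpha=1.0):
    """Seed each community at the highest-degree unassigned node (assumption)."""
    unassigned = set(adj)
    communities = []
    while unassigned:
        seed = max(sorted(unassigned), key=lambda n: len(adj[n]))
        community = expand_from(adj, seed, alpha)
        communities.append(community)
        unassigned -= community
    return communities


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
print(detect_communities(build_graph(edges)))
```

On this toy graph the sketch separates the two triangles. It recomputes the fitness from scratch for every candidate node, which is the kind of cost the abstract says the improved formula is meant to reduce. An incremental update of k_in and k_out would be one way to cut that cost, though whether the paper does this is not stated here.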