• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

J4 ›› 2014, Vol. 36 ›› Issue (12): 2346-2354.

• 论文 • 上一篇    下一篇

社交网络中一种快速精确的节点影响力排序算法

邹青1,张莹莹2,陈一帆1,张士庚1,段桂华1   

  1. (1.中南大学信息科学与工程学院,湖南 长沙 410083;2.枣庄科技职业学院电气工程系,山东 滕州 277500)
  • 收稿日期:2014-09-04 修回日期:2014-11-07 出版日期:2014-12-25 发布日期:2014-12-25
  • 基金资助:

    国家自然科学基金资助项目(61103203);湖南省战略性新兴产业重大科技攻关计划资助项目(2012GK4054)

A fast and accurate node influence sorting
algorithm in online social networks          

ZOU Qing1,ZHANG Yingying2,CHEN  Yifan1,ZHANG Shigeng1,DUAN Guihua1   

  1. (1.School of Information Science and Engineering,Center South University,Changsha 410083;
    2.Department of Electronic Engineering,Zaozhuang Vocational College of Scienc & Technology,Tengzhou 277500,China)
  • Received:2014-09-04 Revised:2014-11-07 Online:2014-12-25 Published:2014-12-25

摘要:

在大规模在线社交网络中,通过对用户影响力进行排序找出其中最具影响力的节点(集合)是一个很重要的研究方向,对于有效控制信息扩散、舆情分析和控制、精准营销等均有重要的作用。已有的节点影响力排序算法或者需要网络的全局拓扑信息来计算单个节点影响力(如基于介数中心性的算法)而时间开销过大,不适用于大规模网络;或者基于传统的网页排序算法(如PageRank)而不能很好地处理社交网络中存在着大量“末梢”节点的问题以及不同用户之间的联系强度不同的问题。在传统的PageRank算法的基础上做出了两点改进。首先,通过在PageRank算法的权值回收步骤中考虑对不同的连接赋予不同的权值,有效避免了末梢节点带来的影响。其次,在PageRank算法的投票过程中考虑邻居个体的差异性,提出了一种基于半邻域信息的节点权值分配方法,有效提高了节点排序的准确度。在一个包含大约15 000个用户的样本网络中,我们所提出的改进算法能够找出前1 000个最有影响力的节点中的40%以上的节点,而传统的PageRank算法仅能找出其中11%的节点。同时,相比于基于介数中心性的算法,所提出的改进算法以小得多的时间开销达到了相近甚至更好的排序准确度。

关键词: 在线社交网络, 影响力排序算法, 影响力评价, PageRank改进

Abstract:

In large Online Social Networks (OSNs), sorting nodes according to their influence is an important research issue. The most influential (set of) nodes can be found by sorting nodes, which is essential to the control of information dissemination, public opinion control and analyses, and onpurpose advertising. Existing influence sorting algorithms either need global topology information of the network to calculate the influence of individual nodes (e.g., the betweennessbased algorithms), which are usually time consuming and thus are not applicable to large scale networks, or use the traditional sorting algorithms designed for web page ranking (e.g., PageRank), which cannot well handle the properties of OSNs like the existence of end nodes and different relationship between different nodes. The traditional PageRank algorithm is enhanced from two aspects to make it applicable to node sorting in large scale OSNs. Firstly, different residual weights are assigned to nodes according to the weights of links in the weight collection phase of PageRank, which effectively mitigates the negative influence of end nodes on the sorting accuracy. Secondly, in the voting process, the diversity among different nodes is considered and a neighborhood based algorithm is proposed to assign weights to different nodes, which effectively improve the sorting accuracy. The performance of the proposed algorithm is evaluated on a sample network constructed with 15,000 real users sampled from Sina Weibo. It is shown that the enhanced algorithm can find more than 40% of the 1000 most influential nodes in the sample network, while the traditional PageRank algorithm counterpart can find only 11%. Meanwhile, compared with the betweennessbased algorithms, the proposed algorithm achieves similar or even better sorting accuracy with much less time cost.

Key words: online social networks;influence node sorting;influence evaluation;PageRank improvement