• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学 ›› 2022, Vol. 44 ›› Issue (05): 834-844.

• 计算机网络与信息安全 • 上一篇    下一篇

基于染色随机游走的可重叠社区发现

昌阳1,马慧芳1,2,3   

  1. (1.西北师范大学计算机科学与工程学院,甘肃 兰州 730070;
    2.广西师范大学广西多源信息挖掘与安全重点实验室,广西 桂林 541001;
    3.桂林电子科技大学广西可信软件重点实验室,广西 桂林 541004)

  • 收稿日期:2020-09-21 修回日期:2021-11-12 接受日期:2022-05-25 出版日期:2022-05-25 发布日期:2022-05-24
  • 基金资助:
    国家自然科学基金(61762078,61363058);西北师范大学青年教师能力提升计划(NWNU-LKQN2019-2);西北师范大学研究生科研资助项目(2019KYZZ012073)

Overlapping community detection based on colored random walk

CHANG Yang1,MA Hui-fang1,2,3   

  1. (1.College of Computer Science and Engineering,Northwest Normal University,Lanzhou 730070;
    2.Guangxi Key Laboratory of Multi-source Information Mining and Security,Guangxi Normal University,Guilin 541001;
    3.Guangxi Key Laboratory of Trusted Software,Guilin University of Electronic Technology,Guilin 541004,China)
  • Received:2020-09-21 Revised:2021-11-12 Accepted:2022-05-25 Online:2022-05-25 Published:2022-05-24

摘要: 社区发现是一个基础性的且被广泛研究的问题。现有的社区发现方法大多聚焦于网络拓扑结构,然而随着真实网络中实体可用属性的激增,捕获图中结构和属性的丰富交互关系来进行社区发现变得尤为必要。据此面向属性图提出了一种基于染色随机游走的可重叠社区发现算法OCDC,该算法解决了传统的基于随机游走的社区发现算法利用结构转移矩阵造成社区发现效果不佳的问题。具体地,首先利用经典的初始种子策略选出网络中差异度较大的节点,在此基础上设计种子替换策略,挖掘网络中质量更佳的种子替换路径集合对初始种子集合进行替换;其次构建结构-属性交互节点转移矩阵并执行染色随机游走过程得到高质量种子节点的染色分布向量;最后基于融合结构和属性的并行电导值对社区进行扩展。在人工网络和现实网络上的实验表明,本文提出的算法能够准确地识别属性社区并显著优于基准算法。

关键词: 社区发现, 种子, 染色随机游走, 并行电导

Abstract: Community detection is a fundamental and widely-studied problem. Most existing community detection methods focus on network topology. However, with the proliferation of rich information available for entities in real-world networks, it is indispensable to capture the rich interaction between the structure and attributes in the graph for community detection. This paper proposes an Overlapping Community Detection method based on Colored random walk (OCDC). The algorithm is able to conquer the limitation of random walk-based community detection methods that directly utilize the original network topology. Specifically,  initial seed nodes in the network is firstly selected. Secondly, a seed replacement strategy is developed to capture a better-quality seed replacement path set. Thirdly, the structure-attribute interaction node transition matrix is generated to perform the colored random walk in order to obtain the colored distribution vector. Finally, based on the combination of structure and attri- bute, the parallel conductance is captured to expand the community. Experiments on synthetic networks and real-world network show that our proposed algorithm can accurately identify attributed communities and significantly outperform the state-of-the-art methods.


Key words: community detection, seed, colored random walk, parallel conductance