• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学

• 人工智能与数据挖掘 • 上一篇    下一篇

基于路径的频繁子图挖掘算法研究

唐德权,张波云   

  1. (湖南警察学院信息技术系,湖南 长沙410138)
  • 收稿日期:2018-08-13 修回日期:2019-05-21 出版日期:2019-12-25 发布日期:2019-12-25
  • 基金资助:

    国家自然科学基金(61471169);2017湖南省科技计划重点研发项目(2017NK2402);2017年湖南省科技重大专项(2017SK1040)

A path-based frequent subgraph mining algorithm

TANG De-quan,ZHANG Bo-yun   

  1. (Department of Information Technology,Hunan Police Academy,Changsha 410138,China)
  • Received:2018-08-13 Revised:2019-05-21 Online:2019-12-25 Published:2019-12-25

摘要:

图挖掘是数据挖掘的一个重要研究方向,而图挖掘主要集中在图数据集内频繁子图的挖掘。频繁子图挖掘技术的关键是建立有效机制减少冗余候选子图,以便高效计算和处理所需的频繁子图。提出了一种基于路径的频繁子图挖掘算法,该算法首先找出所有频繁边从而挖掘出频繁单路径,然后通过组合、双射和操作扩展出较多的频繁路径,再通过连接操作产生所有频繁子图候选集。通过定理证明了该算法的正确性和完整性,从理论上分析了该算法时间复杂度低于现有的算法,最后进行了2个图数据集实验,在候选集产生的数量和时间性能2方面验证了算法的优越性。
 
 

关键词: 图挖掘, 频繁子图, 候选子图, 频繁路径, 时间性能

Abstract:

Graph mining is an important research area in data mining , while graph mining mainly focuses on frequent subgraph mining in graph datasets. The key step in the research of frequent subgraph mining techniques is to establish an effective mechanism to reduce the generation of redundant candidate subgraphs in order to efficiently calculate and process the required frequent subgraphs. A path-based frequent subgraph mining algorithm is proposed. The algorithm first finds all frequent edges to mine frequent single paths, then expands large frequent paths through combination operation and bijection sum operation, and then uses the connection operator to generate all frequent subgraph candidate sets. The correctness and completeness of the algorithm are proved by theorem. Theoretical analysis shows that the algorithm has lower time complexity than the existing algorithm. Finally, experiments are carried out on two graph datasets, and the results verify that the algorithm is superior in terms of quality and time performance when generating candidate sets.
 

Key words: graph mining, frequent subgraph, candidate subgraph;frequent paths;time performance