• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (4): 740-750.

• Artificial Intelligence and Data Mining •

Research on pattern aware sampling algorithm

SHEN Lingzhen, WANG Xin, SHI Junhao, WANG Lu

  1. (School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, China)
  • Received: 2024-04-11 Revised: 2024-09-13 Online: 2025-04-25 Published: 2025-04-17

Abstract: As graph data grows rapidly in scale, traditional analysis techniques struggle to cope, particularly in frequent pattern mining, where traditional algorithms risk exhausting computational resources. Graph sampling effectively reduces data volume and computation cost, making it a crucial research direction in graph data analysis. However, existing graph sampling algorithms offer limited support for frequent pattern mining, because they fail to fully integrate the key attributes of graph data with its structural features, which lowers sampling quality. This paper therefore proposes a pattern aware sampling (PAS) algorithm that considers both the high-frequency structures and the key attributes of the graph. PAS uses neighborhoods (local features) and high-frequency single-edge patterns (global features) to weight the nodes and edges of the graph, and then performs a biased walk on the weighted graph to carry out the sampling task. Experiments show that, compared with the baseline algorithms, PAS achieves superior performance on multiple indicators, and that the top B frequent patterns mined from its samples are highly consistent with those in the original graph. At a sampling ratio of only 0.20, accuracy reaches up to 94%.

Key words: graph sampling, frequent pattern mining, aggregation, graph attribute
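
The abstract describes the sampling idea only at a high level: weight the graph using local (neighborhood) and global (single-edge pattern frequency) features, then run a biased walk. The following is a minimal Python sketch of that general idea, not the authors' PAS implementation. Everything specific in it is an assumption made for illustration: the function names (`edge_pattern`, `weight_edges`, `biased_walk_sample`), the definition of a single-edge pattern as the unordered pair of endpoint labels, the additive weighting formula, and the stopping rule. The paper's actual weighting scheme is not given in the abstract.

```python
import random
from collections import Counter, defaultdict


def edge_pattern(labels, u, v):
    """Assumed definition: a single-edge pattern is the unordered label pair."""
    return tuple(sorted((labels[u], labels[v])))


def weight_edges(edges, labels):
    """Weight each edge by an illustrative global + local score (an assumption)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Global feature: how often each single-edge pattern occurs in the graph.
    freq = Counter(edge_pattern(labels, u, v) for u, v in edges)
    max_deg = max(len(nbrs) for nbrs in adj.values())
    weights = {}
    for u, v in edges:
        global_score = freq[edge_pattern(labels, u, v)] / len(edges)
        # Local feature: normalized neighborhood size of the two endpoints.
        local_score = (len(adj[u]) + len(adj[v])) / (2 * max_deg)
        weights[frozenset((u, v))] = global_score + local_score
    return adj, weights


def biased_walk_sample(edges, labels, ratio, seed=0, max_steps=100_000):
    """Collect about ratio * |V| nodes by a walk biased toward heavy edges."""
    rng = random.Random(seed)
    adj, weights = weight_edges(edges, labels)
    nodes = sorted(adj)
    target = max(1, round(ratio * len(nodes)))
    current = rng.choice(nodes)
    sampled = {current}
    for _ in range(max_steps):
        if len(sampled) >= target:
            break
        nbrs = sorted(adj[current])
        w = [weights[frozenset((current, n))] for n in nbrs]
        # Step to a neighbor with probability proportional to edge weight.
        current = rng.choices(nbrs, weights=w, k=1)[0]
        sampled.add(current)
    return sampled
```

Calling `biased_walk_sample(edges, labels, ratio=0.20)` on an edge list and a node-to-label dictionary returns a set of sampled node ids; the induced subgraph on those nodes could then be passed to a frequent pattern miner. The sketch assumes an undirected graph in which every node appears in at least one edge.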