• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

J4 ›› 2013, Vol. 35 ›› Issue (3): 145-149.

• 论文 • Previous Articles     Next Articles

Graph classification algorithm based on
divide and conquer strategy and Hash linked list  

SUN Wei1,ZHU Zhengli1,2   

  1. (1.School of Information and Technology,Nanjing Forestry University,Nanjing 210037;2.School of Computer Science and Technology,Nanjing University of Science&Technology,Nanjing 210094,China)
  • Received:2012-05-23 Revised:2012-09-11 Online:2013-03-25 Published:2013-03-25

Abstract:

The mainstream graph data classification algorithms are based on frequent substructure mining strategy, which inevitably leads to searching the global data space repeatedly and hence the related algorithms have low efficiency and cannot meet specific requirements. Aiming at the disadvantages of such algorithms, firstly, the “divide and conquer” strategy is used to design a modular data space and an algorithm that use the hash linked list to store the address and the support degree. Secondly, the original database is partitioned into a limited number of submodules according to the rules, and the gSpan algorithm is used to handle each submodule to get the locally frequent submodel. Thirdly, Hash functions are used to calculate the unique memory address of the mining result of each module, and construct the Hash linked list by recording the support degree. Finally, the globally frequent submodel is obtained and the graph data classifier is built up. The algorithm avoids searching the global space repeatedly so as to greatly enhance the efficiency as well as saves memory overhead by loading the data into the memory at a time. The experiment results show that, compared with the mainstream graph classification algorithms, the new algorithm has 1.2~3.2 times efficiency improvement in the phase of constructing the submodels and has the same classification accuracy rate.

Key words: graph classification;divide and conquer;modular data;Hash linked list;classification efficiency