• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4, 2015, Vol. 37, Issue (10): 1983-1988.

• Articles •

An algorithm for identifying spam pages
based on link similarity and spam rates

LU Zhao1,2, LI Shijun2

  (1. School of Computer Science and Engineering, Yulin Normal University, Yulin 573000;
    2. School of Computer, Wuhan University, Wuhan 430079, China)
  • Received: 2015-07-24 Revised: 2015-09-16 Online: 2015-10-25 Published: 2015-10-25

Abstract:

Spam pages rely mainly on spam links to boost their ranking positions and thereby earn profit. Based on an analysis of the features of spam links, we introduce link similarity and spam rate as two indicators for identifying spam pages. Inspired by the Bad Rank algorithm, we compute link similarity and spam rate iteratively, starting from a seed set of spam pages, set the weights according to the link relationships pointing from the seed set, and score the pages to be judged. We compare our approach with AntiTrust Rank and other related traditional methods. Experimental results show that our approach outperforms the traditional methods.

Key words: spam page; spam link; link similarity; spam rate; weight coefficient
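
The abstract describes iterating outward from a seed set of spam pages. As a rough illustration of that general idea only, the following minimal Python sketch propagates a suspicion score backward along links from the seeds, in the style commonly attributed to Bad Rank. It is not the authors' algorithm: the abstract does not define link similarity, spam rate, or the weight coefficients, so none of them appear here. The function name, the damping value, and the toy graph are all assumptions made for this example.

```python
def propagate_spam_scores(out_links, spam_seeds, damping=0.85, iterations=50):
    """Spread suspicion from known spam pages to the pages that link to them.

    out_links maps each page to the pages it links to (hypothetical input format).
    A page inherits score from every page it points at, divided by that
    target's number of inbound links.
    """
    if not spam_seeds:
        raise ValueError("at least one spam seed is required")

    # Collect every page that appears as a source, a target, or a seed.
    pages = set(out_links) | set(spam_seeds)
    for targets in out_links.values():
        pages.update(targets)

    # Count inbound links, ignoring duplicate links from the same source.
    in_degree = {page: 0 for page in pages}
    for targets in out_links.values():
        for target in set(targets):
            in_degree[target] += 1

    # Seeds share one unit of initial suspicion; all other pages start at zero.
    seed_mass = {
        page: (1.0 / len(spam_seeds) if page in spam_seeds else 0.0)
        for page in pages
    }
    score = dict(seed_mass)

    for _ in range(iterations):
        updated = {}
        for page in pages:
            inherited = sum(
                score[target] / in_degree[target]
                for target in set(out_links.get(page, ()))
            )
            updated[page] = (1 - damping) * seed_mass[page] + damping * inherited
        score = updated
    return score


# Toy graph (invented for illustration): "a" links directly to a spam seed,
# "b" links to "a", and "c" has no path to any spam page.
toy_links = {
    "a": ["spam1"],
    "b": ["a"],
    "c": ["d"],
    "d": [],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
scores = propagate_spam_scores(toy_links, {"spam1"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph, pages closer to the seed along outgoing links receive higher scores, and a page with no path to a seed stays at zero. A threshold on the score could then flag candidates, although the choice of threshold is outside what the abstract specifies.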