• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Artificial Intelligence and Data Mining

A personalized ranking method fusing topic-domain similarity in social tagging systems

黄进,周栋   

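  1. (School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201)
  • Received: 2016-09-27 Revised: 2016-12-20 Published: 2018-05-25 Released: 2018-05-25
  • Funding:

    National Natural Science Foundation of China (61300129); Scientific Research Project of the Hunan Provincial Education Department (16K030); Scientific Research Foundation for Returned Overseas Scholars, Ministry of Education (教外司留[2013]1792)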

A personalized ranking method fusing the similar topic domains in social tagging system

HUANG Jin, ZHOU Dong

  1. (School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China)
  • Received: 2016-09-27 Revised: 2016-12-20 Online: 2018-05-25 Published: 2018-05-25

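Abstract (translated from the Chinese):

With the development of network technology, more and more Internet resources are being used in information retrieval, and numerous studies have shown that social annotations can improve retrieval. In existing personalized ranking methods, the similarity between users is mostly computed from the set of tags they have both used. In practice, however, user annotation data suffers from sparsity and tag synonymy, so the computed similarity is inaccurate. Building on previous work, this paper proposes a personalized ranking method that incorporates topic-domain similarity. The method first partitions the data into topic domains, separating web pages and tags with different topical meanings, and identifies tag synonyms through a constructed tag similarity network. It then combines users' tags and topic preferences to find users with similar interests and expands each user's annotation information, thereby improving personalized information retrieval. Experiments on real data show that the method effectively alleviates annotation sparsity and the tag synonym problem and helps improve the user's retrieval experience.

Keywords: information retrieval, social annotation, personalized ranking, topic-domain preference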

Abstract:

With the development of network technology, more and more Internet resources are being used in information retrieval, and numerous studies show that social annotations can improve search quality. In existing personalized ranking methods, the similarity between users is usually calculated from the tag sets they have in common. In practice, however, user annotation data is sparse and contains tag synonyms, which makes the similarity calculation inaccurate. Building on previous research, this paper proposes a personalized ranking method that fuses similar topic domains. First, the method separates web pages and tags with different topical meanings and finds tag synonyms by constructing a network of similar tags. Second, it finds users with similar interests by combining each user's tags with their topic-domain preferences, and extends the user's tag information to improve personalized information retrieval. Experimental results on real data show that the method effectively alleviates the problems of data sparsity and tag synonyms and helps improve the user's search experience.

Key words: information retrieval, social annotation, personalized ranking, preference of topic domains
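
The abstract names three steps (finding tag synonyms through a tag similarity network, expanding user tag information, and combining tag and topic-domain preferences into a user similarity) without giving formulas. The Python sketch below is only an illustration of how such steps could fit together, not the authors' method. Everything specific in it is an assumption: the `(user, page, tag)` triple format, the ready-made `page_topic` mapping standing in for the topic-domain partition, cosine similarity as the measure, the `threshold` of 0.8, the half-weight synonym expansion, and the mixing weight `alpha`. The ranking step itself is not covered.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def synonym_candidates(annotations, threshold=0.8):
    """Link tags applied to nearly the same pages (assumed synonym rule)."""
    vectors = {}
    for _user, page, tag in annotations:
        vectors.setdefault(tag, Counter())[page] += 1
    tags = sorted(vectors)
    links = {t: set() for t in tags}
    for i, t1 in enumerate(tags):
        for t2 in tags[i + 1:]:
            if cosine(vectors[t1], vectors[t2]) >= threshold:
                links[t1].add(t2)
                links[t2].add(t1)
    return links

def user_tag_profile(annotations, user, links):
    """Tag counts for one user, expanded with linked tags at half weight."""
    profile = Counter()
    for u, _page, tag in annotations:
        if u == user:
            profile[tag] += 1.0
    for tag, weight in list(profile.items()):  # snapshot of the original counts
        for syn in links.get(tag, ()):
            profile[syn] += 0.5 * weight  # 0.5 is an arbitrary illustrative weight
    return profile

def user_topic_profile(annotations, user, page_topic):
    """How often a user annotated pages in each topic domain."""
    profile = Counter()
    for u, page, _tag in annotations:
        if u == user and page in page_topic:
            profile[page_topic[page]] += 1
    return profile

def fused_similarity(annotations, u1, u2, page_topic, links, alpha=0.5):
    """Weighted mix of tag-profile and topic-profile similarity (assumed form)."""
    tag_sim = cosine(user_tag_profile(annotations, u1, links),
                     user_tag_profile(annotations, u2, links))
    topic_sim = cosine(user_topic_profile(annotations, u1, page_topic),
                       user_topic_profile(annotations, u2, page_topic))
    return alpha * tag_sim + (1 - alpha) * topic_sim

# Toy data: alice and bob never use the same tag, yet tag the same pages.
annotations = [
    ("alice", "p1", "python"), ("alice", "p2", "python"),
    ("bob", "p1", "py"), ("bob", "p2", "py"),
    ("carol", "p3", "cooking"),
]
page_topic = {"p1": "programming", "p2": "programming", "p3": "food"}  # hypothetical
links = synonym_candidates(annotations)
print(round(fused_similarity(annotations, "alice", "bob", page_topic, links), 3))
```

On this toy data, a similarity based only on shared tags would score alice and bob at zero, because "python" and "py" never coincide. With the synonym link and the shared topic domain the fused score comes out at about 0.9, which is the kind of sparsity and synonymy effect the abstract describes.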