• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)


Result re-ranking in personalized cross-language information retrieval

ZHOU Dong, ZHAO Wen-yu, WU Xuan, LIU Jian-xun

  1. (School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China)
  • Received: 2016-01-15 Revised: 2016-03-16 Online: 2017-10-25 Published: 2017-10-25
  • Funding:
    National Natural Science Foundation of China (61300129, 61572187); Scientific Research Project of the Hunan Provincial Education Department (16K030); Hunan Provincial Natural Science Foundation
    (2017JJ2101); Hunan Provincial Postgraduate Research and Innovation Project (CX2016B575, CX2017B650)

Abstract:

The continuing growth of the Web has reduced the accuracy of searching its content, and the situation becomes more complicated when searches are performed across different languages. Cross-language information retrieval (CLIR) provides an effective way to overcome the language barrier and access information regardless of the language in which it is authored. Past CLIR research has mostly taken a system-centered approach and has not considered the role the user plays in the translation and retrieval processes. Result re-ranking is widely used in monolingual personalized information retrieval, but it has received little study in personalized cross-language information retrieval, where its performance is still unclear. We investigate personalized cross-language information retrieval through result re-ranking and propose two personalized cross-language result re-ranking methods: one based on latent semantics and the other based on external semantics. Both methods further process and optimize the results of the first retrieval round according to the user's preferences, so that the content the user is interested in is moved to the top of the result list. Experiments on a large real search log show that result re-ranking can effectively improve the search precision of personalized cross-language information retrieval.

Key words: result re-ranking, personalization, cross-language information retrieval, latent semantics, external semantics
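
The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general re-ranking idea it describes: first-round results are rescored against a user profile and sorted again. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `cosine` and `rerank`, the representation of documents and the user profile as vectors in a shared semantic space, the linear interpolation, and the weight `alpha`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(results, user_profile, alpha=0.5):
    """Re-rank first-round results by mixing retrieval score and profile similarity.

    results: list of (doc_id, retrieval_score, doc_vector) tuples.
    user_profile: vector in the same (hypothetical) semantic space as doc_vector.
    alpha: assumed interpolation weight; 0 keeps the first-round ranking unchanged.
    """
    rescored = []
    for doc_id, score, vec in results:
        combined = (1 - alpha) * score + alpha * cosine(user_profile, vec)
        rescored.append((doc_id, combined))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy first-round result list: d1 scores highest but is unrelated to the profile.
first_round = [
    ("d1", 0.9, [1.0, 0.0]),
    ("d2", 0.8, [0.0, 1.0]),
    ("d3", 0.7, [0.6, 0.8]),
]
profile = [0.0, 1.0]

reranked_ids = [doc_id for doc_id, _ in rerank(first_round, profile)]
print(reranked_ids)  # ['d2', 'd3', 'd1']
```

In this toy run the document closest to the user profile moves to the top even though it ranked second in the first round. In a latent-semantics variant the vectors might come from a topic or factorization model, and in an external-semantics variant from an outside knowledge resource; both of those mappings are likewise assumptions here.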