• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Artificial Intelligence and Data Mining

Multi-strategy candidate construction and entity linking

YANG Ziyi (杨紫怡), SHENG Chen (盛晨), KONG Fang (孔芳), ZHOU Guodong (周国栋)

  1. (School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
  • Revised: 2017-11-29 Online: 2018-12-25 Published: 2018-12-25
  • Supported by:

    National Natural Science Foundation of China (61472264, 61333018, 61673290)

Abstract:

To address the candidate set construction problem in entity linking, we propose a candidate set construction algorithm that combines multiple strategies. The strategies are used together to extract complete mentions from the context, which reduces the number of candidate entities while improving the recall of the correct entity and yields a high-quality candidate set. We evaluate the proposed strategies on the TAC2014 English entity linking corpus, analyze the results, and identify the optimal candidate set construction strategy. The experiments show that the proposed method improves both the recall and the precision of candidate sets. We further verify that candidate set quality has a marked effect on the performance of the complete entity linking system: compared with the baseline algorithm, candidate sets extracted with the optimal construction strategy improve the performance of the whole entity linking system by 3.7%.

Key words: mention expansion, candidate construction, entity linking
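
The idea summarized in the abstract, expanding a short mention to its complete form in the context before looking up candidates, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the alias table, the entity identifiers, the `expand_mention` heuristic (pick the longest context mention that contains the query mention), and the toy queries are hypothetical and are not the authors' strategies or data.

```python
# Hypothetical toy alias table: lowercase surface form -> candidate entity IDs.
ALIAS_TABLE = {
    "obama": ["Barack_Obama", "Michelle_Obama", "Obama,_Fukui"],
    "barack obama": ["Barack_Obama"],
    "washington": ["Washington,_D.C.", "Washington_(state)", "George_Washington"],
    "george washington": ["George_Washington"],
    "soochow university": ["Soochow_University"],
}


def expand_mention(mention, context_mentions):
    """Return the longest context mention containing `mention` as a token sequence.

    Falls back to the original mention when no longer form is found.
    """
    tokens = mention.lower().split()
    n = len(tokens)
    best = mention
    for other in context_mentions:
        other_tokens = other.lower().split()
        contains = any(
            other_tokens[i:i + n] == tokens
            for i in range(len(other_tokens) - n + 1)
        )
        if contains and len(other_tokens) > len(best.split()):
            best = other
    return best


def build_candidates(mention, context_mentions, use_expansion=True):
    """Look up candidates for the (optionally expanded) mention."""
    surface = expand_mention(mention, context_mentions) if use_expansion else mention
    candidates = ALIAS_TABLE.get(surface.lower())
    if candidates is None:
        # The expanded form is unknown: fall back to the original mention.
        candidates = ALIAS_TABLE.get(mention.lower(), [])
    return list(candidates)


def evaluate(queries, use_expansion):
    """Return (recall of the gold entity, average candidate set size)."""
    hits = 0
    total_candidates = 0
    for mention, context_mentions, gold in queries:
        candidates = build_candidates(mention, context_mentions, use_expansion)
        hits += gold in candidates
        total_candidates += len(candidates)
    return hits / len(queries), total_candidates / len(queries)


# Hypothetical queries: (mention, other mentions in the document, gold entity).
QUERIES = [
    ("Obama", ["Barack Obama", "Obama", "White House"], "Barack_Obama"),
    ("Washington", ["George Washington", "Washington"], "George_Washington"),
    ("Soochow", ["Soochow University"], "Soochow_University"),
]

if __name__ == "__main__":
    print("without expansion:", evaluate(QUERIES, use_expansion=False))
    print("with expansion:   ", evaluate(QUERIES, use_expansion=True))
```

On this toy data, expansion raises recall from 2/3 to 1.0 and lowers the average candidate set size from 2.0 to 1.0. These numbers only illustrate the mechanism and say nothing about the results reported on the TAC2014 corpus.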