• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (09): 1661-1667.


A Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules

XIE Ming-hong1,2, RAN Qiang1,2, WANG Hong-bin1,2

  (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500;

  2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China)

  • Received: 2020-05-11 Revised: 2020-07-21 Accepted: 2021-09-25 Online: 2021-09-25 Published: 2021-09-27

Abstract: Distant supervision labels large-scale corpora by automatically aligning text with entities in a knowledge base, but its overly strong assumption introduces a large amount of noise into the resulting corpus. To address this problem, this paper proposes a Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules. Following the idea of multi-instance learning, the personal relationship instances are first grouped into bags. TongYiCi CiLin is then used to compute word frequency statistics over personal relationship trigger words, which yields the candidate relationships with the highest and second-highest word frequencies. Next, specific personal relationship judgment rules are applied to decide the personal relationship. After one personal relationship has been determined for a bag, multiple relationships are further predicted to obtain the final result. Experimental results on IPRE, a large-scale public Chinese data set for distant supervised personal relationship extraction, show that the method achieves a good F1 value and can identify personal relationships that are not labeled in the distant supervision test set.

Key words: TongYiCi CiLin, rules, distant supervision, personal relationship, relationship extraction
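
The bag construction and trigger-word frequency step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The trigger lexicon, relation labels, example sentences, and the function names `build_bags` and `candidate_relations` are all invented for illustration, and a small hand-written dictionary stands in for the synonym sets that would come from TongYiCi CiLin. The judgment rules and the multi-relationship prediction are not detailed in the abstract, so they are omitted.

```python
from collections import Counter, defaultdict

# Hypothetical trigger lexicon: relation label -> trigger words.
# In the paper's setting such sets would be derived from TongYiCi CiLin;
# these entries are assumptions made up for this sketch.
TRIGGERS = {
    "spouse": {"妻子", "丈夫", "夫人", "结婚"},
    "parent_child": {"父亲", "母亲", "儿子", "女儿"},
    "teacher_student": {"老师", "学生", "师从"},
}


def build_bags(instances):
    """Group (head, tail, tokens) instances into bags keyed by entity pair."""
    bags = defaultdict(list)
    for head, tail, tokens in instances:
        bags[(head, tail)].append(tokens)
    return bags


def candidate_relations(bag, triggers=TRIGGERS):
    """Return up to two relations ranked by trigger-word frequency in the bag."""
    counts = Counter()
    for tokens in bag:
        for relation, words in triggers.items():
            counts[relation] += sum(1 for token in tokens if token in words)
    # Keep only relations that were actually triggered, highest count first.
    ranked = [(rel, n) for rel, n in counts.most_common() if n > 0]
    return ranked[:2]


# Invented, pre-tokenized example sentences for one entity pair.
instances = [
    ("张三", "李四", ["张三", "和", "妻子", "李四", "结婚"]),
    ("张三", "李四", ["李四", "的", "丈夫", "张三", "是", "老师"]),
]
bags = build_bags(instances)
top2 = candidate_relations(bags[("张三", "李四")])
print(top2)
```

In this sketch the two returned candidates correspond to the highest and second-highest trigger frequencies. A rule-based step would then choose among them or keep both.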