• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (09): 1686-1692.

• Artificial Intelligence and Data Mining •

Core concept extraction method based on fuzzy Bayesian decision-making

Zhong Han1,2, Xu Yijia1, Lu Hao1, Sun Jingrui1

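  1. (1. College of Information and Network Safety, People’s Public Security University of China, Beijing 102623; 2. Key Laboratory of Safety Precautions and Risk Assessment, Ministry of Public Security, Beijing 102623)
  • Received: 2021-10-29 Revised: 2022-03-16 Accepted: 2022-09-25 Online: 2022-09-25 Published: 2022-09-25
  • Funding:
    National Social Science Fund of China (20AZD114); Ministry of Public Security Special Project for Basic Work on Strengthening Police through Science and Technology (2019GABJC01); Ministry of Public Security Soft Science Theory Research Program (2021LL39); Fundamental Research Funds for the Central Universities (2021JKF107)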

A core concept extracting method based on fuzzy Bayesian decision-making

ZHONG Han1,2, XU Yi-jia1, LU Hao1, SUN Jing-rui1

  1. (1. College of Information and Network Safety, People’s Public Security University of China, Beijing 102623;
    2. Key Laboratory of Safety Precautions and Risk Assessment, Ministry of Public Security, Beijing 102623, China)
  • Received: 2021-10-29 Revised: 2022-03-16 Accepted: 2022-09-25 Online: 2022-09-25 Published: 2022-09-25

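Abstract: To make core concept extraction in a specific domain more efficient, this paper proposes a core concept extraction method based on fuzzy Bayesian decision-making. A large amount of text is randomly sampled from the domain and segmented into words to obtain candidate concepts; the TF-IDF algorithm is then used to compute the feature values of each candidate concept, and concept membership is used to normalize those feature values; finally, Bayesian decision-making gives the probability that a candidate concept is a core concept. Experiments on core concept extraction from a finance-domain dataset show that the F1 score of the proposed method improves on those of TextRank, the LDA topic model, the word2vec word-clustering model, RNN, and LSTM. Taken together, the experimental results show that the method based on fuzzy Bayesian decision-making performs well at core concept extraction.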

Keywords: concept extraction; concept membership; Bayesian decision-making

Abstract: To improve the efficiency of core concept extraction in a specific domain, a core concept extraction method based on fuzzy Bayesian decision-making is proposed. First, a large amount of text is randomly sampled from the domain and segmented into words to obtain candidate concepts. Second, the feature values of the candidate concepts are calculated with the TF-IDF algorithm and normalized by conceptual membership. Finally, the probability that each candidate concept is a core concept is calculated by Bayesian decision-making. Experiments on core concept extraction from financial text show that the proposed method achieves a higher F1 score than TextRank, LDA, word2vec, RNN, and LSTM. Overall, the experimental results show that the method performs well in core concept extraction.

Key words: concept extraction, conceptual membership, Bayesian decision-making
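
The three stages named in the abstract (TF-IDF features, membership normalization, Bayesian decision) can be illustrated with a minimal sketch. The abstract does not specify which features are used, how membership is defined, or which likelihood model the Bayesian step relies on, so everything below is an assumption made for illustration: the two features (mean TF-IDF and document-frequency ratio), min-max scaling as the membership function, the rule that treats a membership degree m as the likelihood under "core" and 1 - m under "non-core", the prior, the threshold, and all function names. The input is assumed to be already segmented into words.

```python
import math
from collections import Counter


def tfidf_features(docs):
    """Return {term: (mean TF-IDF, document-frequency ratio)} for tokenized docs.

    The choice of these two features is an assumption, not the paper's.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    sums = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1  # smoothed IDF
            sums[term] += tf * idf
    return {t: (sums[t] / n_docs, df[t] / n_docs) for t in df}


def membership(features):
    """Min-max scale each feature to [0, 1] (assumed membership function)."""
    terms = list(features)
    columns = list(zip(*(features[t] for t in terms)))
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant feature carries no information, so map it to 0.5.
        scaled.append([(v - lo) / span if span else 0.5 for v in col])
    return {t: tuple(col[i] for col in scaled) for i, t in enumerate(terms)}


def core_probability(memberships, prior=0.2, eps=1e-6):
    """Posterior P(core | memberships) under the assumed likelihood model.

    Each membership degree m is used as P(feature | core) and 1 - m as
    P(feature | non-core); features are treated as independent.
    """
    p_core, p_other = prior, 1 - prior
    for m in memberships:
        m = min(max(m, eps), 1 - eps)  # clip so neither product becomes zero
        p_core *= m
        p_other *= 1 - m
    return p_core / (p_core + p_other)


def extract_core_concepts(docs, prior=0.2, threshold=0.5):
    """Rank candidates by posterior and keep those at or above the threshold."""
    degrees = membership(tfidf_features(docs))
    scores = {t: core_probability(m, prior) for t, m in degrees.items()}
    kept = [t for t, p in scores.items() if p >= threshold]
    return sorted(kept, key=lambda t: -scores[t])


# Toy, already-segmented finance-style documents (made up for illustration).
docs = [
    ["stock", "market", "stock", "index"],
    ["market", "bond", "yield"],
    ["stock", "fund", "market"],
]
print(extract_core_concepts(docs))
```

With real Chinese text, a word segmenter would have to produce the token lists first, and the prior and threshold would need to be estimated from labeled core concepts rather than fixed by hand.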