Construction and research of malware knowledge graph

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (01): 86-94.

• Computer Network and Information Security •

LUO Yangxia, LI Hao, WU Chenming

(School of Information, Xi'an University of Finance and Economics, Xi'an 710100, China)

Received: 2024-07-08 Revised: 2024-08-27 Accepted: 2025-01-25 Online: 2025-01-25 Published: 2025-01-18
Funding:
National Natural Science Foundation of China (62372373; 61972314); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-545); 2023 Graduate Innovation Fund of Xi'an University of Finance and Economics (23YC033); 2024 National Undergraduate Innovation Training Program (202411560029)

Abstract

Abstract: In recent years, knowledge graphs have been widely applied to malware analysis, but most researchers have focused on constructing malware API knowledge graphs and using them to detect malicious code. API knowledge graphs, however, are weakly interpretable and demand a high level of expertise to read. To address these issues, this paper proposes using a named entity recognition (NER) model to extract textual entity information, such as malware names and discovery locations, and building a malware knowledge graph from it. The graph is then used to reveal the diversity, evolution paths, threat methods, and classification associations of malware. First, the paper studies the construction method of a malware knowledge graph, covering data preprocessing, schema layer construction, and data layer construction. Second, it identifies and normalizes entities in structured and semi-structured malware data to complete ontology construction (entities, relationships, and additional attributes); with the schema layer guiding the data layer, the BERT-BiLSTM-CRF model is used for knowledge extraction. Finally, the Neo4j graph database is used to store and visualize the knowledge graph. The proposed model is validated by simulation on virus database data. Experimental results show that the model outperforms comparable models on the reported performance indicators, and the work is significant for making cybersecurity knowledge more accessible and for popularizing knowledge of defense systems.

Key words: knowledge graph, malware, knowledge extraction
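
The extraction-to-storage step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the NER model has already produced BIO tags for whitespace-separated tokens, and the entity types (`MALWARE`, `LOCATION`, `CATEGORY`), the relation names in `SCHEMA`, and the helper names `decode_bio`, `build_triples`, and `to_cypher` are all hypothetical. The Cypher uses `MERGE` with query parameters so that repeated mentions of the same entity map to a single node.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical schema layer: (head type, tail type) -> relation name.
# The real ontology in the paper is not given in the abstract.
SCHEMA: Dict[Tuple[str, str], str] = {
    ("MALWARE", "LOCATION"): "DISCOVERED_IN",
    ("MALWARE", "CATEGORY"): "BELONGS_TO",
}

Triple = Tuple[str, str, str, str]  # (head, relation, tail, tail_type)


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse BIO tags into (entity_text, entity_type) spans."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; flush the one in progress, if any.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current.append(token)
        else:
            # "O" or a stray I- tag: close the open span and skip the token.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities


def build_triples(entities: List[Tuple[str, str]]) -> List[Triple]:
    """Keep only the entity pairs that the schema layer allows."""
    triples: List[Triple] = []
    for head, head_type in entities:
        for tail, tail_type in entities:
            relation = SCHEMA.get((head_type, tail_type))
            if relation:
                triples.append((head, relation, tail, tail_type))
    return triples


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized Cypher MERGE query for one triple.

    Labels and relationship types cannot be query parameters in Cypher,
    so they come only from the fixed SCHEMA, never from raw text.
    """
    head, relation, tail, tail_type = triple
    label = tail_type.capitalize()
    query = (
        "MERGE (m:Malware {name: $head}) "
        f"MERGE (t:{label} {{name: $tail}}) "
        f"MERGE (m)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Made-up sentence and tags, for illustration only.
tokens = ["SampleWorm", "variant", "found", "in", "Region", "X"]
tags = ["B-MALWARE", "O", "O", "O", "B-LOCATION", "I-LOCATION"]
for triple in build_triples(decode_bio(tokens, tags)):
    print(to_cypher(triple))
```

Each `(query, params)` pair could then be passed to a Neo4j session, for example via the official Python driver's `session.run(query, params)`. For character-level Chinese tokens, the `" ".join` calls would need to become `"".join`.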

LUO Yangxia, LI Hao, WU Chenming. Construction and research of malware knowledge graph[J]. Computer Engineering & Science, 2025, 47(01): 86-94.
