Toxic comments detection based on bidirectional capsule network

Computer Engineering & Science, 2024, Vol. 46, Issue (10): 1765-1774.

• Computer Network and Information Security •

LI Gong-jin1, SHAO Yu-bin1, DU Qing-zhi1, LONG Hua1,2, MA Di-nan2

(1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China;
2. Yunnan Key Laboratory of Media Integration, Kunming 650032, China)

Received: 2023-07-11 Revised: 2023-10-31 Accepted: 2024-10-25 Online: 2024-10-25 Published: 2024-10-29

Funding: Yunnan Key Laboratory of Media Integration Project (320225403)

Abstract

Existing detection models struggle to identify toxic comments whose language style varies widely and whose meaning is implicit. To address this, a toxic comment detection model based on a bidirectional capsule network is proposed. First, the BERT model is used to produce word embeddings for the comment text, forming an input matrix. Second, the input matrix is passed to a bidirectional feature extraction layer composed of stacked LSTMs, a bidirectional capsule network, and an attention network. This layer captures deep semantic information from the text in both the forward and backward directions; the resulting forward and backward matrices are concatenated and fed into an attention mechanism, which focuses on words associated with toxic comments and generates an output vector. Third, the output vector is concatenated with a contextual auxiliary feature vector to enrich the feature representation. Finally, the concatenated vector is fed into a fully connected layer, and the comment text is classified using a Sigmoid activation function. Experiments on the Wikipedia toxic comment dataset show that, compared with existing work, the proposed model achieves a significant performance improvement, captures richer semantic information from comment text, and detects toxic comments effectively.
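The last three steps described in the abstract (attention over the concatenated forward and backward features, concatenation with the contextual auxiliary feature vector, and Sigmoid classification) can be illustrated with a minimal pure-Python sketch. The abstract does not specify layer sizes, the attention scoring function, or what the auxiliary features contain, so the dot-product attention, the toy weights, and the helper names `squash`, `attention_pool`, and `classify` are all assumptions made for illustration, not the authors' implementation. The `squash` function is the standard capsule nonlinearity, shown only to indicate how capsule outputs are length-normalized.

```python
import math

def squash(s, eps=1e-9):
    """Standard capsule nonlinearity: keeps direction, maps length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in s]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Assumed dot-product attention: weight each token vector and sum them."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * feat[i] for w, feat in zip(weights, features))
              for i in range(dim)]
    return pooled, weights

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify(forward, backward, context, query, fc_weights, fc_bias):
    """Concatenate directions, attend, append context features, apply FC + Sigmoid."""
    # Per-token concatenation of forward and backward feature vectors.
    combined = [f + b for f, b in zip(forward, backward)]
    pooled, _ = attention_pool(combined, query)
    # Concatenate the attention output with the contextual auxiliary features.
    joint = pooled + context
    logit = sum(w * x for w, x in zip(fc_weights, joint)) + fc_bias
    return sigmoid(logit)

# Toy inputs (hypothetical values): three tokens, 2-dimensional capsule outputs
# per direction, and a 2-dimensional contextual auxiliary feature vector.
forward = [squash([0.5, 0.1]), squash([2.0, -1.0]), squash([0.0, 0.3])]
backward = [squash([0.2, 0.2]), squash([1.5, 0.5]), squash([-0.1, 0.4])]
context = [1.0, 0.0]
query = [1.0, 0.0, 1.0, 0.0]
fc_weights = [0.8, -0.2, 0.6, 0.1, 0.5, -0.3]

prob = classify(forward, backward, context, query, fc_weights, fc_bias=-0.5)
print(f"toxicity probability: {prob:.3f}")
```

In a trained model the query vector and the fully connected weights would be learned, and the forward and backward features would come from the BERT, LSTM, and capsule layers rather than being typed in by hand.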

Key words: BERT language model, bidirectional capsule network, contextual auxiliary features, toxic comments detection

LI Gong-jin, SHAO Yu-bin, DU Qing-zhi, LONG Hua, MA Di-nan. Toxic comments detection based on bidirectional capsule network[J]. Computer Engineering & Science, 2024, 46(10): 1765-1774.
