• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2026, Vol. 48 ›› Issue (5): 906-913. doi: 10.3969/j.issn.1007-130X.2026.05.014

• Artificial Intelligence and Data Mining •

An unbiased offensive text detection method based on BERT and sentiment analysis

YUAN Liang, GUO Weibin

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2024-07-23 Revised: 2024-11-06 Online: 2026-05-25 Published: 2026-05-21

Abstract: Offensive content on the internet seriously harms individuals and society. Existing offensive text detection methods have two weaknesses: they misclassify non-offensive texts that contain profanity as offensive, and they are biased against special groups. To address the first issue, this paper proposes a sentiment analysis-based offensive text detection (SAOD) model, which uses sentiment features to help predict whether a text is offensive. To address the second, it proposes a debiasing data augmentation method called special groups mask (SGM), which masks mentions of special groups during training so that they are not directly involved in model training, thereby reducing the model's bias towards these groups. Using BERT+LSTM as the base model, experiments were conducted on the publicly available datasets ToxiCN and COLD. The results show that SAOD improves the base model's F1-score from 80.18% to 82.67%, and that adding SGM on top of SAOD reduces the false positive rate (FPR) from 18.27% to 12.77%.
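
As a rough illustration of the masking idea behind SGM and of the FPR metric reported above, the following Python sketch may help. It is not the authors' implementation: the term list `GROUP_TERMS`, the choice of BERT's `[MASK]` token as the replacement, and the function names are all assumptions made for this example, since the abstract does not specify them. Matching is done on plain substrings because the datasets are Chinese and have no whitespace word boundaries.

```python
import re

# Hypothetical lexicon of special-group terms (assumption; the paper's
# actual list is not given in the abstract).
GROUP_TERMS = ["女性", "男性", "老年人"]

# Replacement token (assumption: BERT's standard mask token).
MASK_TOKEN = "[MASK]"

# Longest terms first so that longer terms win over their substrings.
_PATTERN = re.compile(
    "|".join(re.escape(t) for t in sorted(GROUP_TERMS, key=len, reverse=True))
)


def mask_special_groups(text: str) -> str:
    """Replace every listed group term in `text` with MASK_TOKEN."""
    return _PATTERN.sub(MASK_TOKEN, text)


def false_positive_rate(y_true, y_pred) -> float:
    """FPR = FP / (FP + TN), with label 1 = offensive, 0 = non-offensive."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0
```

In a setup like this, `mask_special_groups` would be applied to training texts before tokenization, so that the classifier never sees the group terms themselves, and `false_positive_rate` would be computed on non-offensive test texts that mention those groups.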

Key words: BERT model, offensive text detection, sentiment analysis, debiasing, data augmentation