
计算机工程与科学 (Computer Engineering & Science), 2026, Vol. 48, Issue 2: 245-255.

• 人工智能与数据挖掘 (Artificial Intelligence and Data Mining) •

一种基于因果关系的减轻大语言模型幻觉的方法

李鹤,迟昊昂,刘明宇,杨文婧


  1. (国防科技大学计算机学院,湖南 长沙 410073)

  • 收稿日期:2024-08-24 修回日期:2024-10-10 出版日期:2026-02-25 发布日期:2026-03-10
  • 基金资助:
    国家自然科学基金 (91948303-1,62372459,62376282)

An LLM hallucination mitigation method based on causal relationships

LI He, CHI Haoang, LIU Mingyu, YANG Wenjing

  1. (College of Computer Science and Technology,National University of Defense Technology,Changsha 410073,China)
  • Received:2024-08-24 Revised:2024-10-10 Online:2026-02-25 Published:2026-03-10

摘要: 大语言模型(LLMs)的出现是生成式人工智能的一个里程碑,其在文本理解和生成任务中取得了显著的成功。尽管LLMs在许多下游任务中表现出色,但它们也存在严重的幻觉问题,这对LLMs的实际应用构成了重大挑战。自注意力机制是基于Transformer的LLMs中的重要模块,但现有文献很少从自注意力机制的角度探讨LLMs的幻觉现象。为填补这一研究空白,本文从因果关系的角度研究了这个问题。具体而言,提出了一种在不改变LLMs结构的情况下禁用自注意力层的方法。实验禁用了多个开源LLMs中的不同自注意力层,在幻觉评估基准上对这些干预后的LLMs进行了评估,并将其幻觉程度与原始模型进行比较。实验结果表明,禁用LLMs前部或尾部的某些特定自注意力层可以缓解幻觉问题。


关键词: 大语言模型, 大语言模型幻觉, 因果表示学习

Abstract: The emergence of large language models (LLMs) marks a milestone in generative artificial intelligence, with remarkable success in text comprehension and generation tasks. Despite this success across numerous downstream tasks, LLMs also suffer from severe hallucination, which poses a significant challenge to their practical application. Although the self-attention mechanism is a crucial module of Transformer-based LLMs, existing literature rarely explores the hallucination phenomenon in LLMs from this perspective. To fill this research gap, this study investigates the issue from a causal standpoint. Specifically, a method is proposed to disable self-attention layers without altering the structure of the LLMs. Experiments disable different self-attention layers in multiple open-source LLMs, evaluate the intervened models on hallucination assessment benchmarks, and compare their hallucination levels with those of the original models. The results indicate that disabling certain self-attention layers near the front or the end of an LLM can alleviate the hallucination problem.
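The abstract does not specify how the intervention is implemented, so the following is only an illustrative sketch, under the assumption that "disabling" a self-attention layer means replacing the attention sublayer's output with zeros so that the residual connection passes the layer input through unchanged. The toy single-head Transformer block below (no layer norm or MLP, for brevity) shows why such an ablation leaves the architecture intact while removing that layer's attention computation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over a (seq_len, d) input.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

def transformer_block(x, params, ablate_attention=False):
    # Hypothetical ablation: zero the sublayer output instead of removing
    # the layer, so the model's structure is unchanged (an assumption, not
    # the paper's stated implementation).
    if ablate_attention:
        attn_out = np.zeros_like(x)
    else:
        attn_out = self_attention(x, *params)
    # Residual connection: when ablated, x flows through unchanged.
    return x + attn_out
```

With `ablate_attention=True` the block reduces to the identity on the residual stream; in a real LLM the analogous intervention could be applied per layer (e.g., via forward hooks) to study how each layer's attention contributes to hallucination.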

Key words: large language models (LLMs), hallucination in large language models, causal representation learning