• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2026, Vol. 48 ›› Issue (2): 268-276.

• Artificial Intelligence and Data Mining •

A multi-stage collaborative reasoning framework for legal question answering with large language models

FU Qihang, QIN Yongbin, HUANG Ruizhang, ZHOU Yulin, HU Qingqing

  (1. Engineering Research Center of Text Computing and Cognitive Intelligence, Ministry of Education, Guizhou University, Guiyang 550025;
    2. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025;
    3. College of Computer Science and Technology, Guizhou University, Guiyang 550025;
    4. School of Law and Public Administration, Guizhou Qiannan College of Science and Technology, Huishui 550600, China)
  • Received: 2025-07-20  Revised: 2025-09-10  Online: 2026-02-25  Published: 2026-03-10

Abstract: In recent years, large language models (LLMs) have shown broad promise in the judicial field. However, in the knowledge-intensive reasoning and complex logical judgment tasks that arise in judicial question-answering scenarios, challenges such as inadequate reasoning capabilities and imprecise application of legal knowledge persist. To address these issues, this paper proposes a decoupled collaborative reasoning framework (DCRF) that separates “thinking” from “reasoning” in a multi-stage cooperative process. First, a fine-tuned lightweight “Thinker” generates high-level reasoning chains that guide downstream reasoning strategies. Then, an unmodified Qwen1.5-14B-chat “Reasoner”, supported by retrieval-augmented generation over relevant statutory texts, performs fine-grained logical inference. By coordinating strategic planning with execution, the framework significantly improves the flexibility and accuracy with which the model invokes legal knowledge, while avoiding the high cost of fine-tuning large models and reducing overall training overhead. On the JEC-QA and DISC-Law-Eval benchmarks, DCRF achieves an average improvement of 9.77 percentage points in accuracy on single-choice questions and an average increase of 7.48 percentage points in F1-score on multiple-choice questions over the base models. Notably, it surpasses DeepSeek-R1-Distill-Qwen-14B on single-choice questions and performs comparably on multiple-choice questions. Experimental results indicate that DCRF effectively strengthens the judicial reasoning capabilities of large language models while reducing training costs.
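The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration of the Thinker → retrieval → Reasoner control flow only; all function names, the toy keyword retrieval, and the prompt layout are assumptions for exposition, not the authors' actual implementation, and the real system would call the fine-tuned Thinker model and the unmodified Qwen1.5-14B-chat Reasoner where the stubs return strings.

```python
# Hypothetical sketch of the DCRF pipeline: a lightweight "Thinker"
# plans, a retriever fetches statutory texts, and a large "Reasoner"
# performs the fine-grained inference. Names are illustrative.

def thinker(question: str) -> str:
    """Stub for the fine-tuned lightweight model: emit a high-level
    reasoning chain that guides the downstream reasoning strategy."""
    return f"Plan: identify the legal issue in '{question}', then apply the relevant statutes."

def retrieve_statutes(question: str, corpus: dict[str, str]) -> list[str]:
    """Toy retrieval-augmented generation step: return statute texts
    whose keyword appears in the question (a real system would use a
    dense or BM25 retriever over a statutory corpus)."""
    return [text for keyword, text in corpus.items() if keyword in question]

def reasoner(question: str, plan: str, statutes: list[str]) -> str:
    """Stub for the unmodified large model (e.g. Qwen1.5-14B-chat):
    condition fine-grained inference on the plan and retrieved texts."""
    prompt = "\n".join([plan, *statutes, f"Question: {question}"])
    return prompt  # a real system would send this prompt to the LLM

def dcrf_answer(question: str, corpus: dict[str, str]) -> str:
    """Coordinate strategic planning (Thinker) with execution (Reasoner)."""
    plan = thinker(question)
    statutes = retrieve_statutes(question, corpus)
    return reasoner(question, plan, statutes)
```

Only the small Thinker is fine-tuned; the large Reasoner is used as-is, which is what lets the framework avoid the cost of fine-tuning the large model.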

Key words: multi-stage reasoning, large language models, legal reasoning, retrieval-augmented generation, instruction tuning