• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (5): 931-939.

• Artificial Intelligence and Data Mining •

Research on Chinese-traditional Mongolian cross-lingual summarization methods in low-resource scenarios

BAN Qi1,2, YUN Jing1,2, DENG Lei1,2

  (1. College of Data Science and Application (College of Cyber Security),
     Inner Mongolia University of Technology, Hohhot 010080, China;
   2. Inner Mongolia Autonomous Region Engineering & Technology Research Center of
     Big Data Based Software Service, Hohhot 010080, China)
  • Received: 2024-08-15; Revised: 2024-08-29; Online: 2025-05-25; Published: 2025-05-27

Abstract: Cross-lingual summarization aims to generate a summary in a target language (such as traditional Mongolian) from a source document in another language (such as Chinese). Typically, traditional multi-task frameworks employ sequence-to-sequence networks with multiple decoders, each dedicated to a specific task. However, when a document is carried across languages, such architectures cannot effectively capture the relationships and differences between the two languages, because the languages differ in their morphological and structural characteristics. This is particularly evident for traditional Mongolian, whose complex morphological changes and diverse word-formation patterns make learning and processing its linguistic features challenging under low-resource conditions. To address this challenge, we propose a cross-lingual summarization model that embeds consistency learning into a multi-task framework. Consistency is modeled by computing a distance metric over the difference between the probability distributions of the source-language summary and the generated target-language summary. The cross-lingual summarization model is then optimized under the joint constraints of a cross-entropy loss and a consistency loss. Furthermore, we build a Chinese-Mongolian cross-lingual summarization dataset. The competitive ROUGE scores obtained on this dataset demonstrate the effectiveness of the proposed model under resource-poor conditions.
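The sketch below illustrates how such a consistency-constrained objective could look in code. It is a minimal illustration, not the authors' implementation: the abstract does not specify the distance metric or the loss weighting, so the symmetric KL divergence, the shared multilingual vocabulary, the same-shape decoder outputs for both summaries, and the weight lam are all assumptions made here for concreteness (an L2 distance or Jensen-Shannon divergence would fit the description equally well).

import torch
import torch.nn.functional as F

def consistency_loss(src_logits, tgt_logits):
    # Distance between the token-level probability distributions of the
    # source-language and target-language summary decoders. A symmetric KL
    # divergence is assumed here; the paper only describes "a distance
    # metric of the probability distribution difference".
    log_p = F.log_softmax(src_logits, dim=-1)
    log_q = F.log_softmax(tgt_logits, dim=-1)
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction="batchmean")  # KL(p || q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction="batchmean")  # KL(q || p)
    return 0.5 * (kl_pq + kl_qp)

def total_loss(src_logits, src_labels, tgt_logits, tgt_labels, lam=1.0):
    # Joint objective: cross-entropy for both summarization tasks plus the
    # consistency term; lam is an assumed weighting hyperparameter.
    vocab = src_logits.size(-1)
    ce_src = F.cross_entropy(src_logits.view(-1, vocab), src_labels.view(-1))
    ce_tgt = F.cross_entropy(tgt_logits.view(-1, vocab), tgt_labels.view(-1))
    return ce_src + ce_tgt + lam * consistency_loss(src_logits, tgt_logits)

if __name__ == "__main__":
    # Toy shapes: batch of 2, summary length 5, shared vocabulary of 100.
    B, T, V = 2, 5, 100
    src_logits = torch.randn(B, T, V)
    tgt_logits = torch.randn(B, T, V)
    src_labels = torch.randint(0, V, (B, T))
    tgt_labels = torch.randint(0, V, (B, T))
    print(total_loss(src_logits, src_labels, tgt_logits, tgt_labels).item())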


Key words: Chinese-Mongolian cross-lingual summarization, consistency learning, low-resource