• A journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2024, Vol. 46, Issue (01): 170-178.

• Artificial Intelligence and Data Mining •

Cross-lingual AMR parsing based on unsupervised pre-training

FAN Lin-yu, LI Jun-hui, KONG Fang

  1. (School of Computer Science & Technology, Soochow University, Suzhou 215006, China)
  • Received: 2022-10-21  Revised: 2022-12-05  Accepted: 2024-01-25  Online: 2024-01-25  Published: 2024-01-15

Abstract: AMR (Abstract Meaning Representation) abstracts the semantics of a given text into a single-rooted directed acyclic graph. Because AMR datasets for non-English languages are scarce, cross-lingual AMR parsing aims to parse non-English text into the AMR graph of its English translation. Existing cross-lingual AMR parsing methods rely on large-scale English-target-language parallel corpora or high-performance English-target-language translation models to build (English, target language, AMR) triplet corpora for target-language AMR parsing. In contrast, this paper explores whether cross-lingual AMR parsing can be achieved with only large-scale monolingual English and target-language corpora. To this end, we propose cross-lingual AMR parsing based on unsupervised pre-training. Specifically, during pre-training we jointly train an unsupervised neural machine translation task, an English AMR parsing task, and a target-language AMR parsing task on a shared model. During fine-tuning, we perform single-task fine-tuning on a target-language AMR dataset derived from the English AMR 2.0 dataset. Experimental results on the AMR 2.0 and multilingual AMR test sets show that our method achieves Smatch F1 scores of 67.89, 68.04, and 67.99 for German, Spanish, and Italian, respectively.
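To make the multi-task pre-training setup above concrete, the following is a minimal, hypothetical sketch (not the authors' code): one shared seq2seq model is updated by sampling batches from the three pre-training tasks. The Seq2Seq class, the fake_batch loader, the vocabulary size, and the uniform task-sampling schedule are all placeholder assumptions; the paper's actual architecture and mixing strategy may differ.

```python
# Minimal sketch (assumed, not the paper's implementation): multi-task
# pre-training that mixes unsupervised MT, English AMR parsing, and
# target-language AMR parsing batches through one shared seq2seq model.
import random
import torch
import torch.nn as nn

VOCAB_SIZE, PAD = 32000, 0  # assumed shared subword vocabulary


class Seq2Seq(nn.Module):
    """Tiny stand-in for the shared encoder-decoder."""
    def __init__(self, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model, padding_idx=PAD)
        self.transformer = nn.Transformer(d_model=d_model, nhead=8,
                                          num_encoder_layers=2,
                                          num_decoder_layers=2,
                                          batch_first=True)
        self.out = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, src, tgt):
        # Causal mask so the decoder only attends to earlier target tokens.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt), tgt_mask=tgt_mask)
        return self.out(h)


def fake_batch(batch=4, src_len=12, tgt_len=10):
    """Placeholder for a real data loader yielding (source, target) id tensors."""
    return (torch.randint(1, VOCAB_SIZE, (batch, src_len)),
            torch.randint(1, VOCAB_SIZE, (batch, tgt_len)))


model = Seq2Seq()
optim = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

# The three pre-training tasks share all parameters; here we simply sample
# one task per step instead of the paper's exact mixing schedule.
tasks = ["unsupervised_mt", "en_amr_parsing", "tgt_amr_parsing"]

for step in range(10):
    task = random.choice(tasks)
    src, tgt = fake_batch()           # in practice: a batch from that task's corpus
    logits = model(src, tgt[:, :-1])  # teacher forcing: predict the next token
    loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), tgt[:, 1:].reshape(-1))
    optim.zero_grad()
    loss.backward()
    optim.step()
```

Fine-tuning would then reuse the same model and training loop, but draw batches only from the target-language AMR dataset.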


Key words: cross-lingual AMR parsing, seq2seq model, pre-trained model