Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (01): 170-178.
• Artificial Intelligence and Data Mining •
Cross-lingual AMR parsing based on unsupervised pre-training
FAN Lin-yu, LI Jun-hui, KONG Fang
Abstract: Abstract Meaning Representation (AMR) abstracts the semantics of a given text into a single-rooted directed acyclic graph. Because AMR datasets are scarce for languages other than English, cross-lingual AMR parsing aims to parse non-English text into the AMR graph of its English translation. Existing cross-lingual AMR parsing methods rely on large-scale English-target-language parallel corpora or high-performance English-to-target-language translation models to build (English, target language, AMR) triplet corpora for target-language AMR parsing. Departing from this assumption, this paper explores whether cross-lingual AMR parsing can be achieved with only large-scale monolingual English and target-language corpora, and proposes a cross-lingual AMR parsing approach based on unsupervised pre-training. Specifically, pre-training jointly covers unsupervised neural machine translation, English AMR parsing, and target-language AMR parsing; fine-tuning is then performed as a single task on a target-language AMR dataset derived from English AMR 2.0. Experimental results on AMR 2.0 and a multilingual AMR test set show that the method achieves Smatch F1 scores of 67.89, 68.04, and 67.99 for German, Spanish, and Italian, respectively.
Key words: cross-lingual AMR parsing, seq2seq model, pre-trained model
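The pipeline summarized in the abstract, multi-task pre-training over unsupervised machine translation, English AMR parsing, and target-language AMR parsing, followed by single-task fine-tuning on target-language AMR data, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Hugging Face mBART-50 checkpoint as the seq2seq backbone, toy in-line sentence/AMR pairs in place of the large monolingual and silver corpora, linearized PENMAN strings as decoder targets, and simply reuses the English language code for the AMR side.

```python
# Minimal sketch of the pre-training/fine-tuning recipe described in the abstract.
# Assumptions (not from the paper): mBART-50 as the seq2seq model, toy in-line
# examples instead of large corpora, PENMAN strings as AMR targets, and the
# English language code ("en_XX") reused for AMR output.
import random
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
name = "facebook/mbart-large-50"
tokenizer = MBart50TokenizerFast.from_pretrained(name)
model = MBartForConditionalGeneration.from_pretrained(name).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def train_step(src, tgt, src_lang, tgt_lang):
    """One seq2seq update: encode src, compute cross-entropy against tgt."""
    tokenizer.src_lang, tokenizer.tgt_lang = src_lang, tgt_lang
    batch = tokenizer(src, text_target=tgt, return_tensors="pt",
                      padding=True, truncation=True).to(device)
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

AMR = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))"
tasks = {
    # unsupervised MT, e.g. a back-translated (German, English) pair
    "unmt":    ("Der Junge will gehen.", "The boy wants to go.", "de_DE", "en_XX"),
    # English AMR parsing: English sentence -> linearized AMR graph
    "en_amr":  ("The boy wants to go.", AMR, "en_XX", "en_XX"),
    # target-language AMR parsing: German sentence -> the same AMR graph
    "tgt_amr": ("Der Junge will gehen.", AMR, "de_DE", "en_XX"),
}

# Multi-task pre-training: sample one of the three tasks per step.
for _ in range(100):
    src, tgt, sl, tl = tasks[random.choice(list(tasks))]
    train_step(src, tgt, sl, tl)

# Single-task fine-tuning: target-language sentence -> AMR only.
for _ in range(100):
    src, tgt, sl, tl = tasks["tgt_amr"]
    train_step(src, tgt, sl, tl)
```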
FAN Lin-yu, LI Jun-hui, KONG Fang. Cross-lingual AMR parsing based on unsupervised pre-training[J]. Computer Engineering & Science, 2024, 46(01): 170-178.
URL: http://joces.nudt.edu.cn/EN/Y2024/V46/I01/170