QAmeleon: Multilingual QA with Only 5 Examples

P Agrawal, C Alberti, F Huot, J Maynez, J Ma… - Transactions of the Association for Computational Linguistics, 2023 - direct.mit.edu
Abstract
The availability of large, high-quality datasets has been a major driver of recent progress in question answering (QA). Such annotated datasets, however, are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-trained language models (PLMs) under a few-shot learning setting. Our approach, QAmeleon, uses a PLM to automatically generate multilingual data upon which QA models are fine-tuned, thus avoiding costly annotation. Prompt tuning the PLM with only five examples per language delivers accuracy superior to translation-based baselines; it bridges nearly 60% of the gap between an English-only baseline and a fully supervised upper bound fine-tuned on almost 50,000 hand-labeled examples; and consistently leads to improvements compared to directly fine-tuning a QA model on labeled examples in low-resource settings. Experiments on the TyDiQA-GoldP and MLQA benchmarks show that few-shot prompt tuning for data synthesis scales across languages and is a viable alternative to large-scale annotation.
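
As a rough illustration of the pipeline the abstract describes, the sketch below uses in-context few-shot prompting with a Hugging Face multilingual seq2seq checkpoint as a stand-in for the paper's prompt-tuned PLM. The checkpoint name (`google/mt5-base`), the prompt format, and the `seed_examples` structure are all illustrative assumptions, not the authors' actual setup; only the overall shape (five labeled examples per language drive synthesis of QA pairs, which then serve as fine-tuning data) comes from the abstract.

```python
# Minimal sketch of a QAmeleon-style data-synthesis loop, assuming a
# Hugging Face seq2seq checkpoint stands in for the paper's prompt-tuned PLM.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/mt5-base"  # assumption: any multilingual seq2seq PLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Five hand-labeled (passage, question, answer) triples per target language
# play the role of the paper's five-example supervision.
seed_examples = {
    "bn": [("<passage 1>", "<question 1>", "<answer 1>")],  # ... five per language
}

def build_prompt(language, passage):
    """Few-shot prompt: the seed triples followed by an unlabeled passage.
    In-context prompting here is a simplified stand-in for prompt tuning."""
    shots = "\n\n".join(
        f"Passage: {p}\nQuestion: {q}\nAnswer: {a}"
        for p, q, a in seed_examples[language]
    )
    return f"{shots}\n\nPassage: {passage}\nQuestion:"

def synthesize(language, passages, max_new_tokens=64):
    """Generate synthetic QA text for each passage in the target language."""
    pairs = []
    for passage in passages:
        inputs = tokenizer(build_prompt(language, passage), return_tensors="pt")
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
        pairs.append((passage, tokenizer.decode(output[0], skip_special_tokens=True)))
    return pairs

# Per the abstract, the synthetic multilingual pairs would then be used to
# fine-tune a downstream QA model in place of large hand-annotated datasets.
```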