QAmeleon: Multilingual QA with Only 5 Examples

P Agrawal, C Alberti, F Huot, J Maynez, J Ma… - Transactions of the Association for Computational Linguistics, 2023 - direct.mit.edu
Abstract
The availability of large, high-quality datasets has been a major driver of recent progress in question answering (QA). Such annotated datasets, however, are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-trained language models (PLMs) under a few-shot learning setting. Our approach, QAmeleon, uses a PLM to automatically generate multilingual data upon which QA models are fine-tuned, thus avoiding costly annotation. Prompt tuning the PLM with only five examples per language delivers accuracy superior to translation-based baselines; it bridges nearly 60% of the gap between an English-only baseline and a fully supervised upper bound fine-tuned on almost 50,000 hand-labeled examples; and consistently leads to improvements compared to directly fine-tuning a QA model on labeled examples in low-resource settings. Experiments on the TyDiQA-GoldP and MLQA benchmarks show that few-shot prompt tuning for data synthesis scales across languages and is a viable alternative to large-scale annotation.
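
As a rough illustration of the pipeline the abstract describes, the sketch below uses in-context few-shot prompting with a Hugging Face multilingual seq2seq checkpoint as a stand-in for the paper's prompt-tuned PLM. The checkpoint name (`google/mt5-base`), the prompt format, and the `seed_examples` structure are all illustrative assumptions, not the authors' actual setup; only the overall shape (five labeled examples per language drive synthesis of QA pairs, which then serve as fine-tuning data) comes from the abstract.

```python
# Minimal sketch of a QAmeleon-style data-synthesis loop, assuming a
# Hugging Face seq2seq checkpoint stands in for the paper's prompt-tuned PLM.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/mt5-base"  # assumption: any multilingual seq2seq PLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Five hand-labeled (passage, question, answer) triples per target language
# play the role of the paper's five-example supervision.
seed_examples = {
    "bn": [("<passage 1>", "<question 1>", "<answer 1>")],  # ... five per language
}

def build_prompt(language, passage):
    """Few-shot prompt: the seed triples followed by an unlabeled passage.
    In-context prompting here is a simplified stand-in for prompt tuning."""
    shots = "\n\n".join(
        f"Passage: {p}\nQuestion: {q}\nAnswer: {a}"
        for p, q, a in seed_examples[language]
    )
    return f"{shots}\n\nPassage: {passage}\nQuestion:"

def synthesize(language, passages, max_new_tokens=64):
    """Generate synthetic QA text for each passage in the target language."""
    pairs = []
    for passage in passages:
        inputs = tokenizer(build_prompt(language, passage), return_tensors="pt")
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
        pairs.append((passage, tokenizer.decode(output[0], skip_special_tokens=True)))
    return pairs

# Per the abstract, the synthetic multilingual pairs would then be used to
# fine-tune a downstream QA model in place of large hand-annotated datasets.
```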