Phrase-indexed question answering: A new challenge for scalable document comprehension

M Seo, T Kwiatkowski, AP Parikh, A Farhadi… - arXiv preprint arXiv …, 2018 - arxiv.org
arXiv preprint arXiv:1804.07726, 2018
We formalize a new modular variant of current question answering tasks by enforcing complete independence of the document encoder from the question encoder. This formulation addresses a key challenge in machine comprehension by requiring a standalone representation of the document discourse. It also yields a significant scalability advantage, since the encodings of the answer candidate phrases in the document can be pre-computed and indexed offline for efficient retrieval. We experiment with baseline models for the new task, which achieve reasonable accuracy but significantly underperform unconstrained QA models. We invite the QA research community to engage in Phrase-Indexed Question Answering (PIQA, pika) to close the gap. The leaderboard is at: nlp.cs.washington.edu/piqa
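
The independence constraint described above can be illustrated with a minimal Python sketch. It is not the paper's baseline model. The bag-of-words encoder, the context window, and the names `encode_bag`, `build_phrase_index`, and `answer` are all hypothetical stand-ins chosen for illustration. The only point is the interface: phrase vectors are built from the document alone, the question is encoded separately, and the two meet only through an inner product.

```python
import math
from collections import Counter


def encode_bag(tokens):
    """Toy encoder (assumption): L2-normalized bag-of-words vector as a dict."""
    counts = Counter(t.lower() for t in tokens)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}


def dot(u, v):
    """Inner product of two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def build_phrase_index(tokens, max_len=2, window=3):
    """Offline step: encode every candidate span using the document only."""
    index = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            # Words around the span stand in for a learned phrase encoding.
            context = tokens[max(0, start - window):start] + tokens[end:end + window]
            index.append((" ".join(tokens[start:end]), encode_bag(context)))
    return index


def answer(question_tokens, index):
    """Online step: encode the question alone, return the best-scoring phrase."""
    q = encode_bag(question_tokens)
    return max(index, key=lambda entry: dot(q, entry[1]))[0]


document = "Hamlet was written by Shakespeare in London".split()
index = build_phrase_index(document)  # computed once, before any question is seen
prediction = answer("who was Hamlet written by".split(), index)
```

Because `build_phrase_index` never receives the question, its output could be stored and reused for any number of questions. A word-overlap encoder this simple is not expected to pick good answers, so `prediction` should be read as a demonstration of the data flow rather than of accuracy.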