Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Lorenzo Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, and Laurent Sifre. 2021. Improving language models by retrieving from trillions of tokens. In ICML.
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc.
Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading Wikipedia to answer open-domain questions. In Association for Computational Linguistics (ACL).
Jingfei Du, Edouard Grave, Beliz Gunel, Vishrav Chaudhary, Onur Çelebi, Michael Auli, Ves Stoyanov, and Alexis Conneau. 2021. Self-training improves pre-training for natural language understanding. In NAACL.
Angela Fan, Claire Gardent, Chloé Braud, and Antoine Bordes. 2021. Augmenting transformers with KNN-based composite memory for dialog. Transactions of the Association for Computational Linguistics, 9:82–99.
Luyu Gao, Zhuyun Dai, and Jamie Callan. 2021.