Veronica Latcinnik and Jonathan Berant. 2020. Ex-
plaining question answering models through text
generation. arXiv preprint arXiv:2004.05569.
Rémi Lebret, David Grangier, and Michael Auli. 2016.
Neural text generation from structured data with ap-
plication to the biography domain. arXiv preprint
arXiv:1603.07771.
Nayeon Lee, Andrea Madotto, and Pascale Fung. 2019.
Exploring social bias in chatbots using stereotype
knowledge. In Proceedings of the 2019 Workshop
on Widening NLP, pages 177–180.
Mike Lewis, Yinhan Liu, Naman Goyal, Mar-
jan Ghazvininejad, Abdelrahman Mohamed, Omer
Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019.
BART: Denoising sequence-to-sequence pre-training
for natural language generation, translation, and
comprehension. arXiv preprint arXiv:1910.13461.
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio
Petroni, Vladimir Karpukhin, Naman Goyal, Hein-
rich Küttler, Mike Lewis, Wen-tau Yih, Tim Rock-
täschel, et al. 2020. Retrieval-augmented generation
for knowledge-intensive NLP tasks. arXiv preprint
arXiv:2005.11401.
Chin-Yew Lin. 2004. ROUGE: A package for automatic
evaluation of summaries. In Text summarization
branches out, pages 74–81.
Haochen Liu, Wentao Wang, Yiqi Wang, Hui Liu, Zi-
tao Liu, and Jiliang Tang. 2020. Mitigating gender
bias for neural dialogue generation with adversarial
learning. arXiv preprint arXiv:2009.13028.
Peter J Liu, Mohammad Saleh, Etienne Pot, Ben
Goodrich, Ryan Sepassi, Lukasz Kaiser, and
Noam Shazeer. 2018. Generating Wikipedia by
summarizing long sequences. arXiv preprint
arXiv:1801.10198.
Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao,
Zhifang Sui, Weizhu Chen, and Bill Dolan. 2021.
A token-level reference-free hallucination detection
benchmark for free-form text generation. arXiv
preprint arXiv:2104.08704.
Xiaojiang Liu, Zaiqing Nie, Nenghai Yu, and Ji-Rong
Wen. 2010. BioSnowball: automated population of
wikis. In Proceedings of the 16th ACM SIGKDD in-
ternational conference on Knowledge discovery and
data mining, pages 969–978.
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Man-
dar Joshi, Danqi Chen, Omer Levy, Mike Lewis,
Luke Zettlemoyer, and Veselin Stoyanov. 2019.
RoBERTa: A robustly optimized BERT pretraining
approach. arXiv preprint arXiv:1907.11692.
Wei Luo, Julia Adams, and Hannah Brueckner. 2018.
The ladies vanish?: American sociology and the
genealogy of its missing women on Wikipedia.
Comparative Sociology, 17(5):519–556.
Joshua Maynez, Shashi Narayan, Bernd Bohnet, and
Ryan McDonald. 2020. On faithfulness and factu-
ality in abstractive summarization. arXiv preprint
arXiv:2005.00661.
Nikita Moghe, Siddhartha Arora, Suman Banerjee, and
Mitesh M Khapra. 2018. Towards exploiting back-
ground knowledge for building conversation sys-
tems. arXiv preprint arXiv:1809.08205.
Sharan Narang, Colin Raffel, Katherine Lee, Adam
Roberts, Noah Fiedel, and Karishma Malkan. 2020.
WT5?! Training text-to-text models to explain their
predictions. arXiv preprint arXiv:2004.14546.
Feng Nie, Jin-Ge Yao, Jinpeng Wang, Rong Pan, and
Chin-Yew Lin. 2019. A simple recipe towards re-
ducing hallucination in neural surface realisation. In
Proceedings of the 57th Annual Meeting of the Asso-
ciation for Computational Linguistics, pages 2673–
2679.
Ankur P Parikh, Xuezhi Wang, Sebastian Gehrmann,
Manaal Faruqui, Bhuwan Dhingra, Diyi Yang,
and Dipanjan Das. 2020. ToTTo: A controlled
table-to-text generation dataset. arXiv preprint
arXiv:2004.14373.
Ji Ho Park, Jamin Shin, and Pascale Fung. 2018. Re-
ducing gender bias in abusive language detection.
arXiv preprint arXiv:1808.07231.
Stan Peshterliev, Barlas Oguz, Debojeet Chatterjee,
Hakan Inan, and Vikas Bhardwaj. 2021. Conversa-
tional answer generation and factuality for reading
comprehension question-answering. arXiv preprint
arXiv:2103.06500.
Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin,
Dmytro Okhonko, Samuel Broscheit, Gautier Izacard,
Patrick Lewis, Barlas Oğuz, Edouard Grave,
Wen-tau Yih, et al. 2021. The web is your oyster–
knowledge-intensive NLP against a very large web
corpus. arXiv preprint arXiv:2112.09924.
Ratish Puduppully, Li Dong, and Mirella Lapata. 2019.
Data-to-text generation with content selection and
planning. In Proceedings of the AAAI conference on
artificial intelligence, volume 33, pages 6908–6915.
Alec Radford, Jeffrey Wu, Rewon Child, David Luan,
Dario Amodei, Ilya Sutskever, et al. 2019. Lan-
guage models are unsupervised multitask learners.
OpenAI blog, 1(8):9.
Christina Joan Sauper and Regina Barzilay. 2009.
Automatically generating Wikipedia articles: A
structure-aware approach. Association for Compu-
tational Linguistics.
Katja Geertruida Schmahl, Tom Julian Viering, Stavros
Makrodimitris, Arman Naseri Jahfari, David Tax,
and Marco Loog. 2020. Is Wikipedia succeeding in
reducing gender bias? assessing changes in gender