SMRT chatbots: Improving non-task-oriented dialog with Simulated Multiple Reference Training

H Khayrallah, J Sedoc - arXiv preprint arXiv:2011.00547, 2020 - arxiv.org
Non-task-oriented dialog models suffer from poor quality and non-diverse responses. To overcome limited conversational data, we apply Simulated Multiple Reference Training (SMRT; Khayrallah et al., 2020), and use a paraphraser to simulate multiple responses per training prompt. We find SMRT improves over a strong Transformer baseline as measured by human and automatic quality scores and lexical diversity. We also find SMRT is comparable to pretraining in human evaluation quality, and outperforms pretraining on automatic quality and lexical diversity, without requiring related-domain dialog data.
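The core idea of simulating multiple references can be sketched as follows: each single (prompt, response) training pair is expanded into several pairs by paraphrasing the gold response. This is a minimal illustrative sketch, not the authors' implementation; the toy synonym-substitution `paraphrase` function is a hypothetical stand-in for the trained paraphrase model SMRT actually uses.

```python
import random

# Hypothetical stand-in for a learned paraphraser: SMRT uses a trained
# paraphrase model, but a synonym table keeps this sketch self-contained.
SYNONYMS = {"hello": ["hi", "hey"], "good": ["great", "fine"]}

def paraphrase(response, rng):
    # Replace each word with a random synonym when one is available.
    return " ".join(rng.choice(SYNONYMS.get(w, [w])) for w in response.split())

def smrt_targets(pairs, n_refs=3, seed=0):
    """Expand each (prompt, response) pair into multiple simulated
    references by paraphrasing the single gold response."""
    rng = random.Random(seed)
    expanded = []
    for prompt, response in pairs:
        expanded.append((prompt, response))  # keep the original reference
        for _ in range(n_refs - 1):
            expanded.append((prompt, paraphrase(response, rng)))
    return expanded

if __name__ == "__main__":
    pairs = [("how are you?", "good thanks")]
    for prompt, target in smrt_targets(pairs):
        print(prompt, "->", target)
```

In practice the paraphrases would be sampled on the fly during training rather than materialized up front, so the model sees a fresh simulated reference for each prompt at every epoch.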