Charles Explorer logo
🇬🇧

A Context-aware Natural Language Generation Dataset for Dialogue Systems

Publication at Faculty of Mathematics and Physics |
2016

Abstract

We present a novel dataset for natural language generation (NLG) in spoken dialogue systems which includes preceding context (user utterance) along with each system response to be generated, i.e., each pair of source meaning representation and target natural language paraphrase. We expect this to allow an NLG system to adapt (entrain) to the user’s way of speaking, thus creating more natural and potentially more successful responses.

The dataset has been collected using crowdsourcing, with several stages to obtain natural user utterances and corresponding relevant, natural, and contextually bound system responses. The dataset is available for download under the Creative Commons 4.0 BY-SA license.