26.4 The Dialogue-State Architecture
26.4.4 Dialogue Policy
What is a dialogue policy?
The dialogue policy decides which action the system should take next. In other words, at each turn of the conversation we want to predict the next system action. You can use a simple policy that conditions only on the current dialogue state, or you can train the policy with more advanced methods such as reinforcement learning.
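As a rough illustration of a simple policy that conditions only on the current dialogue state, here is a minimal sketch. The slot names, the action strings, and the flight-booking setting are all hypothetical and chosen only for the example; a real system would define its own.

```python
from typing import Dict, Optional

# Hypothetical slots for a flight-booking frame (an assumption, not from the text).
REQUIRED_SLOTS = ["origin", "destination", "date"]


def simple_policy(state: Dict[str, Optional[str]]) -> str:
    """Pick the next system action from the current dialogue state alone."""
    # Ask for the first required slot that has not been filled yet.
    for slot in REQUIRED_SLOTS:
        if not state.get(slot):
            return f"REQUEST({slot})"
    # Every slot is filled, so the system can act on the frame.
    return "BOOK_FLIGHT"


if __name__ == "__main__":
    print(simple_policy({"origin": "Boston"}))
    print(simple_policy({"origin": "Boston", "destination": "Denver", "date": "Friday"}))
```

A learned policy would replace the hand-written rules with a model that scores candidate actions, but the input (the dialogue state) and the output (one action) stay the same.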
What are the two methods to correctly interpret the user’s inputs?
Confirmation – make sure the system has understood the user correctly
Rejection – reject any utterance that the system is likely to have misunderstood
An explicit confirmation involves the system asking a direct question to confirm its understanding, for example, “Do you mean X?”. An implicit confirmation is where the system demonstrates its understanding as a grounding strategy, typically by repeating what it understood as part of asking the next question.
Rejection occurs when the user says something that the system is unable to follow or capture. When an utterance is rejected, the system typically falls back on predefined strategies such as progressive prompting (escalating detail), where each reprompt gives the user more guidance. An alternative is rapid reprompting, where the system rejects with a short prompt such as “I’m sorry?” or “What was that?”.
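One common way to choose between explicit confirmation, implicit confirmation, and rejection is to threshold a confidence score for the system's interpretation of the utterance. The sketch below assumes such a score in [0, 1] is available; the function name and the three threshold values are made up for illustration and would need tuning in practice.

```python
def confirmation_action(
    confidence: float,
    reject_below: float = 0.3,     # assumed threshold, not from the text
    explicit_below: float = 0.6,   # assumed threshold, not from the text
    implicit_below: float = 0.9,   # assumed threshold, not from the text
) -> str:
    """Map a confidence score in [0, 1] to a confirmation strategy."""
    if confidence < reject_below:
        return "reject"              # e.g. reprompt with "I'm sorry?"
    if confidence < explicit_below:
        return "confirm_explicitly"  # e.g. "Do you mean X?"
    if confidence < implicit_below:
        return "confirm_implicitly"  # repeat X back inside the next question
    return "no_confirmation"         # confident enough to carry on


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.8, 0.95):
        print(score, confirmation_action(score))
```

The ordering reflects the cost of each strategy: the less sure the system is, the more of the user's time it spends checking.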
26.4.5 Natural language generation in the dialogue-state model
What are the two stages of NLG in the dialogue-state model?
Content planning (what to say)
Sentence realisation (how to say it)
The dialogue policy is responsible for content planning. The figure below shows examples of the sentence realisation phase. Once the content planner has chosen a dialogue act with its slots and fillers, the goal of the sentence realisation phase is to generate a sentence that expresses those slots and fillers. This is done by training a model on pairs of a representation (the dialogue act and its slots) and a corresponding sentence.
What is delexicalisation?
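Delexicalisation is the process of replacing specific slot values in the training set with generic placeholder tokens that represent the slots. This is shown in the figure below. We do this to make each training example more general, because it is very difficult to build a sizeable training set for sentence realisation and we are unlikely to see every possible slot value in every possible sentence.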
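To make the idea concrete, here is a minimal sketch of delexicalisation by plain string replacement, together with the reverse step that puts the real values back. The `<slot>` placeholder format, the function names, and the restaurant example are assumptions made for illustration; real systems need more careful matching than `str.replace`.

```python
from typing import Dict


def delexicalise(sentence: str, slots: Dict[str, str]) -> str:
    """Replace each slot value in the sentence with a <slot_name> placeholder."""
    # Replace longer values first so a short value cannot clobber part of a longer one.
    for name, value in sorted(slots.items(), key=lambda kv: -len(kv[1])):
        sentence = sentence.replace(value, f"<{name}>")
    return sentence


def relexicalise(template: str, slots: Dict[str, str]) -> str:
    """Fill the placeholders of a delexicalised sentence with actual slot values."""
    for name, value in slots.items():
        template = template.replace(f"<{name}>", value)
    return template


if __name__ == "__main__":
    # Hypothetical training sentence and frame.
    slots = {"name": "Au Midi", "neighborhood": "Midtown", "cuisine": "French"}
    template = delexicalise("Au Midi is in Midtown and serves French food.", slots)
    print(template)
    print(relexicalise(template, {"name": "Loch Fyne", "neighborhood": "the city centre", "cuisine": "seafood"}))
```

The second call shows why this helps with small training sets: one delexicalised sentence can be reused for any restaurant with the same slots.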
The mapping from frames to delexicalised sentences is generally done by encoder-decoder models. The input to the encoder is a sequence of tokens that represents the dialogue act and its arguments. The encoder then produces a context vector that is fed into the decoder, which is responsible for generating the delexicalised English sentence. This is shown in the figure below. The placeholders in the generated sentence are then filled back in with the actual slot values (relexicalisation).
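The encoder input can be pictured as a flat token sequence built from the dialogue act and its arguments. The sketch below shows one possible linearisation; the exact token layout (upper-cased act name, `slot:` markers, alphabetical slot order) is an assumption for illustration, and no neural model is included.

```python
from typing import Dict, List


def linearise(act: str, slots: Dict[str, str]) -> List[str]:
    """Flatten a dialogue act and its slot-filler pairs into encoder input tokens."""
    tokens = [act.upper()]
    # Sort slot names so the same frame always yields the same sequence.
    for name in sorted(slots):
        tokens.append(f"{name}:")
        tokens.extend(slots[name].split())
    return tokens


if __name__ == "__main__":
    # Hypothetical dialogue act chosen by the content planner.
    print(linearise("recommend", {"service": "decent", "cuisine": "French"}))
```

A sequence like this would be mapped to token ids and passed to the encoder, while the decoder is trained to emit the matching delexicalised sentence.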