Other Methods

Reinforcement Learning

  • There’s a disconnect between optimisation objectives and evaluation metrics
    • Answers that exactly match the ground truth but are not located at the labelled positions are ignored by the model during training

    • Evaluation metrics such as exact match (EM) and F1 are not differentiable

  • Introducing reinforcement learning to the training process
    • Xiong et al. use the F1 score as the reward function and treat maximum likelihood estimation and reinforcement learning as a multi-task learning problem

  • Reinforcement learning can also be used to determine when to stop the interaction process
    • Usually, when people are answering a question, they will stop reading the context if they believe an adequate answer has been formed. The termination state is highly related to the complexity of the context and question

    • The termination state is discrete, so we can’t use backpropagation through it during training, which is why reinforcement learning is applied to train models with a termination state
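
To make the reward idea concrete, here is a minimal sketch of token-level EM and F1, plus one way a maximum likelihood loss and a REINFORCE-style loss could be mixed. The function names, the whitespace tokenisation, the baseline term and the fixed `weight` are all assumptions made for illustration; this is not the weighting scheme or implementation used by Xiong et al.

```python
from collections import Counter


def exact_match(prediction: str, truth: str) -> float:
    """Return 1.0 if the lower-cased token sequences are identical, else 0.0."""
    return float(prediction.lower().split() == truth.lower().split())


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between a predicted answer and the ground truth."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


def mixed_loss(ml_loss, sampled_log_prob, sampled_f1, baseline_f1, weight=0.5):
    """Combine a likelihood loss with a REINFORCE-style loss (hypothetical fixed weight)."""
    # The reward is the F1 of a sampled answer minus a baseline F1, so sampled
    # answers that beat the baseline have their log-probability pushed up.
    rl_loss = -(sampled_f1 - baseline_f1) * sampled_log_prob
    return weight * ml_loss + (1 - weight) * rl_loss
```

Both metrics are built from counting and comparison, which is why they cannot be differentiated directly and instead enter training as a scalar reward.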

Answer Ranker

  • The objective is to verify the correctness of the predicted answer

  • An answer ranker typically extracts some candidate answers, scores each one, and takes the candidate with the highest score as the correct answer
    • EpiReader extracts answer candidates in a similar way to the AS Reader, selecting the answer spans with the highest attention sum scores.

    • EpiReader then feeds those answer candidates to the reasoner component, which inserts each candidate into the question sequence at the blank location and computes the probability of that candidate being correct. The candidate with the highest probability is selected

  • This answer ranker has inspired some researchers to detect unanswerable questions
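
The extract-then-rerank flow can be sketched as follows. This is an illustrative outline only: `top_k_candidates`, `rerank`, the `"XXXXX"` blank marker and the `score_hypothesis` callable are hypothetical names, and the scorer stands in for EpiReader's learned reasoner rather than reproducing it.

```python
from collections import defaultdict


def top_k_candidates(tokens, attention, k=2):
    """Sum attention over every occurrence of a token and keep the k best tokens."""
    totals = defaultdict(float)
    for token, weight in zip(tokens, attention):
        totals[token] += weight
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [token for token, _ in ranked[:k]]


def rerank(question_tokens, candidates, score_hypothesis, blank="XXXXX"):
    """Fill the blank with each candidate and return the best-scoring candidate."""
    hypotheses = {
        cand: [cand if tok == blank else tok for tok in question_tokens]
        for cand in candidates
    }
    # score_hypothesis is an assumed callable returning a correctness score
    # for a filled-in question; in a real system it would be a trained model.
    return max(candidates, key=lambda cand: score_hypothesis(hypotheses[cand]))
```

Note that the reranker can overturn the extractor's first choice: a candidate with a lower attention sum still wins if its filled-in question scores higher.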

Sentence Selector

  • If the MRC model is given a long context document, it will take a lot of time to go over the whole context to find the answers. If relevant sentences (to the questions) can be found in advance, this should speed up the training process

  • Min et al. propose a sentence selector to find the minimum set of sentences needed to answer a question
    • The sentence selector is a seq2seq model, with the encoder computing the sentence and question representations and the decoder calculating a similarity score between each sentence and the question

    • If a sentence's score is higher than a predefined threshold, the sentence is fed into the MRC system. This means that the number of selected sentences changes based on the question

  • Overall, an MRC system with a sentence selector can reduce training and inference time with similar or better performance
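
The thresholding step can be sketched in a few lines. The scores are assumed to come from a trained selector, and `select_sentences`, the default threshold of 0.5 and the fallback to the single best sentence are assumptions added for illustration, not details taken from Min et al.

```python
def select_sentences(sentences, scores, threshold=0.5):
    """Keep the sentences whose selector score exceeds the threshold."""
    selected = [s for s, score in zip(sentences, scores) if score > threshold]
    if not selected:
        # Assumed fallback: never hand the reader an empty context, so keep
        # the single highest-scoring sentence when nothing passes.
        best = max(range(len(sentences)), key=lambda i: scores[i])
        selected = [sentences[best]]
    return selected
```

Because the cut-off is a score rather than a fixed count, an easy question may keep one sentence while a harder one keeps several.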



