22.8 Entity Linking

What is entity linking?

Entity linking is the task of associating a mention with a real-world entity in an ontology (a list of entities in the world). The most common ontology for this task is Wikipedia, where each page represents a particular entity.

How to perform entity linking?

In early systems, entity linking is performed in two stages:

  1. Mention detection

  2. Mention disambiguation

A useful feature for mention detection is the key phrase. In Wikipedia, this is the mapping between anchor texts and the titles of the pages they link to. Mention detection often includes query expansion. Mention disambiguation is often performed using supervised learning.
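The two stages can be sketched in a few lines of Python. This is only a toy illustration, not how any particular system works: the ANCHORS dictionary and its counts are made up, the function names are hypothetical, and the disambiguation step uses a simple most-frequent-target baseline in place of the supervised model mentioned above.

```python
from collections import Counter

# Toy anchor-text dictionary (hypothetical counts, for illustration only):
# lowercased anchor text -> how often it links to each page title.
ANCHORS = {
    "jaguar": Counter({"Jaguar": 60, "Jaguar Cars": 40}),
    "amazon": Counter({"Amazon (company)": 70, "Amazon River": 30}),
}

def detect_mentions(tokens, anchors=ANCHORS, max_len=3):
    """Stage 1: return (start, end, text) spans whose text is a known anchor."""
    mentions = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest span first so multi-word anchors win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            text = " ".join(tokens[i:i + n]).lower()
            if text in anchors:
                match = (i, i + n, text)
                break
        if match:
            mentions.append(match)
            i = match[1]
        else:
            i += 1
    return mentions

def disambiguate(mention_text, anchors=ANCHORS):
    """Stage 2 (baseline): pick the page most often linked from this anchor."""
    return anchors[mention_text].most_common(1)[0][0]

def link(tokens):
    """Run both stages and return (mention text, page title) pairs."""
    return [(text, disambiguate(text)) for _, _, text in detect_mentions(tokens)]

links = link("The jaguar swam across the Amazon".split())
```

With these toy counts the baseline links "amazon" to the company page even though the sentence is about the river. It never looks at the context, which is one reason disambiguation is usually treated as a supervised learning problem.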

How can coreference help entity linking?

Coreference helps by supplying more mentions of the same entity, which gives more evidence for linking to the right Wikipedia page. Conversely, entity linking can improve coreference resolution, since it helps disambiguate mentions in text. Recent research has focused on training models to jointly learn entity linking and coreference resolution.
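One way to picture how extra mentions help is to pool candidate scores over a coreference cluster. The sketch below assumes a hypothetical CANDIDATES table of per-mention scores (the numbers are invented) and a hypothetical link_cluster helper. It is an illustration of the idea, not a description of a published system.

```python
from collections import Counter

# Hypothetical candidate scores per mention string (invented numbers).
CANDIDATES = {
    "Washington": {
        "Washington, D.C.": 0.5,
        "George Washington": 0.3,
        "Washington (state)": 0.2,
    },
    "George Washington": {"George Washington": 0.9},
}

def link_cluster(cluster, candidates=CANDIDATES):
    """Sum candidate scores over all coreferent mentions and pick the best page."""
    totals = Counter()
    for mention in cluster:
        # Mentions with no candidates (e.g. pronouns) contribute nothing.
        totals.update(candidates.get(mention, {}))
    return totals.most_common(1)[0][0] if totals else None

alone = link_cluster(["Washington"])
with_coref = link_cluster(["Washington", "George Washington", "he"])
```

On its own, "Washington" is linked to the city. Once coreference groups it with "George Washington", the pooled scores favor the person.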

22.9 Winograd Schema problems

What is the Winograd Schema?

A Winograd Schema is a coreference problem designed to be easily disambiguated by a human reader but not solvable by simple computational techniques. The dataset includes pairs of statements that differ in a single word or phrase, together with a coreference question. The structure is as follows:

  1. The problems have two entities

  2. A pronoun is included to refer to one of the entities but it could also grammatically refer to the other

  3. A coreference question asks which entity the pronoun refers to

  4. If one word in the question is changed, the human-preferred answer changes too
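The four properties above can be made concrete with a small data structure. The sketch below uses the classic city council example from Winograd's work. The WinogradSchema class and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    template: str      # sentence with a {word} slot for the special word
    pronoun: str       # the ambiguous pronoun
    candidates: tuple  # the two entities the pronoun could refer to
    answers: dict      # special word -> human-preferred referent

    def instances(self):
        """Expand the schema into (sentence, preferred referent) pairs."""
        return [(self.template.format(word=w), ref)
                for w, ref in self.answers.items()]

    def is_well_formed(self):
        """Check: two entities, the pronoun is present, and swapping the
        special word flips the preferred answer to the other entity."""
        return (len(self.candidates) == 2
                and len(self.answers) == 2
                and set(self.answers.values()) == set(self.candidates)
                and self.pronoun in self.template.split())

SCHEMA = WinogradSchema(
    template=("The city council denied the demonstrators a permit "
              "because they {word} violence."),
    pronoun="they",
    candidates=("the city council", "the demonstrators"),
    answers={"feared": "the city council",
             "advocated": "the demonstrators"},
)

for sentence, referent in SCHEMA.instances():
    print(sentence, "->", referent)
```

Both sentences are grammatically compatible with either referent. Only knowledge about councils, demonstrators, and violence tells the reader which one "they" picks out.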

Although Winograd Schema problems are challenging, they can be solved using pre-trained language models fine-tuned on Winograd Schema sentences. This is because large pre-trained language models encode a large amount of world and common-sense knowledge. The success in tackling the Winograd Schema has led to the development of harder Winograd-like datasets such as KNOWREF.



