The starting point of information extraction is to identify mentions of entities in text. For each text, the goal is to:

  1. Identify the spans (mentions) of entities along with their named entity types (organisation, location, date, and so on). This is named entity recognition (NER).

  2. Link these spans (mentions) to their respective entities in a knowledge base. This is known as entity linking and it’s a key step!
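To make the two steps concrete, here is a minimal sketch of what their combined output could look like for one sentence. The `Mention` class, the `make_mention` helper, and the `KB:...` identifiers are all made up for illustration; they are not taken from any particular library or knowledge base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mention:
    start: int                       # character offset where the span begins
    end: int                         # character offset just past the span
    text: str                        # surface string of the mention
    entity_type: str                 # step 1 (NER): organisation, location, date, ...
    entity_id: Optional[str] = None  # step 2 (entity linking): hypothetical KB identifier


def make_mention(doc, surface, entity_type, entity_id=None):
    """Build a Mention by locating the first occurrence of `surface` in `doc`."""
    start = doc.find(surface)
    if start < 0:
        raise ValueError(f"{surface!r} not found in document")
    return Mention(start, start + len(surface), surface, entity_type, entity_id)


doc = "Apple opened an office in London in 2019."
mentions = [
    make_mention(doc, "Apple", "organisation", "KB:Apple_Inc"),
    make_mention(doc, "London", "location", "KB:London"),
    make_mention(doc, "2019", "date"),  # left unlinked in this sketch
]

for m in mentions:
    print(m.start, m.end, m.text, m.entity_type, m.entity_id)
```

In a real system the spans and types would come from an NER model rather than from a string search, and the identifiers from an actual knowledge base.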

Entity linking task

There are several variants of the entity linking task. For example, named entity linking only links the named entities that have been identified, whereas wikification links all the strings in the text. Examples of knowledge bases include YAGO (2007), DBpedia (2007), and Freebase (2008). Entity linking can also be performed in a smaller (closed) setting, where a small list of targets is provided in advance. Lastly, the system should also be able to determine when a mention does not refer to any entity in the knowledge base (a NIL entity).
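As a rough illustration of the closed setting and the NIL case, the sketch below links a mention against a small, fixed list of targets and returns `None` when nothing matches. The `TARGETS` table, the `link_closed` function, and the `KB:...` identifiers are assumptions for this example; a real system would do much more than an exact lookup on the lowercased string.

```python
from typing import Optional

# Hypothetical closed list of targets: surface form -> knowledge base identifier.
TARGETS = {
    "apple": "KB:Apple_Inc",
    "apple inc.": "KB:Apple_Inc",
    "london": "KB:London",
}


def link_closed(mention: str) -> Optional[str]:
    """Return the target for a mention, or None to signal a NIL entity."""
    return TARGETS.get(mention.strip().lower())


print(link_closed("Apple"))     # found in the target list
print(link_closed("Atlantis"))  # not in the target list, so treated as NIL
```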

There are multiple ways to tackle the entity linking task:

  1. Entity linking by learning to rank
  2. Collective entity linking

The problem we face here is that entity mentions can be ambiguous! The question, therefore, is how we can resolve this ambiguity and link entities accurately.
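To see why ambiguity matters, consider the mention "Apple", which could be a company or a fruit. The toy sketch below ranks candidate entities by how many of their description words appear in the surrounding context. This is a hand-written stand-in for the scoring function that a learning-to-rank approach would learn from data; the `CANDIDATES` table, the description words, and the `rank_candidates` function are all made up for illustration.

```python
# Hypothetical candidates for each mention, each with a few description words.
CANDIDATES = {
    "Apple": {
        "KB:Apple_Inc": {"company", "iphone", "technology", "office"},
        "KB:Apple_fruit": {"fruit", "tree", "eat", "pie"},
    }
}


def rank_candidates(mention, context):
    """Rank candidate entities by word overlap between description and context."""
    context_words = set(context.lower().split())
    scored = [
        (len(description & context_words), entity_id)
        for entity_id, description in CANDIDATES.get(mention, {}).items()
    ]
    # Highest overlap first; ties are broken alphabetically by identifier.
    return [entity_id for _, entity_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


print(rank_candidates("Apple", "she baked an apple pie with fruit from the tree"))
print(rank_candidates("Apple", "Apple opened a new office for its iphone team"))
```

The same mention gets a different top candidate depending on its context, which is exactly the signal a ranking model tries to exploit.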


