Relation Extraction

Relation extraction is the problem of detecting and classifying relationships between named entities. The set of relationship types is determined by a pre-defined ontology. A few ontologies are commonly used for evaluation, the most popular being the one from the Automatic Content Extraction (ACE) program. ACE focuses on binary relations and defines a set of major relation types, each with subtypes. ACE also distinguishes between relation extraction and relation mention extraction: the former refers to the semantic relation between a pair of entities, while the latter refers to identifying the individual mentions of that relation in text. This contrasts with NER systems, which focus primarily on entity mentions and leave the clustering of mentions into entities to coreference and entity resolution.
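To make the difference between a relation and a relation mention concrete, here is a minimal sketch. The class names, the `EMPLOYED_BY` label, and the entity ids are hypothetical illustrations, not the ACE annotation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityMention:
    entity_id: str    # id of the underlying entity (assumed to come from coreference)
    text: str         # surface form as it appears in the sentence
    entity_type: str  # e.g. "PER" or "ORG"


@dataclass(frozen=True)
class RelationMention:
    arg1: EntityMention
    arg2: EntityMention
    relation_type: str  # hypothetical label, not an ACE type


def to_entity_relations(relation_mentions):
    """Collapse relation mentions into entity-level relations."""
    return {
        (m.arg1.entity_id, m.relation_type, m.arg2.entity_id)
        for m in relation_mentions
    }


# Two mentions of the same relation, found in two different sentences.
relation_mentions = [
    RelationMention(
        EntityMention("E1", "Alice", "PER"),
        EntityMention("E2", "Acme Corp", "ORG"),
        "EMPLOYED_BY",
    ),
    RelationMention(
        EntityMention("E1", "She", "PER"),
        EntityMention("E2", "the company", "ORG"),
        "EMPLOYED_BY",
    ),
]

relations = to_entity_relations(relation_mentions)
```

Relation mention extraction would report both items in `relation_mentions`, while relation extraction in the entity-level sense reports the single triple in `relations`.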

Relation extraction is a difficult problem: there are usually many different sets of relations to choose from, on top of the ambiguity inherent in information extraction. As a result, relation extraction systems tend not to perform at the same level as NER systems. Successful relation extraction requires detecting the argument mentions, with their entity types linking these mentions to the ontology. There are several challenges here:
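One way to picture how entity types tie argument mentions to the ontology is a candidate generator that only proposes relations whose argument types match. This is a rough sketch; the tiny `ONTOLOGY` table and its relation labels are assumptions made up for the example.

```python
from itertools import permutations

# Hypothetical ontology: relation type -> (type of argument 1, type of argument 2)
ONTOLOGY = {
    "EMPLOYED_BY": ("PER", "ORG"),
    "LOCATED_IN": ("ORG", "LOC"),
}


def candidate_relations(mentions):
    """Propose typed relation candidates for one sentence.

    mentions: list of (text, entity_type) pairs, e.g. the output of an NER step.
    """
    candidates = []
    # Consider every ordered pair of distinct mentions.
    for (text1, type1), (text2, type2) in permutations(mentions, 2):
        for relation, signature in ONTOLOGY.items():
            if (type1, type2) == signature:
                candidates.append((text1, relation, text2))
    return candidates


sentence_mentions = [("Alice", "PER"), ("Acme Corp", "ORG"), ("Berlin", "LOC")]
candidates = candidate_relations(sentence_mentions)
```

The candidates are only type-compatible pairs. A classifier still has to decide whether the sentence actually expresses each relation, and any error in the NER types propagates directly into this step.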

  1. Relation extraction depends heavily on the domain, the language, and NER

  2. Ambiguity is extremely high

  3. Complexity increases exponentially when moving from binary relations to higher-arity relations

  4. Relation extraction requires large datasets
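As a rough illustration of the third challenge, the sketch below counts how many ordered argument tuples a system would have to consider for a sentence with a fixed number of entity mentions. Counting candidates is only one reading of "complexity", and the function name is hypothetical.

```python
import math


def candidate_tuples(num_mentions, arity):
    """Number of ordered tuples of distinct mentions for a relation of this arity."""
    return math.perm(num_mentions, arity)


# Candidate counts for a sentence with 10 entity mentions.
counts = {arity: candidate_tuples(10, arity) for arity in (2, 3, 4)}
```

With 10 mentions, binary relations give 90 candidates, ternary relations 720, and 4-ary relations 5040, so the candidate space grows roughly as the number of mentions raised to the arity.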

As in NER, feature engineering is important in relation extraction, and word embeddings have played a big part in the task's development. A recent exciting development has been the joint modelling and extraction of relations and entities.
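For a sense of what hand-engineered features can look like, here is a minimal sketch for a single candidate pair. The feature names and the choice of features are assumptions for illustration, not a reference feature set.

```python
def extract_features(tokens, arg1, arg2):
    """Build simple features for one candidate pair in a tokenised sentence.

    Each argument is (start, end, entity_type) with an exclusive end index,
    and arg1 is assumed to appear before arg2.
    """
    start1, end1, type1 = arg1
    start2, end2, type2 = arg2
    between = tokens[end1:start2]
    return {
        "type_pair": f"{type1}-{type2}",
        "last_token1": tokens[end1 - 1].lower(),
        "last_token2": tokens[end2 - 1].lower(),
        "words_between": " ".join(word.lower() for word in between),
        "token_distance": len(between),
    }


tokens = "Alice works for Acme Corp in Berlin".split()
features = extract_features(tokens, (0, 1, "PER"), (3, 5, "ORG"))
```

Features like these would typically be fed to a classifier. Word embeddings can stand in for the sparse word-based features by representing each token as a dense vector.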


