Classical relation extraction defines a relation schema in advance, so the relation between any pair of entities can be predicted with multi-class classification. In open information extraction (OpenIE), a relation can be any triple of text, for example the relation tuple (mayor of, Maynard Jackson, Atlanta). OpenIE systems are trained with distant supervision or bootstrapping rather than labelled sentences, as the task is generally evaluated at the relation level rather than the sentence level.
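
To make the contrast concrete, here is a minimal sketch of the two output types. The label names in `SCHEMA`, the helper `classify_pair` and the scores passed to it are hypothetical and only serve the illustration; they do not come from any particular system.

```python
from typing import NamedTuple

# Classical setting: a closed label set fixed in advance (hypothetical labels).
SCHEMA = ("mayor_of", "born_in", "no_relation")

def classify_pair(scores):
    """Multi-class decision: return the schema label with the highest score."""
    return max(SCHEMA, key=lambda label: scores.get(label, float("-inf")))

# OpenIE setting: the relation is free text, so the output is just a triple.
class Triple(NamedTuple):
    relation: str
    arg1: str
    arg2: str

# Illustrative scores that a trained classifier might assign to one entity pair.
closed_label = classify_pair({"mayor_of": 0.9, "born_in": 0.05, "no_relation": 0.05})
open_triple = Triple("mayor of", "Maynard Jackson", "Atlanta")
```

In the closed setting the answer is always one of the predefined labels, while the open triple can carry any relation phrase found in the text.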

The TEXTRUNNER system (Banko et al. 2007) is an early example that incorporates linguistic features. It works as follows:

  1. Identify relations with a set of handcrafted syntactic rules

  2. Use the identified relations to train a classification model that takes part-of-speech (POS) patterns as features

  3. Aggregate the extracted relations, remove redundant ones and count how many times each relation is mentioned in the corpus
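
The three steps can be sketched on a toy, hand-tagged corpus. This is a simplified illustration under stated assumptions, not TEXTRUNNER's actual implementation: the rule in `rule_label` is a hypothetical stand-in for the handcrafted syntactic rules, and the "classifier" in `train` simply memorises the POS patterns that the rule approved instead of fitting a real model.

```python
from collections import Counter

# Toy corpus: each sentence is a list of (token, POS tag) pairs, tagged by hand.
CORPUS = [
    [("Maynard", "NNP"), ("Jackson", "NNP"), ("was", "VBD"), ("mayor", "NN"),
     ("of", "IN"), ("Atlanta", "NNP")],
    [("In", "IN"), ("1974", "CD"), ("Maynard", "NNP"), ("Jackson", "NNP"),
     ("was", "VBD"), ("mayor", "NN"), ("of", "IN"), ("Atlanta", "NNP")],
    [("Paris", "NNP"), ("is", "VBZ"), ("in", "IN"), ("France", "NNP")],
    [("Atlanta", "NNP"), ("and", "CC"), ("Paris", "NNP"), ("are", "VBP"),
     ("cities", "NNS")],
]

def entity_spans(sentence):
    """Return (start, end) index pairs for maximal runs of proper nouns."""
    spans, start = [], None
    for i, (_, tag) in enumerate(sentence):
        if tag == "NNP" and start is None:
            start = i
        elif tag != "NNP" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(sentence)))
    return spans

def candidates(sentence):
    """Yield (arg1, tokens_between, arg2) for each pair of adjacent entities."""
    spans = entity_spans(sentence)
    for (s1, e1), (s2, e2) in zip(spans, spans[1:]):
        between = sentence[e1:s2]
        if between:
            arg1 = " ".join(tok for tok, _ in sentence[s1:e1])
            arg2 = " ".join(tok for tok, _ in sentence[s2:e2])
            yield arg1, between, arg2

def pos_pattern(between):
    return tuple(tag for _, tag in between)

def rule_label(between):
    """Step 1 (hypothetical rule): phrase starts with a verb, ends with a preposition."""
    tags = pos_pattern(between)
    return tags[0].startswith("VB") and tags[-1] == "IN"

def train(corpus):
    """Step 2 (stand-in classifier): memorise POS patterns the rule approved."""
    return {pos_pattern(b) for s in corpus for _, b, _ in candidates(s) if rule_label(b)}

def extract(corpus, patterns):
    """Step 3: keep accepted triples, merge duplicates and count mentions."""
    counts = Counter()
    for sentence in corpus:
        for arg1, between, arg2 in candidates(sentence):
            if pos_pattern(between) in patterns:
                relation = " ".join(tok.lower() for tok, _ in between)
                counts[(relation, arg1, arg2)] += 1
    return counts

patterns = train(CORPUS)
counts = extract(CORPUS, patterns)
```

Because the toy model is trained and applied on the same four sentences, it accepts exactly what the rule accepts; in a realistic setting the model would be applied to a much larger corpus than the rule-labelled seed. Using a `Counter` keyed on the whole triple handles both deduplication and mention counting in one structure.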


