This is an example of manual feature engineering, where we hand-craft the patterns. Certain literal strings can act as a “trigger” for a certain relation type. For example, if the string “a native of” appears in “Starbuck, a native of Nantucket”, where “Starbuck” is an entity of type Person and “Nantucket” is an entity of type Location, we can derive the relation entity-origin between “Starbuck” and “Nantucket”.
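To make the idea concrete, here is a minimal sketch of one hand-crafted pattern. It assumes the entities have already been tagged by an NER step; the names `Entity`, `TRIGGER` and `entity_origin` are made up for illustration and do not come from any library.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Entity(NamedTuple):
    text: str
    type: str  # entity type assigned by an upstream NER step (assumed)


# Literal trigger string, allowing an optional leading comma and spaces.
TRIGGER = re.compile(r"^,?\s*a native of\s*$")


def entity_origin(left: Entity, between: str, right: Entity) -> Optional[Tuple[str, str, str]]:
    """Return an entity-origin triple if the hand-crafted pattern matches."""
    if (
        left.type == "PERSON"
        and right.type == "LOCATION"
        and TRIGGER.match(between)
    ):
        return ("entity-origin", left.text, right.text)
    return None


relation = entity_origin(
    Entity("Starbuck", "PERSON"),
    ", a native of ",
    Entity("Nantucket", "LOCATION"),
)
print(relation)
```

The pattern fires only when both the trigger string and the two entity types match, which is why the same phrase between two locations would produce no relation.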

We can use lemmatisation to generalise these literal strings, so that different surface forms of the same word trigger the same pattern. We could also take it a step further and use WordNet synsets, which group words that are synonyms of one another, so that synonymous triggers are treated alike.
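The sketch below shows how this normalisation might work. To stay self-contained it uses two tiny hand-written tables, `LEMMAS` and `SYNONYM_GROUP`, as stand-ins for a real lemmatiser and for WordNet synsets; both tables and all function names are assumptions made for this example.

```python
from typing import Optional, Tuple

# Toy stand-in for a lemmatiser: inflected form -> lemma (assumed table).
LEMMAS = {
    "hails": "hail", "hailed": "hail",
    "comes": "come", "came": "come",
    "originates": "originate", "originated": "originate",
}

# Toy stand-in for a synset lookup: lemma -> shared group label (assumed table).
SYNONYM_GROUP = {"hail": "ORIGINATE", "come": "ORIGINATE", "originate": "ORIGINATE"}

# Patterns are written once, over normalised tokens.
TRIGGERS = {
    ("ORIGINATE", "from"): "entity-origin",
    ("a", "native", "of"): "entity-origin",
}


def normalise(phrase: str) -> Tuple[str, ...]:
    """Lowercase, lemmatise, then map each lemma to its synonym group."""
    lemmas = [LEMMAS.get(tok, tok) for tok in phrase.lower().split()]
    return tuple(SYNONYM_GROUP.get(lemma, lemma) for lemma in lemmas)


def trigger_relation(phrase: str) -> Optional[str]:
    """Return the relation type triggered by the phrase, if any."""
    return TRIGGERS.get(normalise(phrase))


for phrase in ["hails from", "came from", "originated from", "works for"]:
    print(phrase, "->", trigger_relation(phrase))
```

A single pattern over the group label now covers “hails from”, “came from” and “originated from”, instead of one literal pattern per surface form.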

Relation extraction patterns can be represented as finite-state automata. If the NER system is also a finite-state machine, then the two systems can be combined by finite-state transduction. This lets us handle ambiguity by propagating the uncertainty through the finite-state cascade and disambiguating it using higher-level context. For example, the entity recogniser might not be able to decide whether “Washington” is a Person or a Location. In the composed transducer, the relation extractor is free to select the Person annotation whenever “Washington” appears in a context that matches the relevant pattern.
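The following sketch illustrates only the effect of deferring the decision; it is not a real finite-state transducer. It assumes the recogniser returns a set of candidate types for each mention, and it enumerates the candidate labellings to see which ones a relation pattern accepts. The names `PATTERNS` and `resolve`, and the example mention “Virginia”, are hypothetical.

```python
from itertools import product
from typing import List, Set, Tuple

Mention = Tuple[str, Set[str]]  # (text, candidate entity types kept by the NER)

# Relation patterns keyed by (left type, trigger, right type).
PATTERNS = {
    ("PERSON", "a native of", "LOCATION"): "entity-origin",
}


def resolve(left: Mention, trigger: str, right: Mention) -> List[tuple]:
    """Keep only the type assignments that some relation pattern accepts."""
    results = []
    # Try every combination of candidate types instead of committing early.
    for left_type, right_type in product(sorted(left[1]), sorted(right[1])):
        relation = PATTERNS.get((left_type, trigger, right_type))
        if relation is not None:
            results.append((relation, (left[0], left_type), (right[0], right_type)))
    return results


# The NER is unsure about "Washington"; the pattern context settles it.
print(resolve(
    ("Washington", {"PERSON", "LOCATION"}),
    "a native of",
    ("Virginia", {"LOCATION"}),
))
```

Because the recogniser does not commit to a single label up front, the Location reading of “Washington” is simply discarded when no pattern accepts it. A composed transducer achieves the same outcome without enumerating the combinations explicitly.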


