My journey continues, learning about link prediction through my readings.
There are two input formats for link prediction models: Subject Relation (SR) and Subject Relation Object (SRO). An SR link prediction model takes a subject entity and a relation as input and outputs a prediction score for every tail entity in the entity set. These predictions are compared against the expected scores (computed using N-to-All sampling) to compute the loss for training. Dynamic link prediction models are rare, and the pipelines to train them are highly model-specific. Current link prediction pipelines only support a date property in dynamic models, not other per-triple properties.
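The SR setup can be sketched in a few lines. This is a minimal, hypothetical example: the embedding tables are random, and I use a DistMult-style score for illustration; any other scoring function would slot in the same way.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, num_relations, dim = 5, 2, 4

# Hypothetical embedding tables (randomly initialised for illustration).
entity_emb = rng.normal(size=(num_entities, dim))
relation_emb = rng.normal(size=(num_relations, dim))

def score_all_tails(subject_id: int, relation_id: int) -> np.ndarray:
    """SR input: score every entity in the set as a candidate tail.

    Uses a DistMult-style score <s, r, o> = sum(s * r * o).
    """
    s = entity_emb[subject_id]
    r = relation_emb[relation_id]
    return entity_emb @ (s * r)  # one score per candidate tail entity

scores = score_all_tails(subject_id=0, relation_id=1)
```

The key point is the output shape: one score per entity, which is what gets compared against the expected scores during training.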
Both static and dynamic models share common pre-processing steps. For example, both require mapping entities and relations into vector spaces. This is currently done manually by the user, which means we don't have a consistent set of entities and relations across experiments. There is currently no standardisation in the link prediction pipeline. Facebook's PyTorch-BigGraph is one existing link prediction framework, but it is tailored to extremely large graphs and provides no support for facts with a temporal dimension or other fact properties.
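The manual mapping step amounts to building a vocabulary of entity and relation ids from the raw triples. A minimal sketch (the function name and example triples are my own, not from any existing framework):

```python
def build_vocab(triples):
    """Map entity and relation strings to consecutive integer ids,
    so every experiment can share one consistent vocabulary."""
    entities, relations = {}, {}
    for s, r, o in triples:
        for e in (s, o):          # subjects and objects share one id space
            entities.setdefault(e, len(entities))
        relations.setdefault(r, len(relations))
    return entities, relations

triples = [("paris", "capital_of", "france"),
           ("berlin", "capital_of", "germany")]
ent2id, rel2id = build_vocab(triples)
```

Standardising exactly this step is what a shared pipeline would give us: the same ids for the same entities in every experiment.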
What should a link prediction framework have?
Like other ML frameworks, it should cover encoding, training, prediction, saving, and loading. Encoding should let users feed in SRO-triplet data without any further processing steps. It should also support dynamic facts and weighted triplets. Rank-based evaluation and filtering should be available during the evaluation step.
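The "rank-based and filtering" part deserves a concrete sketch. Filtered ranking excludes other known true tails for the same (subject, relation) pair so they cannot push the target down the ranking. The function below is a hypothetical illustration, not an API from any existing framework:

```python
def filtered_rank(scores, true_tail, known_tails):
    """Rank of the true tail among candidates, after filtering.

    scores      : one score per candidate tail entity
    true_tail   : id of the tail we are evaluating
    known_tails : ids of other tails known to complete this (s, r) pair;
                  they are skipped so they do not distort the rank
    """
    target = scores[true_tail]
    rank = 1
    for tail_id, sc in enumerate(scores):
        if tail_id == true_tail or tail_id in known_tails:
            continue
        if sc > target:
            rank += 1
    return rank

# Entity 0 is another known true tail, so its high score is filtered out;
# only entity 2 outscores the target, giving rank 2 instead of 3.
rank = filtered_rank([0.9, 0.5, 0.8, 0.2], true_tail=1, known_tails={0})
```

From per-triple ranks one can then derive the usual metrics such as mean reciprocal rank or Hits@k.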
Link Prediction and "1.8 – Knowledge Bases"
Link prediction performance varies widely between knowledge bases, which means the structure of the knowledge graph heavily affects model performance.