Encoding Models

This section focuses on encoding the interactions between entities and relations. There are seven families of encoding models, as listed below.

  1. Linear / Bilinear

  2. Factorisation

  3. Neural Networks (NNs)

  4. Convolutional NNs

  5. Recurrent NNs

  6. Transformers

  7. Graph NNs

Linear / Bilinear

This type of encoding model uses linear or bilinear operations to encode interactions between entities and relations. Examples of such encoding models include:

  • SE

  • SME

  • DistMult

  • ComplEx


  • TransE with L2 regularisation, whose scoring function can be expanded so that it contains only linear transformation terms

Empirically, ensembles of multiple linear models have been shown to improve predictive performance.
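As a minimal illustration of a bilinear scoring function, the sketch below computes DistMult's score with toy NumPy embeddings (the embeddings and dimension are hypothetical, not taken from any released model). DistMult restricts the relation matrix to a diagonal, which reduces the bilinear form to a three-way inner product:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy embeddings (hypothetical): head entity h, relation r, tail entity t.
h = rng.standard_normal(dim)
r = rng.standard_normal(dim)  # diagonal of the relation matrix M_r
t = rng.standard_normal(dim)

def distmult_score(h, r, t):
    """Bilinear score <h, r, t> = sum_i h_i * r_i * t_i."""
    return float(np.sum(h * r * t))

score = distmult_score(h, r, t)
# The diagonal restriction makes DistMult symmetric: f(h, r, t) == f(t, r, h).
assert np.isclose(distmult_score(h, r, t), distmult_score(t, r, h))
```

The symmetry checked in the last line is exactly why DistMult cannot model antisymmetric relations, a limitation that ComplEx addresses by moving to complex-valued embeddings.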


Factorisation

This family formulates KRL models as the decomposition of a three-way tensor X. The general principle can be denoted as $X_{hrt} ≈ h^T M_r t$, with the composition function following the semantic matching pattern.
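The tensor-decomposition principle above can be sketched as follows, in the style of RESCAL: each entry of the tensor is approximated by a bilinear form with a full relation-specific matrix. The embeddings here are random toy values for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 6

# Hypothetical toy parameters: h and t are entity vectors,
# M_r is a full (non-diagonal) relation-specific matrix.
h = rng.standard_normal(dim)
M_r = rng.standard_normal((dim, dim))
t = rng.standard_normal(dim)

def bilinear_score(h, M_r, t):
    """One reconstructed tensor entry: X_hrt ≈ h^T M_r t."""
    return float(h @ M_r @ t)

score = bilinear_score(h, M_r, t)
```

Because M_r is a full matrix, this form subsumes the diagonal (DistMult) case at the cost of O(d^2) parameters per relation.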

Neural Networks (NNs)

Generally, NNs take in the entity and relation embeddings and compute a semantic matching score. A simple MLP, for instance, concatenates the head, relation and tail embeddings and passes them through fully connected layers to produce a scalar plausibility score.
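A minimal sketch of such an MLP scorer, with one hidden layer and randomly initialised toy weights (all names and sizes here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
dim, hidden = 8, 16

# Hypothetical untrained weights: concatenate (h, r, t) -> hidden layer -> score.
W1 = rng.standard_normal((hidden, 3 * dim)) * 0.1
b1 = np.zeros(hidden)
w2 = rng.standard_normal(hidden) * 0.1

def mlp_score(h, r, t):
    """Semantic matching score from a single-hidden-layer MLP."""
    x = np.concatenate([h, r, t])      # joint input representation
    hdn = np.tanh(W1 @ x + b1)         # non-linear hidden layer
    return float(w2 @ hdn)             # scalar plausibility score

h, r, t = (rng.standard_normal(dim) for _ in range(3))
score = mlp_score(h, r, t)
# tanh outputs lie in [-1, 1], so the score is bounded by the L1 norm of w2.
assert abs(score) <= np.sum(np.abs(w2)) + 1e-9
```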


Convolutional NNs

This type of encoding model is used to capture deep, expressive features. ConvE models semantic information through non-linear feature learning across multiple convolutional layers; these non-linear features can be concatenated to increase the capacity for learning latent features. ConvKB has been reported to outperform ConvE experimentally, which is attributed to its preservation of transitional characteristics.
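The core ConvE idea can be sketched as follows: reshape the head and relation embeddings into 2D "images", stack them, convolve, and project the flattened feature map back to the embedding space before taking a dot product with the tail. This is a simplified toy version (single filter, naive convolution, random hypothetical weights), not ConvE's released implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
dim, side = 16, 4  # embeddings reshaped into 4x4 "images"

h, r, t = (rng.standard_normal(dim) for _ in range(3))
kernel = rng.standard_normal((3, 3)) * 0.1  # one hypothetical conv filter

def conv2d_valid(img, k):
    """Naive 'valid' 2D cross-correlation with a single filter."""
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

def conve_style_score(h, r, t, kernel, W):
    # Stack reshaped h and r into one 2D input, as in ConvE.
    img = np.concatenate([h.reshape(side, side), r.reshape(side, side)], axis=0)
    feat = np.maximum(conv2d_valid(img, kernel), 0).ravel()  # ReLU feature map
    return float((W @ feat) @ t)  # project to dim, then dot with the tail

# Hypothetical projection from the flattened feature map back to dim.
W = rng.standard_normal((dim, (2 * side - 2) * (side - 2))) * 0.1
score = conve_style_score(h, r, t, kernel, W)
```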


Recurrent NNs

RNNs can capture long-term dependencies in knowledge graphs. Several research works use an RNN to compute a vector representation of a long relation path from one entity to another, with or without entity-level information.
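Composing a relation path with a recurrent network can be sketched as below: a vanilla RNN folds the sequence of relation embeddings along the path into a single path vector. Weights and dimensions are hypothetical toy values:

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 8

# Hypothetical untrained weights for a vanilla RNN cell.
W_h = rng.standard_normal((dim, dim)) * 0.1  # recurrent weights
W_x = rng.standard_normal((dim, dim)) * 0.1  # input (relation) weights

def encode_path(relations):
    """Fold a sequence of relation embeddings into one path vector."""
    state = np.zeros(dim)
    for rel in relations:
        state = np.tanh(W_h @ state + W_x @ rel)  # recurrent update
    return state

path = [rng.standard_normal(dim) for _ in range(3)]  # e.g. r1 -> r2 -> r3
path_vec = encode_path(path)
assert path_vec.shape == (dim,)
```

The resulting path vector can then be compared against a single relation's embedding, which is the basis of path-based composition approaches.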


Transformers

Using transformers can boost results through contextualised representations, which can be used to encode edges and path sequences. KG-BERT, which borrows the ideas of language modelling and pre-training, can be used as an encoder for entities and relations.
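The contextualisation that transformers provide comes from self-attention: every token's representation is recomputed as a weighted mixture of all tokens in the sequence. The toy sketch below applies single-head scaled dot-product attention to a triple treated as a 3-token sequence (all weights are random illustrative values, not KG-BERT's):

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 8

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention over a token sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # each token attends to all tokens
    return A @ V                                 # contextualised representations

# A triple (h, r, t) treated as a 3-token sequence, KG-BERT-style.
X = rng.standard_normal((3, dim))
Wq, Wk, Wv = (rng.standard_normal((dim, dim)) * 0.3 for _ in range(3))
ctx = self_attention(X, Wq, Wk, Wv)
assert ctx.shape == X.shape  # one contextualised vector per input token
```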


Graph NNs

GNNs are used to learn the connectivity structure under an encoder-decoder framework. R-GCN applies relation-specific transformations to model the directed nature of knowledge graphs. Other research uses graph attention networks to capture multi-hop neighbourhood features.
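One R-GCN layer can be sketched as follows: each node aggregates messages from its in-neighbours, transformed by a weight matrix chosen per relation type, plus a self-loop term. This toy version uses random hypothetical weights and omits R-GCN's neighbour normalisation and basis decomposition:

```python
import numpy as np

rng = np.random.default_rng(6)
dim = 8

# Toy directed graph: edges as (source, relation, target); two relation types.
edges = [(0, 0, 1), (1, 1, 2), (0, 1, 2)]
n_nodes, n_rels = 3, 2

H = rng.standard_normal((n_nodes, dim))                 # input node features
W_rel = rng.standard_normal((n_rels, dim, dim)) * 0.1   # relation-specific weights
W_self = rng.standard_normal((dim, dim)) * 0.1          # self-loop weight

def rgcn_layer(H, edges):
    """One simplified R-GCN layer with per-relation message transforms."""
    out = H @ W_self.T                    # self-loop term for every node
    for src, rel, dst in edges:           # directed edges, as in a KG
        out[dst] += W_rel[rel] @ H[src]   # relation-specific message
    return np.maximum(out, 0)             # ReLU non-linearity

H1 = rgcn_layer(H, edges)
assert H1.shape == (n_nodes, dim)
```

Stacking such layers lets each node's representation absorb multi-hop neighbourhood structure, which a decoder (e.g. DistMult in the original R-GCN paper) then scores.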

