OpenKE in PyTorch

  • 11 KE models in PyTorch
    • See above – the only one missing is HolE because I haven’t read the paper yet!
  • 3 types of loss functions
    • MarginLoss
    • SigmoidLoss
    • SoftplusLoss
  • 1 Negative Sampling class
  • TrainDataLoader and TestDataLoader
  • Trainer and Tester
  • 8 datasets
    • Each dataset has 6 main files:
      • train2id —> format: (e1, e2, r)
      • entity2id
      • relation2id
      • valid2id
      • test2id
      • type_constraint —> for each relation, identify which head and tail entities it can be related to (see the sample file layout after this list)
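For reference, each of these is a plain-text ID file: the first line is a count, and every following line is a whitespace-separated record. A rough illustration of the layout (the names and IDs below are made up; the exact contents depend on the dataset):

    # train2id.txt: first line is the number of training triples; then one "e1 e2 rel" line of IDs per triple
    3
    0 1 0
    2 3 1
    4 0 2

    # entity2id.txt: first line is the number of entities; then one "entity_name entity_id" pair per line
    3
    /m/entity_a 0
    /m/entity_b 1
    /m/entity_c 2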

How do all the files fit together?

  1. Load training data using TrainDataLoader
  2. Load testing data using TestDataLoader
  3. Define the KE model
  4. Define the loss by wrapping the KE model in the Negative Sampling class and specifying the loss type (e.g. MarginLoss)
  5. Train the model using Trainer + save the model
  6. Load the model + test it using Tester (see the end-to-end sketch below)
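Putting the workflow together, here is a minimal end-to-end sketch based on the standard OpenKE-PyTorch example (paths and hyperparameters are placeholders; exact argument names can differ slightly between versions):

    # minimal OpenKE pipeline sketch: load data, define model, wrap with loss, train, test
    from openke.config import Trainer, Tester
    from openke.module.model import TransE
    from openke.module.loss import MarginLoss
    from openke.module.strategy import NegativeSampling
    from openke.data import TrainDataLoader, TestDataLoader

    # 1. + 2. load training and testing data
    train_dataloader = TrainDataLoader(
        in_path="./benchmarks/FB15K237/",   # folder with train2id.txt, entity2id.txt, ...
        nbatches=100, threads=8, sampling_mode="normal",
        bern_flag=1, filter_flag=1, neg_ent=25, neg_rel=0)
    test_dataloader = TestDataLoader("./benchmarks/FB15K237/", "link")

    # 3. define the KE model
    transe = TransE(ent_tot=train_dataloader.get_ent_tot(),
                    rel_tot=train_dataloader.get_rel_tot(),
                    dim=200, p_norm=1, norm_flag=True)

    # 4. wrap the model with the Negative Sampling class and a loss type
    model = NegativeSampling(model=transe, loss=MarginLoss(margin=5.0),
                             batch_size=train_dataloader.get_batch_size())

    # 5. train the model and save it
    trainer = Trainer(model=model, data_loader=train_dataloader,
                      train_times=1000, alpha=1.0, use_gpu=True)
    trainer.run()
    transe.save_checkpoint('./checkpoint/transe.ckpt')

    # 6. load the saved model and test it
    transe.load_checkpoint('./checkpoint/transe.ckpt')
    tester = Tester(model=transe, data_loader=test_dataloader, use_gpu=True)
    tester.run_link_prediction(type_constrain=False)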

TrainDataLoader

This module has two classes: TrainDataLoader and TrainDataSampler. TrainDataLoader takes in the different inputs and returns a TrainDataSampler, which yields the data in batches through its __next__ method, as sketched below.
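A quick sketch of what consuming the loader looks like (the dict keys match the forward() methods shown later in these notes; internals are simplified):

    # iterating over a TrainDataLoader: __iter__ hands back a TrainDataSampler,
    # and each loop step calls its __next__ to get one batch (a dict of arrays)
    train_dataloader = TrainDataLoader(in_path="./benchmarks/FB15K237/", nbatches=100,
                                       sampling_mode="normal", neg_ent=25, neg_rel=0)

    for data in train_dataloader:
        data['batch_h']   # head entity IDs (positives followed by corrupted triples)
        data['batch_t']   # tail entity IDs
        data['batch_r']   # relation IDs
        data['batch_y']   # +1 / -1 labels for positive vs. negative triples
        data['mode']      # 'normal', 'head_batch', or 'tail_batch'
        break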

What’s the dataflow when Trainer starts?

  1. Trainer takes in a model (wrapped with Negative Sampling), TrainDataLoader, and other parameters
  2. Trainer.run()
    • Initialise the optimiser
    • Start the training loop (over the number of epochs)
      • Compute the loss using self.train_one_step(data)
        • The data is passed into the Negative Sampling class, which in turn passes it into the KE model via self.model(data) (self here is the Negative Sampling instance)
          • The KE model’s forward method computes the score from the head, tail, and relation embeddings
          • This score is passed back to the Negative Sampling class
        • The score returned by the KE model is split into a positive and a negative score, and these two scores are fed into self.loss, which is the specified loss class (e.g. MarginLoss)
        • Once the loss is computed, regularisation (if any) is added here before the loss is returned to the Trainer
      • The loss is printed out within each epoch
      • The model is saved at the end of the epoch (see the sketch after this list)
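A simplified sketch of the steps above, loosely following OpenKE’s NegativeSampling and MarginLoss (the batch-splitting and loss details are paraphrased, so treat this as illustrative rather than a verbatim copy):

    import torch
    import torch.nn as nn

    class NegativeSampling(nn.Module):
        def __init__(self, model, loss, batch_size, regul_rate=0.0):
            super().__init__()
            self.model = model            # the KE model (TransE, DistMult, ...)
            self.loss = loss              # the chosen loss class, e.g. MarginLoss
            self.batch_size = batch_size
            self.regul_rate = regul_rate

        def forward(self, data):
            score = self.model(data)                 # KE model scores every triple in the batch
            p_score = score[:self.batch_size]        # positive triples come first...
            n_score = score[self.batch_size:]        # ...followed by the corrupted ones
            loss = self.loss(p_score, n_score)       # compare positives against negatives
            if self.regul_rate != 0:                 # optional regularisation before returning
                loss = loss + self.regul_rate * self.model.regularization(data)
            return loss

    class MarginLoss(nn.Module):
        def __init__(self, margin=5.0):
            super().__init__()
            self.margin = margin

        def forward(self, p_score, n_score):
            # hinge-style ranking loss: for distance-based scores, positives should
            # score lower than negatives by at least the margin
            return torch.clamp(p_score - n_score + self.margin, min=0).mean()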

Evaluation Metrics

  1. Mean Reciprocal Rank (MRR) —> more robust than mean rank
    • Raw
      • Major weakness
        • The MRR metric does not evaluate the rest of the list of recommended items. It focuses on a single item from the list.
          • If the correct test triplet is ranked second on the list
            • If model A ranks a correct triple (from the train/valid set) first
            • If model B ranks an incorrect triple first
            • Both models A and B will get the same score, since the correct test triplet is ranked second on the list either way
        • It gives a list with a single relevant item just as much weight as a list with many relevant items. That is fine if that is the target of the evaluation.
          • If you detected the correct triple in a list of irrelevant items and this correct triple is ranked 10th (for example), does that make this a good link prediction model?
    • Filtered —> the rank is computed after removing corrupted triples that already appear in the train, valid, or test sets
      • Because some corrupted triplets may be in the training set and validation set. In this case, those corrupted triplets may be ranked above the test triplet, but this should not be counted as an error because both triplets are true.
  2. Hit@K —> K is usually 1, 3, and 10 (the proportion of test triples whose correct answer is ranked in the top K; see the small example after this list)
  3. Mean Rank
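To make these concrete, here is a small worked example computing Mean Rank, MRR, and Hits@K from a list of ranks of the correct entities (the ranks are made up for illustration):

    # ranks of the correct entity for five test triples (illustrative values)
    ranks = [1, 3, 2, 10, 50]

    mean_rank = sum(ranks) / len(ranks)                          # 13.2, dominated by the outlier rank 50
    mrr = sum(1.0 / r for r in ranks) / len(ranks)               # ~0.39, far less sensitive to the outlier
    hits_at_10 = sum(1 for r in ranks if r <= 10) / len(ranks)   # 0.8, i.e. 4 of 5 correct answers in the top 10

    print(mean_rank, mrr, hits_at_10)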

Models

DistMult

  • Basic idea: create embeddings for entities and relations and use them to perform inference on knowledge bases. The loss minimises a margin-based ranking objective
  • score = (h * r) * t
    self.dim = dim
    self.margin = margin
    self.epsilon = epsilon
    self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim)
    self.rel_embeddings = nn.Embedding(self.rel_tot, self.dim)
    
    nn.init.xavier_uniform_(self.ent_embeddings.weight.data)
    nn.init.xavier_uniform_(self.rel_embeddings.weight.data)
    
    def _calc(self, h, t, r, mode):
    	if mode != 'normal':
    		h = h.view(-1, r.shape[0], h.shape[-1])
    		t = t.view(-1, r.shape[0], t.shape[-1])
    		r = r.view(-1, r.shape[0], r.shape[-1])
    	if mode == 'head_batch':
    		score = h * (r * t)
    	else:
    		score = (h * r) * t
    	score = torch.sum(score, -1).flatten()
    	return score
    
    def forward(self, data):
    	batch_h = data['batch_h']
    	batch_t = data['batch_t']
    	batch_r = data['batch_r']
    	mode = data['mode']
    	h = self.ent_embeddings(batch_h)
    	t = self.ent_embeddings(batch_t)
    	r = self.rel_embeddings(batch_r)
    	score = self._calc(h ,t, r, mode)
    	return score
    

RESCAL

  • Models the knowledge graph as a three-way tensor, where the first two modes represent entities and the third mode represents relations; each relation is represented by a dim × dim matrix (rel_matrices below)
    self.dim = dim
    self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim)
    self.rel_matrices = nn.Embedding(self.rel_tot, self.dim * self.dim)
    
    nn.init.xavier_uniform_(self.ent_embeddings.weight.data)
    nn.init.xavier_uniform_(self.rel_matrices.weight.data)
    
    def _calc(self, h, t, r):
    	t = t.view(-1, self.dim, 1)
    	r = r.view(-1, self.dim, self.dim)
    	tr = torch.matmul(r, t)
    	tr = tr.view(-1, self.dim)
    	return -torch.sum(h * tr, -1)
    
    def forward(self, data):
    	batch_h = data['batch_h']
    	batch_t = data['batch_t']
    	batch_r = data['batch_r']
    	h = self.ent_embeddings(batch_h)
    	t = self.ent_embeddings(batch_t)
    	r = self.rel_matrices(batch_r)
    	score = self._calc(h ,t, r)
    	return score
    

ComplEx

  • Map entities and relation embeddings into complex vector space!

Initialisation

# All real and imaginary entity and relation embeddings have the same dimension
self.ent_re_embeddings = nn.Embedding(self.ent_tot, self.dim)
self.ent_im_embeddings = nn.Embedding(self.ent_tot, self.dim)
self.rel_re_embeddings = nn.Embedding(self.rel_tot, self.dim)
self.rel_im_embeddings = nn.Embedding(self.rel_tot, self.dim)

# They are all initialised using Xavier uniform
nn.init.xavier_uniform_(self.ent_re_embeddings.weight.data)
nn.init.xavier_uniform_(self.ent_im_embeddings.weight.data)
nn.init.xavier_uniform_(self.rel_re_embeddings.weight.data)
nn.init.xavier_uniform_(self.rel_im_embeddings.weight.data)

Scoring function

def _calc(self, h_re, h_im, t_re, t_im, r_re, r_im):
      return torch.sum(
          h_re * t_re * r_re
          + h_im * t_im * r_re
          + h_re * t_im * r_im
          - h_im * t_re * r_im,
          -1
      )

Forward function (pretty straightforward)

def forward(self, data):
      batch_h = data['batch_h']
      batch_t = data['batch_t']
      batch_r = data['batch_r']
      h_re = self.ent_re_embeddings(batch_h)
      h_im = self.ent_im_embeddings(batch_h)
      t_re = self.ent_re_embeddings(batch_t)
      t_im = self.ent_im_embeddings(batch_t)
      r_re = self.rel_re_embeddings(batch_r)
      r_im = self.rel_im_embeddings(batch_r)
      score = self._calc(h_re, h_im, t_re, t_im, r_re, r_im)
      return score

Analogy

  • Optimise latent representations using the analogical properties of the embedded entities and relations
  • It’s a unified framework that subsumes DistMult, ComplEx, and HolE as special cases, making Analogy the “master” model; the analogies in question are of the “man is to king as woman is to queen” kind

Initialisation

self.ent_re_embeddings = nn.Embedding(self.ent_tot, self.dim)
self.ent_im_embeddings = nn.Embedding(self.ent_tot, self.dim)
self.rel_re_embeddings = nn.Embedding(self.rel_tot, self.dim)
self.rel_im_embeddings = nn.Embedding(self.rel_tot, self.dim)
self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim * 2)
self.rel_embeddings = nn.Embedding(self.rel_tot, self.dim * 2)

nn.init.xavier_uniform_(self.ent_re_embeddings.weight.data)
nn.init.xavier_uniform_(self.ent_im_embeddings.weight.data)
nn.init.xavier_uniform_(self.rel_re_embeddings.weight.data)
nn.init.xavier_uniform_(self.rel_im_embeddings.weight.data)
nn.init.xavier_uniform_(self.ent_embeddings.weight.data)
nn.init.xavier_uniform_(self.rel_embeddings.weight.data)

Score function

def _calc(self, h_re, h_im, h, t_re, t_im, t, r_re, r_im, r):
		return (-torch.sum(r_re * h_re * t_re +
						   r_re * h_im * t_im +
						   r_im * h_re * t_im -
						   r_im * h_im * t_re, -1)
				-torch.sum(h * t * r, -1))

def forward(self, data):
		batch_h = data['batch_h']
		batch_t = data['batch_t']
		batch_r = data['batch_r']
		h_re = self.ent_re_embeddings(batch_h)
		h_im = self.ent_im_embeddings(batch_h)
		h = self.ent_embeddings(batch_h)
		t_re = self.ent_re_embeddings(batch_t)
		t_im = self.ent_im_embeddings(batch_t)
		t = self.ent_embeddings(batch_t)
		r_re = self.rel_re_embeddings(batch_r)
		r_im = self.rel_im_embeddings(batch_r)
		r = self.rel_embeddings(batch_r)
		score = self._calc(h_re, h_im, h, t_re, t_im, t, r_re, r_im, r)
		return score

SimplE

  • Two vectors for each entity: one used when the entity is the head and the other when it is the tail
  • Two vectors for each relation: the normal relation vector and the inverse relation vector
  • The similarity function is the average of the CP scores of the triple and its respective inverse triple
  • SimplE-ignr —> during training, for each correct / incorrect triple, SimplE-ignr updates the embeddings so that both scores (of the triple and its inverse) become larger / smaller. During testing, SimplE-ignr ignores the inverse relations!
    self.dim = dim
    self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim)
    self.rel_embeddings = nn.Embedding(self.rel_tot, self.dim)
    self.rel_inv_embeddings = nn.Embedding(self.rel_tot, self.dim)
    
    nn.init.xavier_uniform_(self.ent_embeddings.weight.data)
    nn.init.xavier_uniform_(self.rel_embeddings.weight.data)
    nn.init.xavier_uniform_(self.rel_inv_embeddings.weight.data)
    
    # the similarity function
    def _calc_avg(self, h, t, r, r_inv):
        return (torch.sum(h * r * t, -1) + torch.sum(h * r_inv * t, -1))/2
    
    # SimplE-ignr
    def _calc_ingr(self, h, r, t):
        return torch.sum(h * r * t, -1)
    
    def forward(self, data):
        batch_h = data['batch_h']
        batch_t = data['batch_t']
        batch_r = data['batch_r']
        h = self.ent_embeddings(batch_h)
        t = self.ent_embeddings(batch_t)
        r = self.rel_embeddings(batch_r)
        r_inv = self.rel_inv_embeddings(batch_r)
        score = self._calc_avg(h, t, r, r_inv)
        return score
    

TransE

  • Xavier uniform initialisation
  • Normalisation of the embeddings (optional L2 normalisation)
  • score = ||(h + r) - t||, the p-norm of the translation error (see the _calc sketch after the forward code below)
    def forward(self, data):
    		batch_h = data['batch_h']
    		batch_t = data['batch_t']
    		batch_r = data['batch_r']
    		mode = data['mode']
    		h = self.ent_embeddings(batch_h)
    		t = self.ent_embeddings(batch_t)
    		r = self.rel_embeddings(batch_r)
    		score = self._calc(h ,t, r, mode)
    		if self.margin_flag:
    			return self.margin - score
    		else:
    			return score
    
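The forward above delegates the actual scoring to self._calc, which isn’t shown in these notes. A sketch of it, following the OpenKE pattern (optional L2 normalisation, a reshape for the head/tail batch modes, then the p-norm of the translation error):

    import torch
    import torch.nn.functional as F

    # sketch of TransE._calc: the score is the p-norm of h + r - t (lower = better)
    def _calc(self, h, t, r, mode):
        if self.norm_flag:                       # optionally L2-normalise the embeddings
            h = F.normalize(h, 2, -1)
            r = F.normalize(r, 2, -1)
            t = F.normalize(t, 2, -1)
        if mode != 'normal':                     # reshape so corrupted candidates broadcast against r
            h = h.view(-1, r.shape[0], h.shape[-1])
            t = t.view(-1, r.shape[0], t.shape[-1])
            r = r.view(-1, r.shape[0], r.shape[-1])
        if mode == 'head_batch':
            score = h + (r - t)
        else:
            score = (h + r) - t
        score = torch.norm(score, self.p_norm, -1).flatten()   # p-norm of the translation error
        return score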

TransH

  • Project the head and tail embeddings onto the relation-specific hyperplane (defined by its normal vector) first, and then compute score = (h_proj + r) - t_proj
    # Project embeddings onto the relation-specific hyperplane using its normal vector
    def _transfer(self, e, norm):
    		norm = F.normalize(norm, p = 2, dim = -1)
    		if e.shape[0] != norm.shape[0]:
    			e = e.view(-1, norm.shape[0], e.shape[-1])
    			norm = norm.view(-1, norm.shape[0], norm.shape[-1])
    			e = e - torch.sum(e * norm, -1, True) * norm
    			return e.view(-1, e.shape[-1])
    		else:
    			return e - torch.sum(e * norm, -1, True) * norm
    
    	def forward(self, data):
    		batch_h = data['batch_h']
    		batch_t = data['batch_t']
    		batch_r = data['batch_r']
    		mode = data['mode']
    		h = self.ent_embeddings(batch_h)
    		t = self.ent_embeddings(batch_t)
    		r = self.rel_embeddings(batch_r)
    		r_norm = self.norm_vector(batch_r)
    		h = self._transfer(h, r_norm)
    		t = self._transfer(t, r_norm)
    		score = self._calc(h ,t, r, mode)
    		if self.margin_flag:
    			return self.margin - score
    		else:
    			return score
    

TransR

  • Map entity and relation embeddings into two distinct spaces, entity space and relation space, and perform the translation in relation space! This requires the model to use a projection matrix to transform entity embeddings into the relation space before translating.
  • score = (h_r + r) - t_r, where h_r and t_r are the head and tail entities projected into the relation space
    self.dim_e = dim_e # entity dimension
    self.dim_r = dim_r # relation dimension
    self.norm_flag = norm_flag
    self.p_norm = p_norm
    self.rand_init = rand_init
    
    # entity and relation dimension doesn't have to be the same
    self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim_e)
    self.rel_embeddings = nn.Embedding(self.rel_tot, self.dim_r)
    
    nn.init.xavier_uniform_(self.ent_embeddings.weight.data)
    nn.init.xavier_uniform_(self.rel_embeddings.weight.data)
    
    # either random initialisation or xavier uniform
    self.transfer_matrix = nn.Embedding(self.rel_tot, self.dim_e * self.dim_r)
    
    def _transfer(self, e, r_transfer):
    		r_transfer = r_transfer.view(-1, self.dim_e, self.dim_r)
    		if e.shape[0] != r_transfer.shape[0]:
    			e = e.view(-1, r_transfer.shape[0], self.dim_e).permute(1, 0, 2)
    			e = torch.matmul(e, r_transfer).permute(1, 0, 2)
    		else:
    			e = e.view(-1, 1, self.dim_e)
    			e = torch.matmul(e, r_transfer)
    		return e.view(-1, self.dim_r)
    
    	def forward(self, data):
    		batch_h = data['batch_h']
    		batch_t = data['batch_t']
    		batch_r = data['batch_r']
    		mode = data['mode']
    		h = self.ent_embeddings(batch_h)
    		t = self.ent_embeddings(batch_t)
    		r = self.rel_embeddings(batch_r)
    		r_transfer = self.transfer_matrix(batch_r)
    		h = self._transfer(h, r_transfer)
    		t = self._transfer(t, r_transfer)
    		score = self._calc(h ,t, r, mode)
    		if self.margin_flag:
    			return self.margin - score
    		else:
    			return score
    

TransD

  • Each entity and relation has two vectors: one represents the meaning of the entity or relation, and the other is used to construct the mapping matrices
    • h, h_p, r, r_p, t, t_p
      • h_p, r_p, and t_p are used to construct the mapping matrices!
    • The mapping matrices project the entity embeddings into the relation space, and r then performs the translation!
  • score = (h_transformed + r) - t_transformed
  • TransE, TransH, and TransR are related to TransD
    • TransE is a special case of TransD where dimension of entities (m) = dimension of relations (n) and all projection vectors are 0
    • TransH is related to TransD when m = n. The only difference is that in TransH the projection vectors are determined solely by the relations
    • TransR defines a single mapping matrix per relation, whereas TransD improves on this by dynamically constructing two mapping matrices for each triple from the projection vectors of its entities and relation
    self.dim_e = dim_e
    self.dim_r = dim_r
    
    self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim_e)
    self.rel_embeddings = nn.Embedding(self.rel_tot, self.dim_r)
    self.ent_transfer = nn.Embedding(self.ent_tot, self.dim_e)
    self.rel_transfer = nn.Embedding(self.rel_tot, self.dim_r)
    
    def _transfer(self, e, e_transfer, r_transfer):
    		if e.shape[0] != r_transfer.shape[0]:
    			e = e.view(-1, r_transfer.shape[0], e.shape[-1])
    			e_transfer = e_transfer.view(-1, r_transfer.shape[0], e_transfer.shape[-1])
    			r_transfer = r_transfer.view(-1, r_transfer.shape[0], r_transfer.shape[-1])
    			e = F.normalize(
    				self._resize(e, -1, r_transfer.size()[-1]) + torch.sum(e * e_transfer, -1, True) * r_transfer,
    				p = 2, 
    				dim = -1
    			)			
    			return e.view(-1, e.shape[-1])
    		else:
    			return F.normalize(
    				self._resize(e, -1, r_transfer.size()[-1]) + torch.sum(e * e_transfer, -1, True) * r_transfer,
    				p = 2, 
    				dim = -1
    			)
    
    def forward(self, data):
    	batch_h = data['batch_h']
    	batch_t = data['batch_t']
    	batch_r = data['batch_r']
    	mode = data['mode']
    	h = self.ent_embeddings(batch_h)
    	t = self.ent_embeddings(batch_t)
    	r = self.rel_embeddings(batch_r)
    	h_transfer = self.ent_transfer(batch_h)
    	t_transfer = self.ent_transfer(batch_t)
    	r_transfer = self.rel_transfer(batch_r)
    	h = self._transfer(h, h_transfer, r_transfer)
    	t = self._transfer(t, t_transfer, r_transfer)
    	score = self._calc(h ,t, r, mode)
    	if self.margin_flag:
    		return self.margin - score
    	else:
    		return score
    
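_transfer above also calls self._resize, which isn’t shown here. Roughly, it truncates or zero-pads the entity vector along an axis so its length matches the relation dimension; a hedged sketch (padding details paraphrased, not a verbatim copy of OpenKE):

    import torch
    import torch.nn.functional as F

    # sketch of a _resize helper: make `tensor` have length `size` along `axis`
    def _resize(self, tensor, axis, size):
        axis = axis % tensor.dim()                   # allow negative axes (e.g. -1 for the last dim)
        osize = tensor.size(axis)
        if osize == size:
            return tensor
        if osize > size:
            return torch.narrow(tensor, axis, 0, size)          # truncate
        # F.pad expects (left, right) pairs ordered from the last dimension backwards
        pad = []
        for dim in range(tensor.dim() - 1, -1, -1):
            pad.extend([0, size - osize] if dim == axis else [0, 0])
        return F.pad(tensor, pad, mode="constant", value=0)      # zero-pad on the right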

RotatE

  • Can model and infer different relation patterns such as symmetry/antisymmetry, inversion, and composition —> each relation is a rotation from the source entity to the target entity in the complex vector space
  • Motivated by Euler’s identity, which shows that a unit-modulus complex number can be seen as a rotation in the complex plane
  • score = ||(h ∘ r) - t|| (∘ is element-wise multiplication in complex space); the forward below returns margin minus this distance
    self.margin = margin
    self.epsilon = epsilon
    
    self.dim_e = dim * 2
    self.dim_r = dim
    
    self.ent_embeddings = nn.Embedding(self.ent_tot, self.dim_e)
    self.rel_embeddings = nn.Embedding(self.rel_tot, self.dim_r)
    
    def _calc(self, h, t, r, mode):
    	pi = self.pi_const
    
    	re_head, im_head = torch.chunk(h, 2, dim=-1)
    	re_tail, im_tail = torch.chunk(t, 2, dim=-1)
    
    	phase_relation = r / (self.rel_embedding_range.item() / pi)
    
    	re_relation = torch.cos(phase_relation)
    	im_relation = torch.sin(phase_relation)
    
    	re_head = re_head.view(-1, re_relation.shape[0], re_head.shape[-1]).permute(1, 0, 2)
    	re_tail = re_tail.view(-1, re_relation.shape[0], re_tail.shape[-1]).permute(1, 0, 2)
    	im_head = im_head.view(-1, re_relation.shape[0], im_head.shape[-1]).permute(1, 0, 2)
    	im_tail = im_tail.view(-1, re_relation.shape[0], im_tail.shape[-1]).permute(1, 0, 2)
    	im_relation = im_relation.view(-1, re_relation.shape[0], im_relation.shape[-1]).permute(1, 0, 2)
    	re_relation = re_relation.view(-1, re_relation.shape[0], re_relation.shape[-1]).permute(1, 0, 2)
    
    	if mode == "head_batch":
    		re_score = re_relation * re_tail + im_relation * im_tail
    		im_score = re_relation * im_tail - im_relation * re_tail
    		re_score = re_score - re_head
    		im_score = im_score - im_head
    	else:
    		re_score = re_head * re_relation - im_head * im_relation
    		im_score = re_head * im_relation + im_head * re_relation
    		re_score = re_score - re_tail
    		im_score = im_score - im_tail
    
    	score = torch.stack([re_score, im_score], dim = 0)
    	score = score.norm(dim = 0).sum(dim = -1)
    	return score.permute(1, 0).flatten()
    
    def forward(self, data):
    	batch_h = data['batch_h']
    	batch_t = data['batch_t']
    	batch_r = data['batch_r']
    	mode = data['mode']
    	h = self.ent_embeddings(batch_h)
    	t = self.ent_embeddings(batch_t)
    	r = self.rel_embeddings(batch_r)
    	score = self.margin - self._calc(h ,t, r, mode)
    	return score
    