Propagation Step

The propagation and output step produces the hidden states of nodes and edges. Each GNN variant has its own aggregator and updater for gathering information from each node’s neighbours and updating the node’s hidden state.

Convolution Aggregator

Generally, there are two main approaches:

  1. Spectral. Spectral approaches work with a spectral representation of the graph. Notable spectral approaches are listed below:
    • Spectral Network

    • ChebNet

    • GCN

    • AGCN

    • GGP

    The problem with spectral approaches is that they depend heavily on the graph structure: a model trained on one specific structure cannot be directly applied to a graph with a different structure. (A minimal GCN-layer sketch follows after the spatial list below.)

  2. Spatial (Non-spectral). Here, convolutions are defined directly on the graph, operating on spatially close neighbours. The main challenge is to define a convolution operation that works with differently sized neighbourhoods while preserving the local invariance property of CNNs. There are several approaches:
    • Neural FPs

    • DCNN (Diffusion CNN)

    • DGCN (Dual GCN)

    • PATCHY-SAN

    • LGCN

    • MoNet

    • GraphSAGE

    • SACNNs

Gate Updater

Gate updaters use a gating mechanism such as the GRU or LSTM to improve the long-term propagation of information across the graph. Variants include (a GGNN-style update is sketched in code after this list):

  • GRU – Gated Graph Neural Network

  • LSTM
    • Tree LSTM

    • Graph LSTM

    • Sentence LSTM
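As a sketch of the gated-updater idea, the GGNN-style step below first sums messages from neighbours and then applies the standard GRU equations, so each node learns how much of its old hidden state to keep. The hand-rolled GRU and its parameter names are simplifying assumptions for illustration, not the exact formulation of any one paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(A, H, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GGNN-style propagation step: message passing, then a GRU update."""
    M = A @ H                                 # a_v: sum of neighbour states
    Z = sigmoid(M @ Wz + H @ Uz)              # update gate
    R = sigmoid(M @ Wr + H @ Ur)              # reset gate
    H_cand = np.tanh(M @ Wh + (R * H) @ Uh)   # candidate hidden state
    return (1 - Z) * H + Z * H_cand           # gated blend of old and new

rng = np.random.default_rng(0)
n, d = 5, 8
A = (rng.random((n, n)) < 0.4).astype(float)      # random adjacency
H = rng.normal(size=(n, d))                       # initial hidden states
params = [rng.normal(size=(d, d)) * 0.1 for _ in range(6)]
H_next = ggnn_step(A, H, *params)                 # shape (n, d)
```

Because the gates can shut out incoming messages, repeated applications of this step carry information over long distances with less degradation than a plain sum-and-overwrite update.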

Attention Aggregator

The attention mechanism has been one of the contributing factors behind many successful breakthroughs in NLP. This inspired the Graph Attention Network (GAT), which incorporates the attention mechanism into the propagation step. There are two main networks (a single-head GAT sketch follows this list):

  1. Graph Attention Network (GAT)

  2. Gated Attention Network (GaAN) – also uses a multi-head attention mechanism, but applies self-attention across the heads to weight them, replacing GAT’s plain average over heads
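Below is a minimal single-head sketch of the GAT aggregator: neighbours are scored with a shared attention vector, scores are softmaxed over each neighbourhood, and features are combined with the resulting weights. The dense pairwise score matrix and the variable names are simplifying assumptions for illustration.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(A, H, W, a):
    """Single-head GAT: alpha_ij = softmax_j(LeakyReLU(a^T [Wh_i || Wh_j]))."""
    Z = H @ W                                  # transformed features Wh
    d = Z.shape[1]
    f = Z @ a[:d]                              # score contribution of node i
    g = Z @ a[d:]                              # score contribution of node j
    E = leaky_relu(f[:, None] + g[None, :])    # raw scores e_ij for all pairs
    A_hat = A + np.eye(A.shape[0])             # nodes also attend to themselves
    E = np.where(A_hat > 0, E, -np.inf)        # mask out non-neighbours
    E = E - E.max(axis=1, keepdims=True)       # numerically stable softmax
    alpha = np.exp(E)
    alpha /= alpha.sum(axis=1, keepdims=True)  # attention over each neighbourhood
    return alpha @ Z                           # attention-weighted aggregation

rng = np.random.default_rng(0)
A = (rng.random((5, 5)) < 0.4).astype(float)
H_next = gat_layer(A, rng.normal(size=(5, 8)),
                   rng.normal(size=(8, 16)), rng.normal(size=32))
```

A multi-head version runs several such layers in parallel and concatenates or averages their outputs; GaAN replaces that plain average with learned, self-attention-derived head weights.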

Skip Connection

It’s common and tempting to build a deep neural network, since more layers allow each node to aggregate information from a wider neighbourhood. However, experiments have shown that deeper models don’t necessarily perform better and can sometimes perform worse, because the extra layers also propagate noisy information. A common solution is the residual connection, but this tends to work less effectively with GCNs. There are a few GCN-specific solutions (a highway-style gate is sketched after this list):

  1. Jump Knowledge Network

  2. Highway GNN

  3. Column Network
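For illustration, a highway-style gate (in the spirit of the Highway GNN above; the helper names are made up for this sketch) blends each layer’s new output with that layer’s input, so a node can fall back on its earlier state when deeper layers only add noise.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_skip(H_prev, H_new, Wt, bt):
    """Highway-style skip connection between consecutive GNN layers."""
    T = sigmoid(H_prev @ Wt + bt)        # per-dimension carry gate
    return T * H_new + (1 - T) * H_prev  # pass new info or keep the old state
```

When the gate `T` saturates near zero, the layer is effectively skipped, which is what protects deep stacks from accumulating noisy messages.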

Hierarchical Graph

Pooling layers commonly follow convolution layers in CNNs. For graph data, much research has focused on hierarchical pooling layers on graphs. Graphs often carry a lot of important information in their hierarchical structure, and capturing it is useful for graph-level classification tasks. Two representative methods are listed below, followed by a DIFFPOOL sketch:

  • Edge-Conditioned Convolution (ECC)

  • DIFFPOOL
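The heart of DIFFPOOL is a learned soft cluster assignment that coarsens both the node features and the adjacency matrix, producing a smaller graph for the next level of the hierarchy. Here is a minimal sketch of just the pooling arithmetic; the GNNs that produce the embeddings and assignment logits are abstracted into plain inputs.

```python
import numpy as np

def diffpool(A, Z, S_logits):
    """DIFFPOOL coarsening: X' = S^T Z, A' = S^T A S.

    A: (n, n) adjacency, Z: (n, d) node embeddings,
    S_logits: (n, k) raw cluster scores from an assignment GNN.
    """
    S = np.exp(S_logits - S_logits.max(axis=1, keepdims=True))
    S = S / S.sum(axis=1, keepdims=True)   # row-wise softmax: soft assignments
    X_pool = S.T @ Z                       # pooled node features, (k, d)
    A_pool = S.T @ A @ S                   # pooled adjacency, (k, k)
    return X_pool, A_pool
```

Stacking several such pooling levels yields progressively coarser graphs, ending in a single representation suitable for graph-level classification.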
