Independencies in MRFs

How does an undirected graph describe independencies? In this case, it is actually very simple: any two variables x and y are dependent if they are connected by a path of unobserved variables. Conversely, if all of x's neighbours are observed, then x is independent of all the other variables, since any influence on x must pass through its neighbours. See the figure below for an illustration:

More generally, if a set of observed variables forms a cut-set between two halves of the graph, then the variables in one half are independent of those in the other. See the figure below:
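This cut-set property can be checked mechanically: x and y are independent given the observed set exactly when no path of unobserved variables connects them. Below is a minimal sketch using a breadth-first search over unobserved nodes; the graph, variable names, and `separated` function are illustrative, not from any particular library.

```python
from collections import deque

def separated(adj, x, y, observed):
    """Return True if every path between x and y passes through
    an observed variable, i.e. x and y are conditionally independent."""
    if x in observed or y in observed:
        raise ValueError("query variables must be unobserved")
    seen = {x}
    queue = deque([x])
    while queue:
        node = queue.popleft()
        if node == y:
            return False  # reached y via unobserved variables only
        for nbr in adj[node]:
            if nbr not in seen and nbr not in observed:
                seen.add(nbr)
                queue.append(nbr)
    return True

# Hypothetical chain-shaped MRF: a - b - c - d
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(separated(adj, "a", "d", set()))   # False: an unobserved path a-b-c-d exists
print(separated(adj, "a", "d", {"c"}))   # True: observing c cuts the graph
```

Observing either b or c alone is enough here, since each one is a cut-set separating a from d in the chain.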

Conditional Random Fields

Conditional random fields (CRFs) are a special case of MRFs used to model a conditional probability distribution p(y | x). They are popular in supervised learning, where you are given an input x and want to predict an output y; this setting is known as structured prediction.

One application of conditional random fields is optical character recognition (OCR): scanning a document and recognising and outputting all the text it contains.

Formally, a CRF is a Markov network that specifies a conditional distribution as follows:

\( P(y | x) = \frac{1}{Z(x)} \prod_{c \in C} \phi_c(x_c, y_c) \).

with the partition function given by:

\( Z(x) = \sum_{y \in Y}\prod_{c \in C} \phi_c(x_c, y_c) \).

Note that the partition function depends on x. This shows that p(y | x) is a probability distribution over y that is parametrised by x; in other words, there is a different distribution for each input x. In this sense, a CRF results in a new MRF for each input x!
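The two formulas above can be made concrete by brute force on a tiny chain CRF. The sketch below uses hypothetical unary and pairwise potentials (`unary`, `pairwise` are illustrative choices, not from the text) and computes Z(x) by summing the product of factors over every label sequence y, so that p(y | x) sums to 1 for each fixed x:

```python
import itertools
import math

# Hypothetical chain CRF over binary labels y = (y_1, ..., y_n):
# one unary factor phi(x_i, y_i) per position, one pairwise factor per edge.

def unary(x_i, y_i):
    # Illustrative potential: prefer y_i = 1 when x_i is large.
    return math.exp(x_i if y_i == 1 else -x_i)

def pairwise(y_i, y_j):
    # Illustrative potential: encourage neighbouring labels to agree.
    return 2.0 if y_i == y_j else 1.0

def score(x, y):
    """Unnormalised product of all clique potentials phi_c(x_c, y_c)."""
    s = 1.0
    for x_i, y_i in zip(x, y):
        s *= unary(x_i, y_i)
    for y_i, y_j in zip(y, y[1:]):
        s *= pairwise(y_i, y_j)
    return s

def crf_prob(x, y):
    """p(y | x) = score(x, y) / Z(x), with Z(x) summed over all y."""
    Z = sum(score(x, yp) for yp in itertools.product([0, 1], repeat=len(x)))
    return score(x, y) / Z

x = [0.5, -1.0, 0.3]
probs = {y: crf_prob(x, y) for y in itertools.product([0, 1], repeat=len(x))}
print(sum(probs.values()))  # sums to 1: a valid distribution over y for this x
```

Changing x changes every score and hence Z(x), giving a genuinely different distribution over y, which is exactly the "new MRF for each input x" point above. In practice Z(x) is computed with dynamic programming (e.g. the forward algorithm on chains) rather than by enumeration.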


