What is a Markov chain?
A Markov chain is a model of the probabilities of sequences of random variables (states), each of which takes its value from some set. The set can consist of words, tags, or any other symbols. A Markov chain assumes that the current state is all we need to predict future states: states before the current one affect the future only through the current state. In other words, the Markov assumption is that when predicting the future, only the present matters, not the past.
A Markov chain has the following components:

A set of N states

A transition probability matrix, where each entry gives the probability of moving from state i to state j; the probabilities on the outgoing edges of each state must sum to 1

An initial probability distribution over states, giving the probability that the Markov chain starts in each state. Some states may have probability 0, meaning they cannot be initial states
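The three components above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the two weather states and every probability are made-up assumptions, and `check_distribution` and `sequence_probability` are hypothetical helper names.

```python
# Hypothetical two-state chain; all numbers are invented for illustration.
states = ["HOT", "COLD"]

# transition[i][j] = P(next state = j | current state = i); each row sums to 1.
transition = {
    "HOT": {"HOT": 0.6, "COLD": 0.4},
    "COLD": {"HOT": 0.5, "COLD": 0.5},
}

# initial[s] = P(chain starts in state s); a state may have probability 0.
initial = {"HOT": 0.8, "COLD": 0.2}


def check_distribution(dist, tol=1e-9):
    """Return True if the values are non-negative and sum to 1."""
    return all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) < tol


def sequence_probability(seq):
    """P(seq) under the Markov assumption: each step depends only on the previous state."""
    if not seq:
        return 1.0
    prob = initial[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= transition[prev][cur]
    return prob


assert check_distribution(initial)
assert all(check_distribution(row) for row in transition.values())
print(sequence_probability(["HOT", "HOT", "COLD"]))
```

The probability of a sequence is the initial probability of its first state multiplied by one transition probability per step, which is exactly what the Markov assumption permits.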
What is a hidden event?
Hidden events are events that we are interested in but cannot observe directly. Take the task of tagging words: we can observe the words, but what we are interested in is the tags. The tags are hidden because we cannot observe them directly.
What is a hidden Markov model (HMM)?
An HMM lets us model both observed and hidden events probabilistically, treating the hidden events as causal factors of the observed ones. An HMM has the following components:

A set of N states

A transition probability matrix

A sequence of T observations

A sequence of observation likelihoods (emission probabilities), each giving the probability of an observation being generated from a given state

An initial probability distribution
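As a rough sketch, the five components can be bundled into one small Python container. The `HMM` class, its field names, and the toy tagging probabilities below are all assumptions made for illustration; they are not taken from any library or from real data.

```python
from dataclasses import dataclass


@dataclass
class HMM:
    states: list        # the N hidden states
    transition: dict    # transition[i][j] = P(state j | state i)
    emission: dict      # emission[s][o] = P(observation o | state s)
    initial: dict       # initial[s] = P(first state is s)

    def validate(self, tol=1e-9):
        """Check that every probability distribution in the model sums to 1."""
        rows = [self.initial]
        rows += [self.transition[s] for s in self.states]
        rows += [self.emission[s] for s in self.states]
        return all(abs(sum(r.values()) - 1.0) < tol for r in rows)


# Toy tagging model: tags are the hidden states, words are the observations.
# Every probability here is invented.
tagger = HMM(
    states=["NOUN", "VERB"],
    transition={
        "NOUN": {"NOUN": 0.3, "VERB": 0.7},
        "VERB": {"NOUN": 0.8, "VERB": 0.2},
    },
    emission={
        "NOUN": {"dogs": 0.5, "bark": 0.1, "cats": 0.4},
        "VERB": {"dogs": 0.1, "bark": 0.8, "cats": 0.1},
    },
    initial={"NOUN": 0.9, "VERB": 0.1},
)

observations = ["dogs", "bark"]  # the sequence of T observations (here T = 2)
print(tagger.validate())
```

The observation sequence is kept outside the container because it changes from input to input, while the states and probabilities stay fixed for a given model.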
What are the two assumptions of a first-order HMM?

The probability of a particular state depends only on the previous state

The probability of an output observation depends only on the state that produces the observation
The figure below shows an example of an HMM for the ice cream task, where the goal is to find the “hidden” sequence of weather states that caused Jason to eat ice cream. The number of ice creams eaten per day is our sequence of observations.
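To see how the two assumptions work together on the ice cream task, the sketch below computes the joint probability of one weather sequence and one sequence of ice cream counts. The probabilities are illustrative assumptions and are not read off the figure; `joint_probability` is a hypothetical helper name.

```python
# Illustrative ice cream HMM; every probability here is an assumption.
initial = {"HOT": 0.8, "COLD": 0.2}
transition = {
    "HOT": {"HOT": 0.6, "COLD": 0.4},
    "COLD": {"HOT": 0.5, "COLD": 0.5},
}
# emission[state][n] = P(Jason eats n ice creams | weather is state)
emission = {
    "HOT": {1: 0.2, 2: 0.4, 3: 0.4},
    "COLD": {1: 0.5, 2: 0.4, 3: 0.1},
}


def joint_probability(observations, hidden_states):
    """P(observations, hidden_states) under the two first-order HMM assumptions."""
    prob = 1.0
    previous = None
    for obs, state in zip(observations, hidden_states):
        # Markov assumption: the state depends only on the previous state
        # (or on the initial distribution for the first day).
        step = initial[state] if previous is None else transition[previous][state]
        # Output independence: the observation depends only on the state that produced it.
        prob *= step * emission[state][obs]
        previous = state
    return prob


print(joint_probability([3, 1, 3], ["HOT", "HOT", "COLD"]))
```

Because of the two assumptions, the joint probability is a plain product with one transition term and one emission term per day.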
What are the three fundamental problems that characterize HMMs according to Rabiner (1989)?

Likelihood. Given an HMM and an observation sequence, determine the likelihood of the observation sequence given the HMM

Decoding. Given an observation sequence and an HMM, discover the best hidden state sequence

Learning. Given an observation sequence and the set of states in the HMM, learn the HMM's parameters, i.e. its transition and emission probabilities
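The first two problems have standard dynamic-programming solutions: the forward algorithm for likelihood and the Viterbi algorithm for decoding. Below is a minimal sketch of both, assuming a non-empty observation sequence and the same illustrative ice cream probabilities (invented values, not read from any figure). Learning is usually handled with the forward-backward (Baum-Welch) algorithm, which is longer and is not sketched here.

```python
# Illustrative ice cream HMM; every probability here is an assumption.
states = ["HOT", "COLD"]
initial = {"HOT": 0.8, "COLD": 0.2}
transition = {
    "HOT": {"HOT": 0.6, "COLD": 0.4},
    "COLD": {"HOT": 0.5, "COLD": 0.5},
}
emission = {
    "HOT": {1: 0.2, 2: 0.4, 3: 0.4},
    "COLD": {1: 0.5, 2: 0.4, 3: 0.1},
}


def forward(observations):
    """Likelihood: P(observations | HMM), summed over all hidden state paths."""
    alpha = {s: initial[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * transition[prev][s] for prev in states) * emission[s][obs]
            for s in states
        }
    return sum(alpha.values())


def viterbi(observations):
    """Decoding: the most probable hidden state path and its probability."""
    best = {s: initial[s] * emission[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        step, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            prev = max(states, key=lambda p: best[p] * transition[p][s])
            pointers[s] = prev
            step[s] = best[prev] * transition[prev][s] * emission[s][obs]
        best = step
        backpointers.append(pointers)
    # Follow the backpointers from the best final state to recover the path.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


print(forward([3, 1, 3]))
print(viterbi([3, 1, 3]))
```

The two functions share the same recursion; the only difference is that the forward algorithm sums over predecessor states while Viterbi takes the maximum and remembers which predecessor won.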