A regular expression is used to define a regular language but how do we determine whether a string is in the language that it defines? This is where finite state automata comes in.
What is finite-state automata?
Finite-state automata are theoretical models of computation on regular languages that involves transitions between a finite number of states.
Finite-State Acceptor (FSA)
FSA is the most basic type of finite-state automata. It describes the computation involved in assessing whether a string is a member of a language. A FSA is a tuple M that consists of 5 elements:
- Finite alphabet of input symbols
- Finite set of states
- Starting state
- Set of final states
- Transition function – maps from a state and an input symbol (or empty string) to a set of possible resulting state
A path in M is define as a sequence of transitions, where each traverses is an arc in the transition function. The FSA will only accept a string if there is an accepting path that starts from the start state and terminates in one of the states in the final states and the entire input is consumed (no more input symbols).
This FSA is deterministic because each pair of initial state has only one resulting state. The transition function indicates that if you are in state q0 and receive symbol “a”, then it will stay in state q0. It also shows that if you are in state q1 and you receive symbol “a”, that would lead to no transition in which case the FSA is stuck and the input string would be rejected. The regular expression corresponding to the language defined by FSA is a*bb*.
How fast can we determine if a string belongs to the language?
For deterministic FSAs
O(V log V + E), where V is no. vertices and E is no. edges. We can perform the computation using the Dijkstra’s algorithm
For non-deterministic FSAs (NFSAs)
Can include multiple transitions given a symbol and state but any NFSAs can be converted into a deterministic FSA. However, the resulting automaton may have a lot more states than the original!