Formal language theory defines a language as a set of strings, each of which is a sequence of elements from a finite alphabet. The English language is a set of strings, each of which is derived from a finite of 26 alphabets. Formal language theory defines classes of languages and their computational properties. We are very interested in the complexity of solving the membership problem, which is the problem of determining whether a string is in a language. We will be exploring three classes of formal languages:

  1. Regular languages
  2. Context-free languages
  3. Mildly context-sensitive languages

Formal language theory can be useful in designing formal languages that capture as many properties of the natural language as possible. The membership problem can be generalised to the problems of scoring strings for their acceptability (language modelling) and of transducing one string into another (translation).



Data Scientist

Leave a Reply