Assessment |
Biopsychology |
Comparative |
Cognitive |
Developmental |
Language |
Individual differences |
Personality |
Philosophy |
Social |

Methods |
Statistics |
Clinical |
Educational |
Industrial |
Professional items |
World psychology |

**Language:**
Linguistics ·
Semiotics ·
Speech

In mathematics, logic, and computer science, a **formal language** **L** is a set of finite-length sequences of elements drawn from a specified finite set **A** of symbols. Among the more common options that are found in applications, a formal language may be viewed as being analogous to (1) a collection of words or (2) a collection of sentences. In Case 1, the set **A** is called the *alphabet* of **L**, whose elements are called *words*. In Case 2, the set **A** is called the *lexicon* or the *vocabulary* of **L**, whose elements are then called *sentences*. In any case, the mathematical theory that treats formal languages in general is known as *formal language theory*.

Although it is common to hear the term *formal language* used in other contexts to refer to a mode of expression that is more disciplined or more precise than everyday speech, the sense of *formal language* discussed in this article is restricted to its meaning in formal language theory.

An alphabet might be $ \left \{ a , b \right \} $, and a string over that alphabet might be $ ababba $.

A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols $ a $ and $ b $.

The **empty word** (that is, length-zero string) is allowed and is often denoted by $ e $, $ \epsilon $ or $ \Lambda $. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of words in it may be unbounded).

A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to a particular language?" This is the domain of computability theory and complexity theory.

## ExamplesEdit

Some examples of formal languages:

- the set of all words over $ {a, b} $
- the set $ \left \{ a^{n}\right\} $, n is a natural number and $ a^{n} $ means $ a $ repeated $ n $ times
- Finite languages, such as $ {a, aa, bba} $ -
- the set of syntactically correct programs in a given programming language; or
- the set of inputs upon which a certain Turing machine halts.

## SpecificationEdit

A formal language can be specified in a great variety of ways, such as:

- Strings produced by some formal grammar (see Chomsky hierarchy);
- Strings produced by a regular expression;
- Strings accepted by some automaton, such as a Turing machine or finite state automaton;
- From a set of related YES/NO questions, those questions for which the answer is YES — see decision problem.

## OperationsEdit

Several operations can be used to produce new languages from given ones. Suppose $ L_{1} $ and $ L_{2} $ are languages over some common alphabet.

- The
*concatenation*$ L_{1}L_{2} $ consists of all strings of the form $ vw $ where $ v $ is a string from $ L_{1} $ and $ w $ is a string from $ L_{2} $. - The
*intersection*$ L_1 \cap L_2 $ of $ L_{1} $ and $ L_{2} $ consists of all strings which are contained in $ L_1 $ and also in $ L_{2} $. - The
*union*$ L_1 \cup L_2 $ of $ L_{1} $ and $ L_{2} $ consists of all strings which are contained in $ L_{1} $ or in $ L_{2} $. - The
*complement*of the language $ L_{1} $ consists of all strings over the alphabet which are not contained in $ L_{1} $. - The
*right quotient*$ L_{1}/L_{2} $ of $ L_{1} $ by $ L_{2} $ consists of all strings $ v $ for which there exists a string $ w $ in $ L_{2} $ such that $ vw $ is in $ L_{1} $. - The
*Kleene star*$ L_{1}^{*} $ consists of all strings which can be written in the form $ w_{1}w_{2}...w_{n} $ with strings $ w_{i} $ in $ L_{1} $ and $ n \ge 0 $. Note that this includes the empty string $ \epsilon $ because $ n = 0 $ is allowed. - The
*reverse*$ L_{1}^{R} $ contains the reversed versions of all the strings in $ L_{1} $. - The
*shuffle*of $ L_{1} $ and $ L_{2} $ consists of all strings which can be written in the form $ v_{1}w_{1}v_{2}w_{2}...v_{n}w_{n} $ where $ n \ge 1 $ and $ v_{1},...,v_{n} $ are strings such that the concatenation $ v_{1}...v_{n} $ is in $ L_{1} $ and $ w_{1},...,w_{n} $ are strings such that $ w_{1}...w_{n} $ is in $ L_{2} $.

## See alsoEdit

- Language for languages in general
- Syntax for the form of a language in general
- Semantics for the meanings in a language
- Natural language for languages that are not formal
- Computer language for application of formal languages in computing
- Programming language for the application of formal languages to program computers

## References & BibliographyEdit

- Hopcroft, J. & Ullman, J (1979).
*Introduction to Automata Theory, Languages, and Computation.*Addison Wesley. ISBN 0-201-02988-X

- G. Rozenberg, A. Salomaa eds. Handbook of Formal Languages. Springer-Verlag. 3 vol.

- http://icalp06.dsi.unive.it/ ICALP 2006 33rd International Colloquium on

Automata, Languages and Programming.

- http://www.cs.auckland.ac.nz/CDMTCS/conferences/dlt/DLTConfSeries.html International Conferences on Developments in Language Theory

Automata theory: formal languages and formal grammars
| |||
---|---|---|---|

Chomsky
hierarchy | Grammars
| Languages | Minimal
automaton |

Type-0 | Unrestricted | Recursively enumerable | Turing machine |

n/a | (no common name) | Recursive | Decider |

Type-1 | Context-sensitive | Context-sensitive | Linear-bounded |

Type-2 | Context-free | Context-free | Pushdown |

Type-3 | Regular | Regular | Finite |

Each category of languages or grammars is a proper superset of the category directly beneath it. |

## External linksEdit

This page uses Creative Commons Licensed content from Wikipedia (view authors). |