First-order logic

First-order logic (FOL) is a language in symbolic science, which is used by mathematicians, philosophers, linguists, and computer scientists. It goes by many names, including: first-order predicate calculus (FOPC), the lower predicate calculus, the language of first-order logic or predicate logic. The most used name is however FOL, pronounced ef-oh-el. FOL is a system of deduction extending propositional logic by the ability to express relations between individuals (e.g. people, numbers, and "things") more generally.

Propositional logic is not adequate for formalizing valid arguments that rely on the internal structure of the propositions involved. For example, a translation of the valid argument


 * All men are mortal
 * Socrates is a man
 * Therefore, Socrates is mortal

into propositional logic yields


 * A
 * B
 * &there4; C

which is invalid (&there4; means "therefore"). The rest of the article explains how FOL is able to handle these sorts of arguments.

The atomic sentences of first-order logic have the form P(t1, ..., tn) (a predicate with one or more "arguments") rather than being propositional letters as in propositional logic. This is usually written without parentheses or commas, as below.

The new ingredient of first-order logic not found in propositional logic is quantification: where φ is any (well-formed) formula, the new constructions ∀x φ and ∃x φ — read "for all x, φ" and "for some x, φ" — are introduced, where x is an individual variable whose range is the set of individuals of some given universe of discourse (or domain). For example, if the universe consists solely of people, then x ranges over people. For convenience, we write φ as φ(x) to show that it contains only the variable x free and, for b a member of the universe, we let φ[b] express that b satisfies (i.e. has the property expressed by) &phi;. Then ∀x φ(x) states that φ[b] is true for every b in the universe, and ∃x φ(x) means that there is a b (in the universe) such that φ[b] holds.

The argument about Socrates can be formalized in first-order logic as follows. Let the universe of discourse be the set of all people, living and deceased, and let Man(x) be a predicate (which, informally, means that the person represented by variable x is a man) and Mortal(x) be a second predicate. Then the argument above becomes
 * ∀ x (Man(x) → Mortal(x))
 * Man(Socrates)
 * &there4; Mortal(Socrates)

A literal translation of the first line would be "For all x, if x is described by 'Man', x must also be described by 'Mortal'." The second line states that the predicate "Man" applies to Socrates, and the third line translates to "Therefore, the description 'Mortal' applies to Socrates."

Note that in first-order logic, Man(x) is not a function, as functions convert inputs to outputs through a specific process. Man(x) simply means that x is described by "Man", or that x is (a) "Man", or that x falls under the category of "Man".

First-order logic has sufficient expressive power for the formalization of virtually all of mathematics. A first-order theory consists of a set of axioms (usually finite or recursively enumerable) and the statements deducible from them. The usual set theory ZFC is an example of a first-order theory, and it is generally accepted that all of classical mathematics can be formalized in ZFC. There are other theories that are commonly formalized independently in first-order logic (though they do admit implementation in set theory) such as Peano arithmetic.

Defining first-order logic
A predicate calculus consists of
 * formation rules (i.e. recursive definitions for forming well-formed formulas).
 * transformation rules (i.e. inference rules for deriving theorems).
 * axioms or axiom schemata (possibly countably infinite).

The axioms considered here are logical axioms which are part of classical FOL. Further, non-logical axioms are added to yield specific first-order theories that are based on the axioms of classical FOL (and hence are called classical first-order theories, such as classical set-theory). The axioms of first-order theories are not regarded as truths of logic per se, but rather as truths of the particular theory that usually has associated with it an intended interpretation of its non-logical symbols. (See an analogous idea at logical versus non-logical symbols.) Classical FOL does not have associated with it an intended interpretation of its non-logical vocabulary (except arguably a symbol denoting identity, depending on whether one regards such a symbol as logical).

When the set of axioms is infinite, it is required that there be an algorithm which can decide for a given well-formed formula whether or not it is an axiom. There should also be an algorithm which can decide whether a given application of an inference rule is correct or not.

It is important to note that FOL can be formalized in many equivalent ways; there is nothing canonical about the axioms and rules of inference given here. There are infinitely many equivalent formalizations all of which yield the same theorems and non-theorems, and all of which have equal right to the title 'FOL'.

Vocabulary
The "vocabulary" is composed of
 * 1) A set of predicate variables (or relations) each with some valence (or arity) ≥1, which are often denoted by uppercase letters P, Q, R,....
 * 2) A set of constants, often denoted by lowercase letters at the beginning of the alphabet a, b, c,....
 * 3) A set of functions, each of some valence ≥ 1, which are often denoted by lowercase letters f, g, h,....
 * 4) An infinite set of variables, often denoted by lowercase letters at the end of the alphabet x, y, z,....
 * 5) Symbols denoting logical operators (or connectives): $$\neg$$ (logical not), $$\wedge$$ (logical and), $$\vee$$ (logical or), → (logical conditional), ↔ (logical biconditional).
 * 6) Symbols denoting quantifiers: $$\forall$$ (universal quantification), $$\exists$$ (existential quantification).
 * 7) Left and right parenthesis.
 * 8) An identity or equality symbol = is sometimes but not always included in the vocabulary.

There are several minor variations listed below:
 * The set of primitive symbols (operators and quantifiers) often varies. Some symbols may be omitted as primitive and taken as abbreviations instead; e.g. (P ↔ Q) is an abbreviation for (P → Q) $$\wedge$$ (Q → P). On the other hand, it is possible to include other operators as primitive, such as the truth constants ⊤ for "true" and ⊥ for "false" (these are operators of valence 0), or the Sheffer stroke (P | Q). The minimum number of primitive symbols needed is one, but if we restrict ourselves to the operators listed above, we need three; for example, ¬, ∧, and ∀ suffice.
 * Some older books and papers use the notation φ ⊃ ψ for φ → ψ, ~φ for ¬φ, φ & ψ for φ ∧ ψ, and a wealth of notations for quantifiers; e.g., ∀x φ may be written as (x)φ.
 * Equality is sometimes considered to be a part of first order logic; if it is, then the equality symbol is included in the vocabulary, and behaves syntactically as a binary predicate. This case is sometimes called first order logic with equality.
 * Constants are really the same as functions of valence 0, so it would be possible and convenient to omit constants and allow functions to have any valence. But it is traditional to use the term "function" only for functions of valence at least 1.
 * In the definition above relations must have valence at least 1. It is possible to allow relations of valence 0; these could be considered as propositional variables.
 * There are many different conventions about where to put parentheses; for example, one might write ∀x or (∀x). Sometimes one uses colons or full stops instead of parentheses to make formulas unambiguous. One interesting but rather unusual convention is "Polish notation", where one omits all parentheses, and writes ∧, ∨, and so on in front of their arguments rather than between them. Polish notation is compact and elegant, but rare because it is hard for humans to read it.
 * A technical observation (of which one can make what one will, philosophically) is that if there is a function symbol of arity 2 representing an ordered pair (or predicate symbols of arity 2 representing the projection relations of an ordered pair) then one can dispense entirely with functions or predicates of arity > 2. Of course the pair or projections need to satisfy the natural axioms.

The sets of constants, functions, and relations are usually considered to form a language, while the variables, logical operators, and quantifiers are usually considered to belong to the logic. For example, the language of group theory consists of one constant (the identity element), one function of valence 1 (the inverse) one function of valence 2 (the product), and one relation of valence 2 (equality), which would be omitted by authors who include equality in the underlying logic.

Formation rules
The formation rules define the terms, formulas and free variables in them as follows.

Terms: The set of terms is recursively defined by the following rules:
 * 1) Any constant is a term.
 * 2) Any variable is a term.
 * 3) Any expression f(t1,...,tn) of n ≥ 1 arguments (where each argument ti is a term and f is a function symbol of valence n) is a term.
 * 4) Closure clause: Nothing else is a term.

Well-formed Formulas: The set of well-formed formulas (usually called wffs or just formulas) is recursively defined by the following rules:
 * 1) Simple and complex predicates If P is a relation of valence n ≥ 1 and the ai are terms  then P(a1,...,an) is well-formed. If equality is considered part of logic, then (a1 = a2) is well formed. All such formulas are said to be atomic.
 * 2) Inductive Clause I: If φ is a wff, then ¬φ is a wff.
 * 3) Inductive Clause II: If φ and ψ are wffs, then (φ → ψ) is a wff.
 * 4) Inductive Clause III: If φ is a wff and x is a variable, then ∀x  φ is a wff.
 * 5) Closure Clause: Nothing else is a wff.

Free Variables:
 * 1) Atomic formulas If φ is an Atomic formula then x is free in φ if and only if x occurs in φ.
 * 2) Inductive Clause I: x is free in ¬φ if and only if x is free in φ.
 * 3) Inductive Clause II: x is free in (φ → ψ) if and only if x is free in φ or x is free in ψ.
 * 4) Inductive Clause III: x is free in ∀y  φ if and only if x is free in φ and xǂy.
 * 5) Closure Clause: if x is not free in φ then it is bound.

Since ¬(φ → ¬ψ) is logically equivalent to (φ ∧ ψ), (φ ∧ ψ) is often used as a short hand. The same principle is behind (φ ∨ ψ) and (φ ↔ ψ). Also ∃x φ is a short hand for ¬∀y ¬φ. In practice, if P is a relation of valence 2, we often write "a P b" instead of "P a b"; for example, we write 1 &lt; 2 instead of &lt;(1 2). Similarly if f is a function of valence 2, we sometimes write "a f b" instead of "f(a b)"; for example, we write 1 + 2 instead of +(1 2). It is also common to omit some parentheses if this does not lead to ambiguity.

Sometimes it is useful to say that "P(x) holds for exactly one x", which can be expressed as ∃!x P(x). This can also be expressed as ∃x (P(x) ∧ ∀y (P(y) → (x = y))).

Examples: The language of ordered abelian groups has one constant 0, one unary function &minus;, one binary function +, and one binary relation ≤. So
 * 0, x, y are atomic terms
 * +(x, y), +(x, +(y, &minus;(z))) are terms, usually written as x + y, x + y &minus; z
 * =(+(x, y), 0), ≤(+(x, +(y, &minus;(z))), +(x, y)) are atomic formulas, usually written as x + y = 0, x + y - z ≤ x + y,
 * (∀x ∃y ≤( +(x, y), z)) ∧ (∃x =(+(x, y), 0)) is a formula, usually written as (∀x ∃y x + y ≤ z) ∧ (∃x x + y = 0).

Substitution
If t is a term and φ(x) is a formula possibly containing x as a free variable, then v φ(t) is defined to be the result of replacing all free instances of x by t, provided that no free variable of t becomes bound in this process. If some free variable of t becomes bound, then to substitute t for x it is first necessary to change the names of bound variables of φ to something other than the free variables of t. To see why this condition is necessary, consider the formula φ(x) given by ∀y y ≤ x ("x is maximal"). If t is a term without y as a free variable, then φ(t) just means t is maximal. However if t is y the formula φ(y) is ∀y y ≤ y which does not say that y is maximal. The problem is that the free variable y of t (=y) became bound when we substituted y for x in φ(x). So to form φ(y) we must first change the bound variable y of φ to something else, say z, so that φ(y) is then ∀z z ≤ y. Forgetting this condition is a notorious cause of errors.

Equality
There are several different conventions for using equality (or identity) in first order logic. This section summarizes the main ones. The various conventions all give essentially the same results with about the same amount of work, and differ mainly in terminology.


 * The most common convention for equality is to include the equality symbol as a primitive logical symbol, and add the axioms for equality to the axioms for first order logic. The equality axioms are
 * x = x
 * x = y → f(...,x,...) = f(...,y,...) for any function f
 * x = y → (P(...,x,...) → P(...,y,...)) for any relation P (including = itself)


 * The next most common convention is to include the equality symbol as one of the relations of a theory, and add the equality axioms to the axioms of the theory. In practice this is almost indistinguishable from the previous convention, except in the unusual case of theories with no notion of equality. The axioms are the same, and the only difference is whether one calls some of them logical axioms or axioms of the theory.
 * In theories with no functions and a finite number of relations, it is possible to define equality in terms of the relations, by defining the two terms s and t to be equal if any relation is unchanged by changing s to t in any argument. For example, in set theory with one relation ∈, we would define s = t to be an abbreviation for ∀x (s ∈ x ↔ t ∈ x) ∧ ∀x (x ∈ s ↔ x ∈ t). This definition of equality then automatically satisfies the axioms for equality.
 * In some theories it is possible to give ad hoc definitions of equality. For example, in a theory of partial orders with one relation ≤ we could define s = t to be an abbreviation for s ≤ t ∧ t ≤ s.

Inference rules
The inference rule modus ponens is the only one required from propositional logic for the formalization given here. It states that if φ and φ → ψ are both proved, then one can deduce ψ.

The inference rule called Universal Generalization is characteristic of the predicate calculus. It can be stated as
 * if $$\vdash \varphi$$, then $$\vdash \forall x \, \varphi$$

where φ is supposed to stand for an already-proven theorem of predicate calculus.

Notice that Generalization is analogous to the Necessitation Rule of modal logic, which is
 * if $$\vdash P$$, then $$\vdash \Box P$$.

Quantifier axioms
The following four logical axioms characterize a predicate calculus: These are actually axiom schemata: the expression W stands for any wff in which x is not free, and the expression Z(x) stands for any wff with the additional convention that Z(t) stands for the result of substitution of the term t for x in Z(x).
 * PRED-1: (∀x Z(x)) → Z(t)
 * PRED-2: Z(t) → (∃x Z(x))
 * PRED-3: (∀x (W → Z(x))) → (W → ∀x Z(x))
 * PRED-4: (∀x (Z(x) → W)) → (∃x Z(x) → W)

Predicate calculus
The predicate calculus is an extension of the propositional calculus that defines which statements of first order logic are provable. It is a formal system used to describe mathematical theories. If the propositional calculus is defined with a suitable set of axioms and the single rule of inference modus ponens (this can be done in many different ways), then the predicate calculus can be defined by appending some additional axioms and the additional inference rule "universal generalization". More precisely, as axioms for the predicate calculus we take: A sentence is defined to be provable in first order logic if it can be obtained by starting with the axioms of the predicate calculus and repeatedly applying the inference rules "modus ponens" and "universal generalization".
 * All tautologies from the propositional calculus (with the proposition variables replaced by formulas).
 * The axioms for quantifiers, given above.
 * The axioms for equality given above, if equality is regarded as a logical concept.

If we have a theory T (a set of statements, called axioms, in some language) then a sentence φ is defined to be provable in the theory T  if a ∧ b ∧ ... → φ is provable in first order logic, for some finite set of axioms a, b,... of the theory T.

One apparent problem with this definition of provability is that it seems rather ad hoc: we have taken some apparently random collection of axioms and rules of inference, and it is unclear that we have not accidentally missed out some vital axiom or rule. Gödel's completeness theorem assures us that this is not really a problem: the theorem states that any statement true in all models is provable in first order logic. In particular, any reasonable definition of "provable" in first order logic must be equivalent to the one above (though it is possible for the lengths of proofs to differ vastly for different definitions of provability).

There are many different (but equivalent) ways to define provability. The definition above is a typical example of a "Hilbert style" calculus, which has a lot of different axioms but very few rules of inference. The "Gentzen style" predicate calculus differs in that it has very few axioms but many rules of inference.

Identities

 * $$\lnot \forall x \, P(x) \Leftrightarrow \exists x \, \lnot P(x)$$
 * $$\lnot \exists x \, P(x) \Leftrightarrow \forall x \, \lnot P(x)$$
 * $$\forall x \, \forall y \, P(x,y) \Leftrightarrow \forall y \, \forall x \, P(x,y)$$
 * $$\exists x \, \exists y \, P(x,y) \Leftrightarrow \exists y \, \exists x \, P(x,y)$$
 * $$\forall x \, P(x) \land \forall x \, Q(x) \Leftrightarrow \forall x \, (P(x) \land Q(x)) $$
 * $$\exists x \, P(x) \lor \exists x \, Q(x) \Leftrightarrow \exists x \, (P(x) \lor Q(x)) $$
 * $$P \land \exists x \, Q(x) \Leftrightarrow \exists x \, (P \land Q(x)) $$ (where the variable $$x$$ must not occur free in $$P$$)
 * $$P \lor \forall x \, Q(x) \Leftrightarrow \forall x \, (P \lor Q(x)) $$ (where the variable $$x$$ must not occur free in $$P$$)

Inference rules

 * $$\exists x \, \forall y \, P(x,y) \Rightarrow \forall y \, \exists x \, P(x,y)$$
 * $$\forall x \, P(x) \lor \forall x \, Q(x) \Rightarrow \forall x \, (P(x) \lor Q(x)) $$
 * $$\exists x \, (P(x) \land Q(x)) \Rightarrow \exists x \, P(x) \land \exists x \, Q(x)$$
 * $$\exists x \, P(x) \land \forall x \, Q(x) \Rightarrow \exists x \, (P(x) \land Q(x)) $$
 * $$\forall x \, P(x) \Rightarrow P(c)$$ (If c is a variable, then it must not already be quantified somewhere in P(x))
 * $$P(c) \Rightarrow \exists x \, P(x)$$ (x must not appear free in P(c))

Metalogical theorems of first-order logic
Some important metalogical theorems are listed below in bulleted form.
 * 1) Unlike the propositional logic, first-order logic is undecidable, provided that the language has at least one predicate of valence at least 2 that is not equality. There is no decision procedure that correctly determines whether an arbitrary formula is valid (this is related to the unsolvability of the Halting problem). This result was established independently by Church and Turing.
 * 2) The decision problem for validity is semidecidable; in other words, there is a Turing machine that when given any sentence as input, will halt if and only if the sentence is valid (true in all models).
 * 3) * As Gödel's completeness theorem shows, for any valid formula P, P is provable. Conversely, assuming consistency of the logic, any provable formula is valid.
 * 4) * For a finite or semienumerable set of axioms, the set of provable formulas can be explicitly enumerated by a Turing machine, thus the result.
 * 5) Monadic predicate logic (i.e., predicate logic with only predicates of one argument) is decidable.
 * 6) The Bernays–Schönfinkel class of first order formulas is also decidable.

Comparison with other logics

 * Typed first order logic allows variables and terms to have various types (or sorts). If there are only a finite number of types, this does not really differ much from first order logic, because one can describe the types with a finite number of unary predicates and a few axioms. Sometimes there is a special type Ω of truth values, in which case formulas are just terms of type Ω.
 * Weak second-order logic allows quantification over finite subsets.
 * Monadic second-order logic allows quantification over subsets, or in other words over unary predicates.
 * Second-order logic allows quantification over subsets and relations, or in other words over all predicates. For example, the axiom of extensionality can be stated in second-order logic as x = y ≡def ∀P (P(x) ↔ P(y)). The strong semantics of second order logic give such sentences a much stronger meaning than first-order semantics.
 * Higher order logics allows quantification over higher types than second-order logic permits. These higher types include relations between relations, functions from relations to relations between relations, etc.
 * Intuitionistic first order logic uses intuitionistic rather than classical propositional calculus; for example, ¬¬φ need not be equivalent to φ.
 * Modal logic has extra modal operators with meanings which can be characterised informally as, for example "it is necessary that φ" and "it is possible that φ".
 * In Monadic predicate calculus predicates are restricted to having only one argument.
 * Infinitary logic allows infinitely long sentences. For example, one may allow a conjunction or disjunction of infinitely many formulas, or quantification over infinitely many variables. Infinitely long sentences arise in areas of mathematics including topology and model theory.
 * First order logic with extra quantifiers has new quantifiers Qx,..., with meanings such as "there are many x such that ...". Also see branched quantification and the plural quantifiers of George Boolos and others.
 * Independence-friendly logic is characterized by branching quantifiers, which allow one to express independence between quantified variables.

Most of these logics are in some sense extensions of first order logic: they include all the quantifiers and logical operators of first order logic with the same meanings. Lindström showed first order logic has no extensions (other than itself) that satisfy both the compactness theorem and the downward Löwenheim-Skolem theorem. A precise statement of Lindström's theorem requires listing several pages of technical conditions that the logic is assumed to satisfy; for example, changing the symbols of a language should make no essential difference to which sentences are true.

First order logic in which no atomic sentence lies in the scope of more than three quantifiers, has the same expressive power as the relation algebra of Tarski and Givant (1987). They also show that FOL with a primitive ordered pair, and a relation algebra including the two ordered pair projection relations, are equivalent.