Second-order and higher-order theories
Let us continue our exploration of the main kinds of
formalisms for theories:
Second-order theories (or: theories of second-order logic)
are theories with one type X, called the base type,
and one or more others as follows, depending on the precise kind of
second-order logic :
- Full second-order logic admits several other types that
represent the sets of all structures (operations and relations)
in X with any arity: P(X), X X,
P(X×X), X X×X, P(X×X×X)...
- Monadic second-order logic only keeps in this list the
powerset of X (set of all unary relations on X).
More generally from several base types we can consider the sets
of all structures of any given kinds between them (numbers of
arguments of each type, type of results). So the general idea of
second-order theories is that they can express, for any types X,Y
of objects, formulas of the form «For every subset of X...»,
and thus also (equivalently) of the form «For every function from
X to Y...».
Second-order theories can be interpreted as first-order theories,
where each type K of structures receives its role by one
structure symbol of that kind with one additional argument of type
K, as explained in previous sections. Indeed, currying this
symbol gives a function from K to the intended set of all
such structures (the powerset or exponentiation). The obstacle
against the full formalization of second-order theories as
first-order theories, is that we cannot generally express the
bijectivity of this function by axioms: its injectivity can be
easily expressed in first-order logic in the form of the below
separation axiom, but its surjectivity can't.
Weak second-order theories, separation axioms
In a first-order theory, a type K will play the role of a set
of structures of some kind with arity n over some
other types (called base types), i.e. an arbitrary subset of the set
of all structures of that kind, if we have
The separation axiom means that its curried form with domain K
is injective, so that it gives K the role of a set of
operations or relations among base types. Then, the notation S(k,
x1,...,xn) may be replaced by (x1,...,xn)
∈ k if a relation, or k(x1,...,xn)
if an operation.
- One structure symbol S with 1 argument of type K,
and n other argument(s) (and values if it is an
operation) among base types.
- The following axiom (or theorem) of separation of K by
- If S is a relation :
- If S is an operation :
⇔ S(k', x1,..., xn))
⇒ k = k'
= S(k', x1,...,xn))
⇒ k = k'
Theories with such types (sets of structures) will be called weak
Translating second-order theories into first-order ones
only found well-defined in first-order logic. To get a concept
(rules, algorithm) of proof for a second-order theory, we need to
translate it into a first-order theory, where the rules of
first-order logic are applied. The problem is, first-order logic
cannot fully express (determine) the concept of which is the true
powerset of a given set.
But different ways to translate a second-order theory into
first-order ones, can express weaker or stronger requirements on the
interpretations of powersets, and thus eliminate more «wrong models»
from consideration. We shall list 3 of them, from the weakest (that
requires «few» subsets, and thus can prove «few» theorems) to the
strongest (that requires «many» subsets and thus can prove «many»
theorems); they will be applied to the case of arithmetic in the
The first-order expression by a schema of axioms (weakest
This only applies to theories involving the second-order type, not
in any structure, but only in axioms starting with a universal
quantifier on it, i.e. «For all sets...» (or «For all
operations...»). The idea is to interpret the variable structure as
«only» ranging over all structures definable with parameters, by an
expression in the first-order language that only uses the base
The method is to eliminate the second-order type in such an axiom,
by replacing this axiom by a schema of axioms, each with a different
defining expression to replace that variable.
The theory with stable power type (Henkin semantics)
Consider the case of a monadic second-order theory (for example;
other cases are similar). We shall express it as a first-order
theory with 2 types X, Y. In addition to the
separation axiom, the idea that Y = P(X)
will be expressed by a schema of axioms, called the comprehension
axioms, expressing that all classes of elements of X
(i.e. subsets of X defined by formulas with parameters in
this theory, as in sections 1.5 and 1.7), belong to Y.
They are written, for any formula F with at least one free
variable of type X:
∀(parameters), ∃A∈Y, ∀x∈X
, x ∈ A ⇔ F(x, parameters).
These axioms are essentially equivalent to those that defined the
set-builder we introduced for set theory (see 1.8, 1.11), except
that this symbol is not needed (it can be seen as an
In fact, all comprehension axioms without quantifiers on Y
but only using bound variables in X and parameters in both
X and Y, are satisfied by taking as Y the
set of all classes of elements of X defined with
parameters in X only. (What about those using another
structure on Y in their defining formula ?)
So, what makes this construction more powerful than the previous
one, apart from the ability to use variables of type Y in
structures and other axioms, is the requirement to also admit in Y
all subsets of X defined using quantifiers on variables of
type Y. This goes beyond any concept of «definable subsets
of X», and cannot even be qualified as a property of these
subsets, since its use of Y in defining formulas makes
this condition of definability circular.
Set theoretical interpretation (strongest method)
The expression of any theory inside set theory, initially
introduced in 1.3 for first-order theories, can be extended to
second-order theories, interpreting the notions of «all subsets»
and «all operations» by the powerset and exponentiation operators
of set theory. Then the rules of proof are those of set theory
once expressed as a first-order theory (see 1.9, 1.10, 1.11).
This framework is the one expressing the intended meaning of
second-order theories as deterministically involving the full set
of all structures of each given kind between given sets: the tools
of set theory ensure that this powerset will include all
structures that «can be found»; they interpret it in a model of
set theory which they implicitly regard as fixed (assuming that we
are «living in it»). However, since set theory is itself a
first-order theory, this formal appearance of uniquely determining
these powersets is not real but simulated, as follows: from any
bijection between two sets X and X' we can define
the image in X' of any subset of X, which obliges
P(X) and P(X') to conform to each
other. Then, the view of diverse models of second-order theories
as encompassed in a fixed model of set theory, obliges their
second-order types to conform to each other when their base types
are in bijection. This correspondence between all considered
models gives the appearance of determination of second-order types
from base types.
In this context, for a fixed X, any set Y with a
relation between X and Y, can be seen as a
relational system with language the set X of unary
relation symbols. In the category of such systems but only
accepting the embeddings as morphisms, the final objects are
those Y which are copies of the "true powerset" of X.
This method is the most powerful approach, as the available
variables and tools for the defining formulas that can find
subsets of X to bring into P(X), are
extended much beyond the mere sets X and P(X),
up to whatever range of other sets that this model of set theory
may contain. However this is still not any kind of specific «most
powerful formalization» because it is itself not fixed, as it
depends on the choice of an axiomatic system for set theory, while
there actually cannot be such a thing as «the most powerful
axiomatic system for set theory» (we can always find a stronger
Semantic completeness, logical incompleteness
A second-order theory will be called semantically complete
if, from a set theoretical viewpoint, all the models that «correctly
interpret the powerset» are isomorphic to each other.
According to the Löwenheim-Skolem theorem, first-order theories
describing infinite systems cannot be semantically complete
(first-order logic cannot prevent the coexistence of multiple
non-isomorphic models that «incorrectly interpret the powerset», and
thus would have different second-order properties if interpreted by
the external «true powersets»).
A first-order theory will be called logically complete if
all its closed formulas are logically decidable (provable or
refutable). This is naturally meaningful for first-order theories
where the Completeness
Theorem makes the concept of provability unambiguous, but not
for second-order theories where the rules of proof remain to be
specified. However it is still a posteriori meaningful there, in the
Some semantically complete second-order theories (geometry,
arithmetic with only addition and no multiplication) are already
logically complete by the weakest method, as all models of this
weakest first-order expression are elementarily equivalent (they
have the same first-order expressible properties) though they are
For other theories (full arithmetic, set theory), stronger methods
will contribute to the provability of more formulas. However then,
such a quest is usually endless: the logical incompleteness of these
theories is fundamental in the sense that no fixed algorithm can
ever suffice to decide all their formulas either (may this algorithm
anyhow come from the rules of first-order logic, e.g. by a
translation into some set theory, or not).
Higher-order theories generalize second-order theories, by
having a list of types that can be enumerated starting from a
given list of base types, and successively making other types by
«second-order constructions» from previous types, i.e.
constructions that may include not only products (as allowed by
the constructions of
first-order logic), but also powerset («Y is the set
of all subsets of X») and exponentiation («Z is the
set of all functions from X to Y»). Saying that
they are successive constructions, means that there is an order
between types such that each type of objects said to be made of
«all the..» comes after those of which it is the powerset
Higher-order theories can be translated into second-order theories
as follows. Take as the list of base types, all types that are
used in any construction of powerset or exponentiation. Then in
the second-order theory over this list of base types, a good list
of structures and first-order axioms will express all intended
relations between base types. Namely, the relation Y = P(X)
will be formalized by a structure of bijection between Y
This is the name of the first-order theory we described as "with
stable power type" to describe arithmetic. However the list of
basic structures needs to be specified.
The following 3 presentations of second-order arithmetic as
first-order theories (with axioms lists implicitly determined by
the previous explanations) are equivalent (definable from each
- With types ℕ and P(ℕ×ℕ) and with structures 0 and S
- With types ℕ and ℕℕ and with structures 0 and S
- With types ℕ and P(ℕ) and with structures 0, + and ⋅ .
We get ℕℕ as the set of functional graphs (subset of P(ℕ×ℕ))
Using ℕℕ we can define sequences of integers by recursion
; addition and multiplications come as particular cases.
Using addition and multiplication we can define a bijection between
ℕ and ℕ×ℕ by (x,y) ↦ x + (x + y)⋅(x
+ y+1)/2, so that P(ℕ) can be used in the role of P(ℕ×ℕ).
Second-order arithmetic as a possible foundation for mathematics
Most ordinary mathematics and all physics can be expressed in the
theory of second order arithmetic expressed above.
The canonical bijection P(ℕ×ℕ)≅ P(ℕ)ℕ,
allows to define recursive sequences in P(ℕ).
But other encodings can be much more cumbersome (unnatural), so that
this theoretical possibility to encode all ordinary mathematics into
second-order arithmetic, while possible, is rather far-fetched and
not practical as a way to study mathematics.
For example, arbitrary continuous function f from ℝ to ℝ
can be represented in P(ℕ×ℕ×ℕ×ℕ), and thus in P(ℕ),
by the set of 4-tuples of strictly positive natural numbers a,b,c,d
such that f(a/b − b/a) > c/d
Or, as another example of possible encoding, such that f(a
> c − d√2.
Another inconvenient is that, in mathematics, many objects studied
by a diversity of useful theories, do no aim to be anything else
than pure elements. Choices to build mathematics on a theory where
all objects are either a natural number or a set of natural numbers,
while being "right" in the fact that every set of interest has an
injection into P(ℕ), obscures this purpose.
The theory of first-order arithmetic is the one obtained
from the first-order expression by a schema of axioms, interpreting
the induction axiom as the schema of axioms ranging over the only
subsets of ℕ defined by first-order formulas (with variables and
parameters in ℕ).
But these are different theories depending on the accepted language,
by lack of direct means to define operations by recursion in this
The one with the mere symbols 0 and S is very poor.
The one including addition (Presburger arithmetic) starts to be
interesting, but is not yet "the full arithmetic" as it is unable to
The one containing addition and multiplication is very rich as,
through some difficult constructions, it is able to express
recursion, and thus any recursively defined operations such as
exponentiation of integers and factorials, as proven by Gödel at the
basis of his proof of the famous incompleteness theorem. However the
involved there and the proof of their suitability, are very
difficult, out of the scope of the present text.
We are going to show their fundamental difference by comparing the
kinds of models they admit, in the next section.
Previous : 3.8. How
mathematical theories develop
Next : Non-standard
models of Arithmetic
Back to homepage : Set Theory and
Foundations of Mathematics