Second-order and higher-order theories

Let us continue our exploration of the main kinds of formalisms for theories:

Second-order theories (or: theories of second-order logic) are theories with one type X, called the base type, and one or more others as follows, depending on the precise kind of second-order logic :

More generally from several base types we can consider the sets of all structures of any given kinds between them (numbers of arguments of each type, type of results). So the general idea of second-order theories is that they can express, for any types X,Y of objects, formulas of the form «For every subset of X...», and thus also (equivalently) of the form «For every function from X to Y...».

Second-order theories can be interpreted as first-order theories, where each type K of structures receives its role by one structure symbol of that kind with one additional argument of type K, as explained in previous sections. Indeed, currying this symbol gives a function from K to the intended set of all such structures (the powerset or exponentiation). The obstacle against the full formalization of second-order theories as first-order theories, is that we cannot generally express the bijectivity of this function by axioms: its injectivity can be easily expressed in first-order logic in the form of the below separation axiom, but its surjectivity can't.

Weak second-order theories, separation axioms

In a first-order theory, a type K will play the role of a set of structures of some kind with arity n over some other types (called base types), i.e. an arbitrary subset of the set of all structures of that kind, if we have The separation axiom means that its curried form with domain K is injective, so that it gives K the role of a set of operations or relations among base types. Then, the notation S(k, x1,...,xn) may be replaced by (x1,...,xn) ∈ k if a relation, or k(x1,...,xn) if an operation.
Theories with such types (sets of structures) will be called weak second-order theories.

Translating second-order theories into first-order ones

Provability was only found well-defined in first-order logic. To get a concept (rules, algorithm) of proof for a second-order theory, we need to translate it into a first-order theory, where the rules of first-order logic are applied. The problem is, first-order logic cannot fully express (determine) the concept of which is the true powerset of a given set.
But different ways to translate a second-order theory into first-order ones, can express weaker or stronger requirements on the interpretations of powersets, and thus eliminate more «wrong models» from consideration. We shall list 3 of them, from the weakest (that requires «few» subsets, and thus can prove «few» theorems) to the strongest (that requires «many» subsets and thus can prove «many» theorems); they will be applied to the case of arithmetic in the next section.

The first-order expression by a schema of axioms (weakest method)

This only applies to theories involving the second-order type, not in any structure, but only in axioms starting with a universal quantifier on it, i.e. «For all sets...» (or «For all operations...»). The idea is to interpret the variable structure as «only» ranging over all structures definable with parameters, by an expression in the first-order language that only uses the base type(s).
The method is to eliminate the second-order type in such an axiom, by replacing this axiom by a schema of axioms, each with a different defining expression to replace that variable.

The theory with stable power type (Henkin semantics)

Consider the case of a monadic second-order theory (for example; other cases are similar). We shall express it as a first-order theory with 2 types X, Y. In addition to the separation axiom, the idea that Y = P(X) will be expressed by a schema of axioms, called the comprehension axioms, expressing that all classes of elements of X (i.e. subsets of X defined by formulas with parameters in this theory, as in sections 1.5 and 1.7), belong to Y. They are written, for any formula F with at least one free variable of type X:

∀(parameters), ∃AY, ∀xX , xAF(x, parameters).

These axioms are essentially equivalent to those that defined the set-builder we introduced for set theory (see 1.8, 1.11), except that this symbol is not needed (it can be seen as an abbreviation).

In fact, all comprehension axioms without quantifiers on Y but only using bound variables in X and parameters in both X and Y, are satisfied by taking as Y the set of all classes of elements of X defined with parameters in X only. (What about those using another structure on Y in their defining formula ?)
So, what makes this construction more powerful than the previous one, apart from the ability to use variables of type Y in structures and other axioms, is the requirement to also admit in Y all subsets of X defined using quantifiers on variables of type Y. This goes beyond any concept of «definable subsets of X», and cannot even be qualified as a property of these subsets, since its use of Y in defining formulas makes this condition of definability circular.

Set theoretical interpretation (strongest method)

The expression of any theory inside set theory, initially introduced in 1.3 for first-order theories, can be extended to second-order theories, interpreting the notions of «all subsets» and «all operations» by the powerset and exponentiation operators of set theory. Then the rules of proof are those of set theory once expressed as a first-order theory (see 1.9, 1.10, 1.11).
This framework is the one expressing the intended meaning of second-order theories as deterministically involving the full set of all structures of each given kind between given sets: the tools of set theory ensure that this powerset will include all structures that «can be found»; they interpret it in a model of set theory which they implicitly regard as fixed (assuming that we are «living in it»). However, since set theory is itself a first-order theory, this formal appearance of uniquely determining these powersets is not real but simulated, as follows: from any bijection between two sets X and X' we can define the image in X' of any subset of X, which obliges P(X) and P(X') to conform to each other. Then, the view of diverse models of second-order theories as encompassed in a fixed model of set theory, obliges their second-order types to conform to each other when their base types are in bijection. This correspondence between all considered models gives the appearance of determination of second-order types from base types.
In this context, for a fixed X, any set Y with a relation between X and Y, can be seen as a relational system with language the set X of unary relation symbols. In the category of such systems but only accepting the embeddings as morphisms, the final objects are those Y which are copies of the "true powerset" of X.

This method is the most powerful approach, as the available variables and tools for the defining formulas that can find subsets of X to bring into P(X), are extended much beyond the mere sets X and P(X), up to whatever range of other sets that this model of set theory may contain. However this is still not any kind of specific «most powerful formalization» because it is itself not fixed, as it depends on the choice of an axiomatic system for set theory, while there actually cannot be such a thing as «the most powerful axiomatic system for set theory» (we can always find a stronger one).

Semantic completeness, logical incompleteness

A second-order theory will be called semantically complete if, from a set theoretical viewpoint, all the models that «correctly interpret the powerset» are isomorphic to each other.
According to the Löwenheim-Skolem theorem, first-order theories describing infinite systems cannot be semantically complete (first-order logic cannot prevent the coexistence of multiple non-isomorphic models that «incorrectly interpret the powerset», and thus would have different second-order properties if interpreted by the external «true powersets»).

A first-order theory will be called logically complete if all its closed formulas are logically decidable (provable or refutable). This is naturally meaningful for first-order theories where the Completeness Theorem makes the concept of provability unambiguous, but not for second-order theories where the rules of proof remain to be specified. However it is still a posteriori meaningful there, in the following sense.

Some semantically complete second-order theories (geometry, arithmetic with only addition and no multiplication) are already logically complete by the weakest method, as all models of this weakest first-order expression are elementarily equivalent (they have the same first-order expressible properties) though they are not isomorphic.
For other theories (full arithmetic, set theory), stronger methods will contribute to the provability of more formulas. However then, such a quest is usually endless: the logical incompleteness of these theories is fundamental in the sense that no fixed algorithm can ever suffice to decide all their formulas either (may this algorithm anyhow come from the rules of first-order logic, e.g. by a translation into some set theory, or not).

Higher-order theories

Higher-order theories generalize second-order theories, by having a list of types that can be enumerated starting from a given list of base types, and successively making other types by «second-order constructions» from previous types, i.e. constructions that may include not only products (as allowed by the constructions of first-order logic), but also powerset («Y is the set of all subsets of X») and exponentiation («Z is the set of all functions from X to Y»). Saying that they are successive constructions, means that there is an order between types such that each type of objects said to be made of «all the..» comes after those of which it is the powerset or exponentiation.
Higher-order theories can be translated into second-order theories as follows. Take as the list of base types, all types that are used in any construction of powerset or exponentiation. Then in the second-order theory over this list of base types, a good list of structures and first-order axioms will express all intended relations between base types. Namely, the relation Y = P(X) will be formalized by a structure of bijection between Y and P(X).

Second-order arithmetic

This is the name of the first-order theory we described as "with stable power type" to describe arithmetic. However the list of basic structures needs to be specified.

The following 3 presentations of second-order arithmetic as first-order theories (with axioms lists implicitly determined by the previous explanations) are equivalent (definable from each other):

  1. With types ℕ and P(ℕ×ℕ) and with structures 0 and S ;
  2. With types ℕ and ℕ and with structures 0 and S ;
  3. With types ℕ and P(ℕ) and with structures 0, + and ⋅ .
Proof :
We get ℕ as the set of functional graphs (subset of P(ℕ×ℕ))
Using ℕ we can define sequences of integers by recursion ; addition and multiplications come as particular cases.
Using addition and multiplication we can define a bijection between ℕ and ℕ×ℕ by (x,y) ↦ x + (x + y)⋅(x + y+1)/2, so that P(ℕ) can be used in the role of P(ℕ×ℕ).

Second-order arithmetic as a possible foundation for mathematics

Most ordinary mathematics and all physics can be expressed in the theory of second order arithmetic expressed above.
The canonical bijection P(ℕ×ℕ)≅ P(ℕ), allows to define recursive sequences in P(ℕ).
But other encodings can be much more cumbersome (unnatural), so that this theoretical possibility to encode all ordinary mathematics into second-order arithmetic, while possible, is rather far-fetched and not practical as a way to study mathematics.

For example, arbitrary continuous function f from ℝ to ℝ can be represented in P(ℕ×ℕ×ℕ×ℕ), and thus in P(ℕ), by the set of 4-tuples of strictly positive natural numbers a,b,c,d such that f(a/bb/a) > c/dd/c.

Or, as another example of possible encoding, such that f(ab2) > cd2.

Another inconvenient is that, in mathematics, many objects studied by a diversity of useful theories, do no aim to be anything else than pure elements. Choices to build mathematics on a theory where all objects are either a natural number or a set of natural numbers, while being "right" in the fact that every set of interest has an injection into P(ℕ), obscures this purpose.

First-order aritmetic

The theory of first-order arithmetic is the one obtained from the first-order expression by a schema of axioms, interpreting the induction axiom as the schema of axioms ranging over the only subsets of ℕ defined by first-order formulas (with variables and parameters in ℕ).

But these are different theories depending on the accepted language, by lack of direct means to define operations by recursion in this language :
The one with the mere symbols 0 and S is very poor.
The one including addition (Presburger arithmetic) starts to be interesting, but is not yet "the full arithmetic" as it is unable to define multiplication.
The one containing addition and multiplication is very rich as, through some difficult constructions, it is able to express recursion, and thus any recursively defined operations such as exponentiation of integers and factorials, as proven by Gödel at the basis of his proof of the famous incompleteness theorem. However the formulas involved there and the proof of their suitability, are very difficult, out of the scope of the present text.

We are going to show their fundamental difference by comparing the kinds of models they admit, in the next section.
Previous : 3.8. How mathematical theories develop
Next : Non-standard models of Arithmetic
Back to homepage : Set Theory and Foundations of Mathematics