4.6. Second-order logic

In set theory, the powerset, with its equivalent forms of product and exponentiation, was the only set construction operator which was not justified by the set generation principle. Similarly in model theory, the powerset of a type, and the exponentiation operation between types, can be regarded as «constructions» which fail to satisfy all the criteria we stated for constructions of a type in first-order logic. Namely, they remain essentially well-determined from the viewpoint of set theory with the powerset, but their use cannot be reduced to (seen as an abbreviation of) works in the initial theory. Thus their allowance forms new logical frameworks for theories: Higher-order theories can be translated as second-order theories as follows. Take as list of base types, all types that are used in any second-order construction. Then a certain list of structures and axioms in the second-order theory over them, can express all needed relations between them. Namely, the relation Y = ℘(X) can be formalized by a structure of bijection between Y and ℘(X).

First-order expression of second-order structures

The first-order structures over a given list of sets, were the (finitary) operations and relations between them. Now a second-order structure over them, is an operation or relation using among its types some second-order constructed set(s) Y over them (the set of all structures of a given kind). So it is just a structure (operation or relation) that can be given to the second-order theory taking these sets as base types (beyond the non-structuring one already given by the second-order construction of that type Y itself, described below).
Let us focus on the simplest case (beyond second-order constants which are the first-order structures): that of a unary relation A on a second-order type Y. Like with the construction of a subset from its data as a unary relation, it can be represented by a type K with a structure of injection from K to Y, giving K the role of a subset of Y. Now this is expressible in first-order logic bypassing any use of Y (whose exhaustivity as set of all structures of its kind is inexpressible) in the following form which may be called a weak second-order theory. This will suffice for most theories, only making actual use of one side K of that unary relation on Y, ignoring its complement Y\K.
The role of type K as a set of some n-ary structures, is given by one structure symbol (interpreter) S of that kind plus one argument of type K: the curried view interprets it as a function from K to Y. To reflect this role, S(k, x1,...,xn) will be abbreviated as (x1,...,xn)∈k if a relation, or as k(x1,...,xn) if an operation. Its injectivity is expressed by the following separation axiom (separation of K by S): Such a type with this structure and this axiom plays the role of a structure in the sense of how adding this pack of components to a theory T modifies its category of models, having «enough structures», but not «enough axioms» for this pack to form a construction : for any isomorphism between 2 interpretations of T (with base types), the uniqueness of its extension (by a bijection between interpretations of the additional type) as an isomorphism between extended models is ensured, but its existence isn't (the extension of the model has enough structure to prevent any new symmetry but this "structure" remains usually "undefined" as this extension is not essentially determined as unique up to isomorphism).

Translation into first-order logic

For simplicity, let us write things for the case of only one second-order type Y, but the generalization is straightforward.
The above concept of weak second-order theory, as a pack of components to add to the first-order types, can serve as an incomplete scheme of construction of second-order types, to translate any second-order theories into a first-order one. What misses is the possibility, in the general case (for infinite sets), to fully express by first-order axioms the wish that K exhausts the intended type Y of structures of that kind, i.e. that our injection from K to Y is surjective.
It can still be partially expressed by axioms stating that this type of structures contains all those somehow definable with parameters. In the case of the powerset they are called comprehension axioms, depending on a formula F with one distinguished free variable of type X, to make K contain all classes defined by F:

∀(parameters), ∃AK, ∀xX, xAF(x, parameters).

and similarly for exponentiation (types of functions) replacing ⇔ by =. Cases of higher arities result by combining the unary case with the construction of product.
The strength of such a formalization depends on a choice of "definability" concept: what kind of formulas may F range over. The main ones will be reviewed from the weakest (a restricted comprehension scheme, admitting more models) to the strongest (wider comprehension scheme, detecting more «wrong models» to eliminate them).
As only first-order logic has a clear concept of formal provability, the only available concepts of provability for second-order theories are those of the first-order theories they can be converted to. Stronger conversion methods can eventually give a wider range of theorems than weaker ones.

First-order definable structures

A weak method, which we already used for arithmetic, formally eliminates the type K by letting a variable structure of that type range over all those defined by expressions with variables of base types. It is only applicable to such theories only involving K as the type of variables under ∀ at the root of axioms, and without second-order structure. The method is to replace each such axiom by a schema of axioms, one for each expression which defines that variable.
Alternatively, formally admitting K can provide more means to express and develop second-order theories. Still, the same proving strength is given by the comprehension axioms where all bound variables of F are of base types (none of type K). It makes no difference to admit parameters of type K or not: admitting them eliminates some incomplete interpretations of K but still allows the minimal one : the mere set of all classes (or structures) definable with variables of base types. Indeed any comprehension axiom with parameters in K is satisfied there, as its truth for each tuple of values of parameters (themselves defined by expressions) is ensured by another axiom whose more complex formula is obtained by substitution. The version of this schema accepting such parameters is usually preferred for the elegance of the resulting properties.

Henkin semantics

The full power of first-order logic precisely using the involved types (the base types plus the second-order ones, here K), admits formulas F with bound second-order variables (= of type K) in the comprehension axioms. This requirement on the interpretation of K goes beyond any concept of «definable structures» of its kind on the base types, and cannot be reduced to any property of its elements, since its use of K in defining formulas makes this condition for «being in K» circular.

Set theoretical interpretations

The integration of first-order theories in a set theory, initially described in 1.3 and 1.7, is naturally extensible to second-order theories using the set theoretical powerset and exponentiation operators. This is the most powerful known kind of formalization for second-order logic, giving it the proving power of set theory (seen converted as a first-order theory as explained in the last sections of Part 1), but this power depends on a chosen axiomatic formalization of set theory. As the incompleteness theorem will show, there can be no most powerful algorithmic concept of proof for arithmetic in general, as there can be no most powerful set theory from which such a concept may come in particular: for any good one there is a stronger good one.

Compared with Henkin semantics, this expands the range of expressible comprehension axioms by letting more kinds of variables in their formulas. Namely, comprehension axioms whose parameters may take any value in the universe and bound variables may range inside any set there, are implicitly given by the definition of the set builder symbol in our formalism. These make the interpretations of second-order types appear uniquely determined by those of the base type from the viewpoint of that set theory which sees its own universe as fixed (inside which any bijection on base types between 2 systems is uniquely extensible as an isomorphism of the second-order constructions from them), but a multiplicity of non-isomorphic interpretations re-appears from an external viewpoint where a diversity of non-standard universes is understood possible as shown by Skolem's paradox.

The formalization of set theory we introduced from the beginning to the powerset, the axiom of choice and the axiom of infinity (this essentially is Mac Lane set theory), is quite powerful and the most convenient to comfortably develop all ordinary mathematics. As a little step further, Zermelo set theory adds to this the once mentioned comprehension axioms with bound variables ranging over the universe itself. But much larger leaps in power will come from postulating the existence of more kinds of sets or other objects, starting with what may be abusively written (while ℘ is not a function) as the recursive sequence (℘n(ℕ))n∈ℕ. Indeed the union of this sequence is the smallest standard model of Zermelo and Mac Lane set theories; its existence obliges the existence of still many more sets beyond it (its own iterated powersets). Zermelo-Frankel axiomatic set theory does this, but goes so much further after this that, as we shall see, the structure of the hierarchy of powersets that it requires of its universe, escapes by far any possible imagination or description.

Semantic completeness, logical incompleteness

A second-order theory will be called semantically complete if, from a set theoretical viewpoint, all the models that «correctly interpret the powerset» are isomorphic to each other.
According to the Löwenheim-Skolem theorem, first-order theories describing infinite systems cannot be semantically complete (first-order logic cannot prevent the coexistence of multiple non-isomorphic models that «incorrectly interpret the powerset», and thus would have different second-order properties if interpreted by the external «true powersets»).

A first-order theory will be called logically complete if all its closed formulas are logically decidable (provable or refutable). This is naturally meaningful for first-order theories where the Completeness Theorem makes the concept of provability unambiguous, but not for second-order theories where the rules of proof remain to be specified. However it is still a posteriori meaningful there, in the following sense.

Some semantically complete second-order theories (geometry, arithmetic with only addition and no multiplication) are already logically complete by the weakest method, as all models of this weakest first-order expression are elementarily equivalent (they have the same first-order expressible properties) though they are not isomorphic.
For other theories (full arithmetic, set theory), stronger methods will contribute to the provability of more formulas. However then, such a quest is usually endless: the logical incompleteness of these theories is fundamental in the sense that no fixed algorithm can ever suffice to decide all their formulas either (may this algorithm anyhow come from the rules of first-order logic, e.g. by a translation into some set theory, or not).

Second-order arithmetic

This is the name of the first-order theory we described as "with stable power type" to describe arithmetic. However the list of basic structures needs to be specified.

The following 3 presentations of second-order arithmetic as first-order theories (with axioms lists implicitly determined by the previous explanations) are equivalent (definable from each other):

  1. With types ℕ and ℘(ℕ×ℕ) and with structures 0 and S ;
  2. With types ℕ and ℕ and with structures 0 and S ;
  3. With types ℕ and ℘(ℕ) and with structures 0, + and ⋅ .
Proof :
We get ℕ as the set of functional graphs (subset of P(ℕ×ℕ))
Using ℕ we can define sequences of integers by recursion ; addition and multiplications come as particular cases.
Using addition and multiplication we can define a bijection between ℕ and ℕ×ℕ by (x,y) ↦ x + (x + y)⋅(x + y+1)/2, so that ℘(ℕ) can be used in the role of ℘(ℕ×ℕ).

Second-order arithmetic as a possible foundation for mathematics

The powersets of uncountable infinite sets contribute to convenience but are not not necessary: much of ordinary mathematics can be done without them, just using some more tedious developments instead.

Most of mathematics (with some exceptions such as geometry, which will be discussed later) uses one or both of

With some more work, the "ordinarily interesting" uses of quantifiers on ℘(ℝ) (thus indirectly of ∀ X⊂ ℘(ℕ)) and other uses of P(P(ℕ)) may usually be encoded as elaborate uses of P(ℕ).
In fact, most of ordinary mathematics can be indirectly developed using only ℕ and P(ℕ), that is, in the theory of second-order arithmetic. Further uses of the powerset (the P(P(P(ℕ))) and more) usually appear useless in ordinary mathematics.
Most ordinary mathematics and all physics can be expressed in the theory of second order arithmetic expressed above.
The canonical bijection P(ℕ×ℕ)≅ P(ℕ), allows to define recursive sequences in P(ℕ).
But other encodings can be much more cumbersome (unnatural), so that this theoretical possibility to encode all ordinary mathematics into second-order arithmetic, while possible, is rather far-fetched and not practical as a way to study mathematics.

For example, arbitrary continuous function f from ℝ to ℝ can be represented in P(ℕ×ℕ×ℕ×ℕ), and thus in P(ℕ), by the set of 4-tuples of strictly positive natural numbers a,b,c,d such that f(a/bb/a) > c/dd/c.

Or, as another example of possible encoding, such that f(ab2) > cd2.

Another inconvenient is that, in mathematics, many objects studied by a diversity of useful theories, do no aim to be anything else than pure elements. Choices to build mathematics on a theory where all objects are either a natural number or a set of natural numbers, while being "right" in the fact that every set of interest has an injection into P(ℕ), obscures this purpose.

Set Theory and Foundations of Mathematics