Second-order logic
Let us continue our exploration of the main
kinds of formalisms for theories:
Second-order theories (or: theories of second-order logic)
are theories with one type X, called the base type,
and one or more others as follows, depending on the precise kind of
second-order logic:
- Full second-order logic admits several other types that
represent the sets of all structures (operations and relations)
in X with any arity: P(X), X^{X},
P(X×X), X^{X×X}, P(X×X×X)...
- Monadic second-order logic only keeps from this list the
powerset of X (the set of all unary relations on X).
More generally, from several base types we can consider the sets
of all structures of any given kinds between them (numbers of
arguments of each type, type of results). So the general idea of
second-order theories is that they can express, for any types X, Y
of objects, formulas of the form «For every subset of X...»,
and thus also (equivalently) of the form «For every function from
X to Y...».
Second-order theories can be interpreted as first-order theories,
where each type K of structures receives its role from one
structure symbol of that kind with one additional argument of type
K, as explained in previous sections. Indeed, currying this
symbol gives a function from K to the intended set of all
such structures (the powerset or exponentiation). The obstacle
to fully formalizing second-order theories as
first-order theories is that we cannot generally express the
bijectivity of this function by axioms: its injectivity can be
easily expressed in first-order logic in the form of the below
separation axiom, but its surjectivity cannot.
Weak second-order theories, separation axioms
In a first-order theory, a type K will play the role of a set
of structures of some kind with arity n over some
other types (called base types), i.e. an arbitrary subset of the set
of all structures of that kind, if we have
- One structure symbol S with 1 argument of type K,
and n other argument(s) (and values if it is an
operation) among base types.
- The following axiom (or theorem) of separation of K by
S:

If S is a relation:
∀k,k'∈K, (∀x_{1},...,x_{n}, S(k, x_{1},...,x_{n}) ⇔ S(k', x_{1},...,x_{n})) ⇒ k = k'

If S is an operation:
∀k,k'∈K, (∀x_{1},...,x_{n}, S(k, x_{1},...,x_{n}) = S(k', x_{1},...,x_{n})) ⇒ k = k'
The separation axiom means that its curried form with domain K
is injective, so that it gives K the role of a set of
operations or relations among base types. Then, the notation S(k,
x_{1},...,x_{n}) may be replaced by (x_{1},...,x_{n})
∈ k if a relation, or k(x_{1},...,x_{n})
if an operation.
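The injectivity reading of the separation axiom can be checked mechanically on a finite toy model. The following Python sketch (with made-up names, not part of the text's formalism) represents each curried form S(k, ·) of a relation symbol S by its extension, and tests whether distinct elements of K have distinct extensions:

```python
# Toy finite model (hypothetical names): a base type X and a type K whose
# elements are meant to play the role of unary relations on X via S.
X = [0, 1, 2]
S = {
    "k1": {0, 1},   # extension {x : S(k1, x)}
    "k2": {2},
    "k3": {0, 1},   # same extension as k1: separation fails
}

def satisfies_separation(S):
    """Separation axiom: k = k' whenever S(k, x) <=> S(k', x) for all x,
    i.e. the curried map k -> {x : S(k, x)} is injective."""
    extensions = [frozenset(ext) for ext in S.values()]
    return len(extensions) == len(set(extensions))

print(satisfies_separation(S))    # False: k1 and k3 are indistinguishable
del S["k3"]
print(satisfies_separation(S))    # True: K now embeds into P(X)
```

Once separation holds, the injection k ↦ {x : S(k, x)} justifies reading S(k, x) as x ∈ k.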
Theories with such types (sets of structures) will be called weak
second-order theories.
Second-order structures
In a second-order theory, let us call first-order structures
the structures (in the sense of first-order logic: operations and
relations with finite arity) that are directly between base types.
Then the second-order structures are also structures like
in first-order logic (operations or relations) but involving one
or more second-order types (i.e. of structures between base
types).
Let us focus on the case of a unary relation A on a
second-order type Y. Like any unary relation, it can be
represented by a subset K of Y. But then,
forgetting about the type Y (with its idea that it
contains «all the...»), to only keep the type K (with no
such need of exhaustivity), what remains is a first-order theory,
as expressed above with the separation axiom.
Of course we can only do this if the rest of Y is not
involved in the theory. Usual interesting theories behave in this
asymmetric way, only involving one side of a unary relation on
some set Y of all structures of a given kind: one set K
of structures, with no use of its complement (Y\K, corresponding
to ¬A). So, most useful theories with a second-order
structure are just weak second-order theories. In other words,
such a second-order structure is the reinterpretation of a type in
a first-order theory that satisfies the separation axiom.
Such a type «plays the role of a structure» in the sense in which
we described how a theory is modified by the addition of an extra
symbol (structure) without any extra type:
In a theory, an additional type is constructed if it comes
with enough extra structures (to ensure the uniqueness of the
extension of an isomorphism to the new type) and enough extra axioms
(to ensure that the extension, as if it were uniquely defined, is
determined from base types up to isomorphism, so that any
isomorphism on base types can be extended to it, cancelling the
effect of the extra structures, which might otherwise forbid the
existence of any extension of the isomorphism if they are not «invariant»).
If the new type only has «enough extra structures», but may lack
«enough extra axioms», so that each isomorphism between the base types of
2 systems has at most one extension (but maybe none) by a bijection
between both versions of the new type, then the case is similar to
that of adding a mere structure not specified by a definition (which
may or may not be preserved by an isomorphism, but does not split an
isomorphism into several possible extensions). Then, we can say that
the whole additional material plays the role of a structure.
In particular, the separation axiom in a weak second-order theory
ensures that an extension is «structuring enough» in this sense (but
does not otherwise uniquely determine the extension up to
isomorphism). This is why an extension by a new type with a separation
axiom plays the role of a structure, justifying its qualification as
a second-order structure.
Now, developments of theories by constructions come with not only
enough structures relating the new type to the old type(s) to let it
play the role of a structure, but also «enough axioms» to «define»
it.
In particular, the last kind
of constructions we listed (the set of all structures
defined by an expression with all possible values of its
parameters) is a particular set of (non-invariant first-order)
structures that is itself an invariant second-order structure
(well-defined using that formula, and the image of the set of all tuples
of values of the parameters). Its property of «invariance by
automorphisms» is expressed as follows: each element of this
constructed type is a first-order structure that automorphisms
usually do not preserve (as this first-order structure is not
invariant), but transform into other elements of that type
(preserving this type as a whole).
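This invariance of the constructed type as a whole, despite the non-invariance of its elements, can be illustrated on a small finite example. In the Python sketch below (a toy case chosen for illustration, not taken from the text), the base type X carries no structure, so every permutation of X is an automorphism; each 2-element subset of X is a non-invariant first-order structure, yet the type of all such subsets is preserved by every automorphism:

```python
from itertools import combinations, permutations

X = [0, 1, 2, 3]  # bare base type: with no structure, every permutation is an automorphism
K = {frozenset(c) for c in combinations(X, 2)}  # constructed type: all 2-element subsets

some_element_moved = False
for p in permutations(X):
    sigma = dict(zip(X, p))                       # an automorphism of X
    image = {frozenset(sigma[x] for x in k) for k in K}
    assert image == K                             # K is preserved as a whole
    if any(frozenset(sigma[x] for x in k) != k for k in K):
        some_element_moved = True                 # ...though its elements are not invariant

assert some_element_moved
print("K is invariant as a whole, though its elements are not")
```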
Translating second-order theories into first-order ones
Provability was only
found well-defined in first-order logic. To get a concept
(rules, an algorithm) of proof for a second-order theory, we need to
translate it into a first-order theory, where the rules of
first-order logic are applied. The problem is that first-order logic
cannot fully express (determine) which is the true
powerset of a given set.
But different ways to translate a second-order theory into
first-order ones can express weaker or stronger requirements on the
interpretations of powersets, and thus eliminate more «wrong models»
from consideration. We shall list 3 of them, from the weakest (which
requires «few» subsets, and thus can prove «few» theorems) to the
strongest (which requires «many» subsets and thus can prove «many»
theorems); they will be applied to the case of arithmetic in the
next section.
The first-order expression by a schema of axioms (weakest
method)
This only applies to theories involving the second-order type not
in any structure, but only in axioms starting with a universal
quantifier on it, i.e. «For all sets...» (or «For all
operations...»). The idea is to interpret the variable structure as
«only» ranging over all structures definable with parameters, by an
expression in the first-order language that only uses the base
type(s).
The method is to eliminate the second-order type in such an axiom,
by replacing this axiom by a schema of axioms, each with a different
defining expression to replace that variable.
The theory with stable power type (Henkin semantics)
Consider the case of a monadic second-order theory (for example;
other cases are similar). We shall express it as a first-order
theory with 2 types X, Y. In addition to the
separation axiom, the idea that Y = P(X)
will be expressed by a schema of axioms, called the comprehension
axioms, expressing that all classes of elements of X
(i.e. subsets of X defined by formulas with parameters in
this theory, as in sections 1.5 and 1.7) belong to Y.
They are written, for any formula F with at least one free
variable of type X:
∀(parameters), ∃A∈Y, ∀x∈X, x ∈ A ⇔ F(x, parameters).
These axioms are essentially equivalent to those that defined the
set-builder we introduced for set theory (see 1.8, 1.11), except
that this symbol is not needed (it can be seen as an
abbreviation).
In fact, all comprehension axioms without quantifiers on Y,
but only using bound variables in X and parameters in both
X and Y, are satisfied by taking as Y the
set of all classes of elements of X defined with
parameters in X only. (What about those using another
structure on Y in their defining formula?)
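As a finite sanity check, sample comprehension instances can be verified mechanically in the model where Y is interpreted as the true powerset of a small X. The formulas below are arbitrary illustrative choices, not a canonical list:

```python
from itertools import chain, combinations

X = range(4)
# Full model: interpret Y as the true powerset P(X).
Y = {frozenset(s)
     for s in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))}

# Sample defining formulas F(x, p) with one parameter p (illustrative only).
formulas = [
    lambda x, p: x < p,
    lambda x, p: x % 2 == p % 2,
    lambda x, p: x == p,
]

# Each comprehension instance: for all parameter values, the class
# {x : F(x, p)} belongs to Y.
for F in formulas:
    for p in X:
        assert frozenset(x for x in X if F(x, p)) in Y

print("sampled comprehension instances hold in the full-powerset model")
```

The full powerset trivially satisfies every comprehension instance; the interest of the schema is that much smaller interpretations of Y can satisfy it too.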
So, what makes this construction more powerful than the previous
one, apart from the ability to use variables of type Y in
structures and other axioms, is the requirement to also admit in Y
all subsets of X defined using quantifiers on variables of
type Y. This goes beyond any concept of «definable subsets
of X», and cannot even be qualified as a property of these
subsets, since its use of Y in defining formulas makes
this condition of definability circular.
Set theoretical interpretation (strongest method)
The expression of any theory inside set theory, initially
introduced in 1.3 for first-order theories, can be extended to
second-order theories, interpreting the notions of «all subsets»
and «all operations» by the powerset and exponentiation operators
of set theory. Then the rules of proof are those of set theory
once expressed as a first-order theory (see 1.9, 1.10, 1.11).
This framework is the one expressing the intended meaning of
second-order theories as deterministically involving the full set
of all structures of each given kind between given sets: the tools
of set theory ensure that this powerset will include all
structures that «can be found»; they interpret it in a model of
set theory which they implicitly regard as fixed (assuming that we
are «living in it»). However, since set theory is itself a
first-order theory, this formal appearance of uniquely determining
these powersets is not real but simulated, as follows: from any
bijection between two sets X and X' we can define
the image in X' of any subset of X, which obliges
P(X) and P(X') to conform to each
other. Then, the view of diverse models of second-order theories
as encompassed in a fixed model of set theory obliges their
second-order types to conform to each other when their base types
are in bijection. This correspondence between all considered
models gives the appearance of determination of second-order types
from base types.
In this context, for a fixed X, any set Y with a
relation between X and Y can be seen as a
relational system whose language is the set X of unary
relation symbols. In the category of such systems accepting only
the embeddings as morphisms, the final objects are
those Y which are copies of the "true powerset" of X.
This method is the most powerful approach, as the available
variables and tools for the defining formulas that can find
subsets of X to bring into P(X) are
extended far beyond the mere sets X and P(X),
up to whatever range of other sets this model of set theory
may contain. However, this is still not any kind of definite «most
powerful formalization», because it is itself not fixed: it
depends on the choice of an axiomatic system for set theory, while
there actually cannot be such a thing as «the most powerful
axiomatic system for set theory» (we can always find a stronger
one).
Semantic completeness, logical incompleteness
A second-order theory will be called semantically complete
if, from a set theoretical viewpoint, all the models that «correctly
interpret the powerset» are isomorphic to each other.
According to the Löwenheim-Skolem theorem, first-order theories
describing infinite systems cannot be semantically complete
(first-order logic cannot prevent the coexistence of multiple
non-isomorphic models that «incorrectly interpret the powerset», and
thus would have different second-order properties if interpreted by
the external «true powersets»).
A first-order theory will be called logically complete if
all its closed formulas are logically decidable (provable or
refutable). This is naturally meaningful for first-order theories,
where the Completeness
Theorem makes the concept of provability unambiguous, but not
for second-order theories, where the rules of proof remain to be
specified. However, it is still a posteriori meaningful there, in the
following sense.
Some semantically complete second-order theories (geometry,
arithmetic with only addition and no multiplication) are already
logically complete by the weakest method, as all models of this
weakest first-order expression are elementarily equivalent (they
have the same first-order expressible properties) though they are
not isomorphic.
For other theories (full arithmetic, set theory), stronger methods
will contribute to the provability of more formulas. However,
such a quest is usually endless: the logical incompleteness of these
theories is fundamental in the sense that no fixed algorithm can
ever suffice to decide all their formulas either (whether this
algorithm comes from the rules of first-order logic, e.g. by a
translation into some set theory, or not).
Higher-order theories
Higher-order theories generalize second-order theories, by having a
list of types that can be enumerated starting from a given list of
base types, and successively making other types by «second-order
constructions» from previous types, i.e. constructions that may
include not only products (as allowed by the constructions of
first-order logic), but also powerset («Y is the set of
all subsets of X») and exponentiation («Z is the set
of all functions from X to Y»). Saying that they are
successive constructions means that there is an order between types
such that each type of objects said to be made of «all the...» comes
after those of which it is the powerset or exponentiation.
Higher-order theories can be translated into second-order theories
as follows. Take as the list of base types all types that are used
in any construction of powerset or exponentiation. Then in the
second-order theory over this list of base types, a good list of
structures and first-order axioms will express all intended
relations between base types. Namely, the relation Y = P(X)
will be formalized by a structure of bijection between Y and
P(X).
On the double meaning of "invariance"
Basically, the concept of "invariant structure" as "definable by
a formula without parameters" is more restrictive than as
"preserved by any automorphism".
Even a structure that is both invariant and definable with a
parameter is not necessarily definable without parameters. For
example, in the theory of real numbers, not all real numbers are
definable (since the set of possible definitions is in bijection
with ℕ but the set of all real numbers is not), yet Aut(ℝ,
0,1,+,⋅) = {Id_{ℝ}}. The same holds for sets of integers in the
theory of second-order arithmetic.
In either case, we have objects not definable by any finite
formula without parameters, but still "definable with a parameter"
(itself), and preserved by automorphisms. They can only ever be
"defined" by their infinite sequence of decimals, which is not a
proper definition because it is an infinite amount of data.
There are still 2 possible ways to make both concepts of
invariance somehow coincide:
- Either we choose to work in first-order logic, in which case
the theory admits non-standard models which have automorphisms
(that only exist from the outside) which do not preserve some
undefinable real numbers (but exchange them with some infinitely
close neighbors).
- Or we work in second-order logic, in which case the "solution"
is to admit "second-order" ways of defining structures: any
subset of ℕ is "defined" by the infinite data of all its
elements. This way to cross the gap between Inv(Aut_{L}(E))
and the set of definable structures, by means of extended kinds
of "definitions" (with infinite amounts of data), has been
investigated by Borner, Martin Goldstern and Saharon Shelah.
Second-order arithmetic
This is the name of the first-order theory we described as "with
stable power type" to describe arithmetic. However, the list of
basic structures needs to be specified.
The following 3 presentations of second-order arithmetic as
first-order theories (with axiom lists implicitly determined by
the previous explanations) are equivalent (definable from each
other):
- With types ℕ and P(ℕ×ℕ), and with structures 0 and S;
- With types ℕ and ℕ^{ℕ}, and with structures 0 and S;
- With types ℕ and P(ℕ), and with structures 0, + and ⋅.
Proof:
- We get ℕ^{ℕ} as the set of functional graphs (a subset of P(ℕ×ℕ)).
- Using ℕ^{ℕ} we can define sequences of integers by recursion;
addition and multiplication come as particular cases.
- Using addition and multiplication we can define a bijection between
ℕ×ℕ and ℕ by (x,y) ↦ x + (x + y)⋅(x
+ y + 1)/2, so that P(ℕ) can be used in the role of P(ℕ×ℕ).
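This pairing bijection is directly computable. Here is a Python sketch of it and of its inverse (the inverse procedure, recovering the diagonal x + y first, is a standard reconstruction not spelled out in the text):

```python
def pair(x, y):
    """Cantor pairing bijection N x N -> N: (x, y) -> x + (x + y)(x + y + 1)/2."""
    return x + (x + y) * (x + y + 1) // 2

def unpair(n):
    """Inverse of pair: recover the diagonal d = x + y, then x and y."""
    d = 0
    while (d + 1) * (d + 2) // 2 <= n:
        d += 1
    x = n - d * (d + 1) // 2
    return x, d - x

# Bijectivity on a finite sample: coding pairs as single numbers is what
# lets P(N) play the role of P(N x N).
assert all(unpair(pair(x, y)) == (x, y) for x in range(50) for y in range(50))
assert all(pair(*unpair(n)) == n for n in range(1000))
```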
The need of the powerset
The Zermelo-Fraenkel axiomatic system of set theory not only accepts
P(ℕ) as a set, but also the whole infinite series P(P(...P(ℕ)...)).
And it even goes further after this, to
more infinite sequences of higher and higher powersets. In
fact, the hierarchy of powersets that it requires of its universe
goes very far beyond
any possible imagination or description.
Most of mathematics (with some exceptions such as geometry, which
will be discussed later) uses one or both of
- The set ℕ of natural numbers. We characterized the theory of
natural numbers as a second-order theory, i.e. using the
powerset P(ℕ).
- The set ℝ of real numbers. It may be naturally characterized
using its own powerset P(ℝ); but it may also be built
from P(ℕ) (i.e. there is a way to let P(ℕ) play
the role of ℝ).
With some more work, the "ordinarily interesting" uses of
quantifiers on P(ℝ) (thus indirectly of ∀X ⊂ P(ℕ))
and other uses of P(P(ℕ)) may usually be encoded as
elaborate uses of P(ℕ).
In fact, most of ordinary mathematics can be indirectly developed
using only ℕ and P(ℕ), that is, in the theory of
second-order arithmetic. Further uses of the powerset (P(P(P(ℕ)))
and more) usually appear useless in ordinary mathematics. Their main
uses may be the following:
- A metatheory interpreting a theory in some universe U
of objects roughly involves P(U^{ℕ}).
- Without ℕ, either P(P(E)) or (with more
work) E^{E} can be used to express the
finiteness of a set E, and to rebuild ℕ from E
if E is infinite.
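One way E^{E} can express finiteness (an illustration under the assumption of Dedekind's characterization, which quantifies only over functions from E to itself): E is finite exactly when every injection from E to E is surjective. A brute-force Python check on small finite sets; the infinite direction (e.g. the non-surjective successor injection on ℕ) of course cannot be run:

```python
from itertools import product

def every_injection_is_surjective(E):
    """Dedekind's finiteness criterion, quantifying only over E^E:
    E is finite iff every injective f : E -> E is also surjective."""
    E = list(E)
    for values in product(E, repeat=len(E)):  # all functions E -> E, as value tuples
        injective = len(set(values)) == len(values)
        surjective = set(values) == set(E)
        if injective and not surjective:
            return False
    return True

# Pigeonhole: on finite sets the criterion always holds.
assert every_injection_is_surjective(range(3))
assert every_injection_is_surjective(["a", "b"])
print("every injection of a small finite set into itself is surjective")
```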
Still, while much less necessary, the powersets of uncountable
infinite sets may also be considered for convenience. It is just
formally simpler in set theory to assume the powerset tool applies
to all sets, as it is involved in some important concepts
(descriptions of sets and systems).
Ultimately, the precise initial axiomatic system of set theory that
will suffice to comfortably develop all
mathematics (a simple but powerful enough one) will consist of the
components (symbols and axioms) that we mentioned up to the
powerset, plus the axiom of infinity, which equivalently
states the existence of ℕ or of some infinite set (these conditions
will be precisely formalized later, when needed).
Second-order arithmetic as a possible foundation for mathematics
Most ordinary mathematics and all physics can be expressed in the
theory of second-order arithmetic presented above.
The canonical bijection P(ℕ×ℕ) ≅ P(ℕ)^{ℕ}
makes it possible to define recursive sequences in P(ℕ).
But other encodings can be much more cumbersome (unnatural), so that
this theoretical possibility to encode all ordinary mathematics into
second-order arithmetic is rather far-fetched, and
not practical as a way to study mathematics.
For example, an arbitrary continuous function f from ℝ to ℝ
can be represented in P(ℕ×ℕ×ℕ×ℕ), and thus in P(ℕ),
by the set of 4-tuples of strictly positive natural numbers a, b, c, d
such that f(a/b − b/a) > c/d − d/c.
Or, as another example of possible encoding, such that f(a
− b√2) > c − d√2.
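A finite fragment of the first encoding can be computed directly. The Python sketch below enumerates the 4-tuples up to a bound; since the sample points a/b − b/a are dense in ℝ, such fragments already separate distinct continuous functions:

```python
from fractions import Fraction

def encode(f, bound):
    """Finite fragment of the code of a continuous f : R -> R — the 4-tuples
    (a, b, c, d) of positive integers with f(a/b - b/a) > c/d - d/c."""
    R = range(1, bound + 1)
    return {(a, b, c, d)
            for a in R for b in R for c in R for d in R
            if f(Fraction(a, b) - Fraction(b, a)) > Fraction(c, d) - Fraction(d, c)}

# Two distinct continuous functions already differ on a small fragment:
assert encode(lambda x: x, 4) != encode(lambda x: x + 1, 4)
```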
Another inconvenience is that, in mathematics, many objects studied
by a diversity of useful theories do not aim to be anything else
than pure elements. The choice to build mathematics on a theory where
all objects are either natural numbers or sets of natural numbers,
while being "right" in the sense that every set of interest has an
injection into P(ℕ), obscures this purpose.
Set Theory and
Foundations of Mathematics