4. Arithmetic and first-order foundations
4.1. Algebraic terms
Introduction
The concept of algebraic term, namely a term with a purely algebraic language L and a set V
of variable symbols (no predicate, logical symbols nor binders), was first intuitively introduced in 1.5.
We shall now formalize it as a kind of systems in a set theoretical framework. For convenience
let us work with only one type (the generalization to many types is easy), and start
with a wider class of systems we shall call L-drafts.
The formalization of this concept has a few possible variants.
In a first approach, an L-draft is
an L-equational system
(D, b) such that (Gr b ∪ D) is injective
(where b: VD → D and Gr b ∩ D
= ∅). This injectivity means that
- D is injective ;
- b is injective, allowing for the simplification VD ⊂ D
and b = IdVD.
- Im b ∩ Im D = ∅, thus Im b = ∁D
Im D.
The set D means the set of occurrences of symbols, split between the set
Im b of occurrences of variable symbols, and the set Im D of occurrences
of symbols from L. But this definition for drafts would make each variable occur exactly
once, thus cannot directly include algebraic terms, where a variable may have multiple
occurrences or none, as particular cases. The solution is to first replace the use of b
by that of its inverse b', function from a set of occurrences VD ⊂
D to the set V of available variables, then remove from it the bijectivity requirement.
We may keep the injectivity requirement on b', meaning the uniqueness of occurrence
of any variable, as any multiplicity of occurrences of one variable in a term can be replaced by the use of
a single occurrence of it referenced multiple times inside the term (this replacement gives a synonymous "term"
abstractly conceived, ignoring any care for visible notations).
But when multiple variables are available (∃≥2: V), we need to allow for
non-surjective b' (terms leaving some available variables unused), especially to allow as a draft
any term made of a single variable symbol.
Hence the relevance of a formalization where
VD = D∩V and b' = IdVD.
Drafts
Finally, let us formalize an L-draft as an injective L-system (D,
tGr ψD) where:
- Dom ψD = Im D = OD = D\V
is the set of occurrences of symbols from L in D;
- VD =
∁D OD = D∩V is the set of
used variables (or occurrences of variables) in D;
- 〈VD〉L = D (well-foundedness).
Let us denote ψD = σ ⊓ l, i.e. ∀x∈OD,
ψD(x) = (σ(x), lx)
∈ LD where σ∈LOD and
lx∈Dnσ(x).
Equivalent formulations of well-foundedness are
∀A⊂OD, (∀x∈OD,
Im lx ⊂ A∪V ⇒ x∈A)
⇒ A=OD
∀A⊂D, (VD⊂A ∧
D*(LA) ⊂A) ⇒
A = D
∀A⊂D, VD⊂A≠D
⇒ ∃x∈OD\A, Im lx⊂A
∀A⊂OD, A≠∅ ⇒ ∃x∈A,
A∩ Im lx = ∅
A ground draft is a draft with no variable, i.e. VD = ∅. Thus,
ψD: D → LD and
SubLD = {D}.
Variables in a draft may be reinterpreted as constants: extending ψD
by IdVD : VD → V,
forms a ground (L∪V)-draft.
Sub-drafts and terms
Well-foundedness makes uninteresting the range of stable subsets of drafts, to be replaced
by another stability concept : a subset A⊂D will be called a sub-draft of
D (or a co-stable
subset of D) if, denoting OA = A∩OD
and ψA = ψD|OA,
we have (Im ψA⊂ LA), i.e.
∀x∈OA, Im lx⊂A.
Any sub-draft A of a draft D is a draft, i.e. fits the above axioms with fixed V, where the
well-foundedness of A is obviously deduced from that of D by its last formulation.
Like with stable subsets, any intersection of sub-drafts is a sub-draft. Moreover,
any union of sub-drafts is also a sub-draft (unlike for sub-algebras with an operation
with arity >1, which from arguments in different sub-algebras may give a result escaping
their union).
The sub-draft co-generated by a subset of a draft, is the intersection of all sub-drafts
that include it. A term is a draft co-generated by a single element that is its root.
Each element x of a draft D defines a term Tx with
root x, sub-draft of D co-generated by {x}.
Each draft D is ordered by x ≤ y ⇔ x∈Ty.
This is obviously a preorder.
Its antisymmetry, that is the uniqueness of the root of any term, can be proven as follows
(a more elegant proof from more general concepts is in the study of Galois connections):
∀x∈OD, VD ⊂
A={y∈D|x∉Ty} which is a sub-draft
by transitivity of ≤.
x∉A ∴ ∃z∈OD\A,
Im lz⊂A.
A∪{z} is a sub-draft, thus Tz⊂A∪{z}
by definition of Tz.
z∈OD\A ∴
x∈Tz ∴ x = z.
Equivalent values of
x for ≤ give the same A, to which the same z fits.
Thus this equivalence is equality. ∎
Categories of drafts
As particular relational systems, classes of L-drafts form concrete categories.
Between two L-drafts D,E,
f ∈ MorL(D,E) ⇔
(f[OD]⊂OE ∧
ψE ⚬ f|OD=
Lf⚬ψD)
where the equality condition can be split as
σE⚬f|OD
= σD
∀x∈OD,
lf(x) = f⚬lx.
There, coproducts are given by
disjoint union, like with relational systems.
Another concrete category is that of drafts with variables-preserving morphisms, where V
is fixed, and
MorL,V(D,E) =
{f ∈ MorL(D,E) | f|VD
= IdVD}.
This is roughly the category of ground (L∪V)-drafts (reinterpreting variables as
constants, except that a multiply used variable may be figured in the latter
as multiple elements, but only as a multiply targeted single element in the former).
The definition of coproducts is modified there as
L,V∐i∈I Di
= (⋃i∈I VDi) ∪
∐i∈I
ODi
with the structure defined like before.
In the binary case this coproduct between drafts D,E
will be written D ⊔V E.
Products in these categories cannot be directly made of the products of underlying sets.
The description of products in the variables-preserving case is left as an exercise once
understood draft condensation (4.3).
Images of drafts
If D,E are L-drafts and f ∈ MorL(D,E) then
- For any sub-draft F⊂E, f*(F) is co-stable.
- If f[VD] ⊂ VE then Im f is co-stable,
and ∀x∈D,
f[Tx] = T f(x).
Intepretations of drafts in algebras
An interpretation of an L-draft D in an L-algebra E,
is a morphism f∈MorL(D,E). Equivalently,
f|OD=
φE⚬Lf⚬ψD
∀x∈OD, f(x) =
σ(x)E(f⚬lx)
Theorem. For any L-draft D and
any L-algebra E, ∀u∈EV,
∃!h∈MorL(D,E), h|VD
= u|VD.
Proof. For simplicity, let V = VD by restricting u.
Uniqueness was seen with equational systems,
and will be used to prove existence :
S = {A⊂D | V⊂A ∧ Im ψA⊂
LA}
K = ⋃A∈S
{f∈MorL(A,E) |
f|V = u}
∀f,g ∈K, B = Dom f ∩ Dom g
⇒ (f|B∈K ∧
g|B∈K)
⇒ f|B = g|B
⋃f∈K Gr f = Gr h
C = Dom h = ⋃f∈K Dom f
u ∈ K ∴ V ⊂ C ∈ S ∴ h ∈ K
(C∪D*(LC) ∋ x↦ (x∈C ?
h(x) :
φE(Lh(ψD(x)))))
∈ K
D*(LC) ⊂ C
C = D ∎
In other words, every L-algebra E is a module for every L-draft.
This formally justifies the concept of operation
symbol defined by a term and interpreted as an operation in every L-algebra, by
the concepts explained in 3.11 : the role of V-ary operation symbol played by any
element s of an L-draft D (interpreted in each L-algebra E by
∀u∈EV, sE(u) = h(s)
for the unique h∈MorL(D,E) such that
h|VD = u|VD),
preserved by all L-morphisms, is that of the operation symbol defined by the
L-term with root s co-generated by s in D.
4.2. Quotient systems
Quotients of relational systems
Given a relational language L and two L-systems
E and F, a quotient from E to F is a surjective morphism f
: E↠F such that Lf[E] =
F. We also say that the system F is a quotient of E by ∼f.
This condition defines F from E and f,
as the smallest L-structure on F such that
f ∈ MorL(E, F).
For any equivalence relation R on E, the quotient set
E/R receives the L-structure E/R defined as
LR⃗[E], making (R⃗)
a quotient from E to E/R.
Then for any h : E → X,
R ⊂ ∼h ⇒ (h ∈ MorL(E,X) ⇔
h/R ∈ MorL(E/R,X)).
Now if L is an algebraic language, the condition for an f : E↠F
between L-systems to be a quotient, is written
{(Lf(x),f(y)) | (x,y) ∈ E}
= F.
It implies ∀B⊂F, B ∈ SubLF ⇔ f *(B) ∈ SubLE.
Any morphism from a serial system to a partial algebra is a quotient to its image, which is
a stable algebra.
Quotients in concrete categories
The concept of quotient has an equivalent definition expressible for any concrete category C,
similar to the concept of embedding
with sides reversed. This can be explained by being the same concept as embedding in the opposite category, once the
concreteness of the category is re-expressed as a link with the category of sets, independently of
the details of the category of sets, which can thus be replaced by any other category,
namely the opposite category to the category of sets.
So, a quotient from E to F in C is a surjective
morphism f : E↠F such that, ∀CX, equivalently
Mor(F,X) =
{g∈XF | g⚬f ∈ Mor(E,X)}
Mor(F,X) =
{h/f | h ∈ Mor(E,X) ∧ ∼f ⊂ ∼h}
{h ∈ Mor(E,X) |
∼f ⊂ ∼h} = {g⚬f | g∈Mor(F,X)}
Any retraction is a quotient.
Congruences
Given an algebraic language L, a congruence on an
L-system E is an equivalence relation R on E such that, equivalently,
- R ∈ SubL(E2)
- The quotient system E/R is a partial algebra.
- There exists a morphism f : E → F where F is a partial algebra
and ∼f = R.
Proof of equivalence. Let f
: E → F such that ∼f = R, and K = {(Lf(x),f(y)) |
(x,y) ∈ E}.
R ∈ SubL(E2) ⇔
(∀s∈L, ∀(x,y),(x',y')∈sE,
Im(x×x') ⊂ R ⇒ (y,y') ∈ R)
⇔ (∀(x,y),(x',y') ∈ E, Lf(x) =
Lf(x') ⇒ f(y) = f(y')) ⇔
K is functional.
If K = F, this means F is a partial algebra. Otherwise,
f ∈ MorL(E,F) ⇔ K ⊂ F ⇒
(F is a partial algebra ⇒ K is functional).∎
Thus, the quotient of any algebra or other serial system by a congruence, is an algebra.
In this case, a congruence can be essentially described as a binary relation
satisfying the axioms of equality.
Quotients of modules
Not all quotients of modules are modules. But for any algebraic
language L, any L-system E, and any L-equational system
(K, b), where b : V→K and ACV
holds, if E is a b-module (where V is structureless) then its quotient by any
congruence is also a b-module.
Proof. Let f : E ↠ F a quotient to a partial algebra F.
∀u∈FV, ∃v∈EV, f⚬v = u ∧
∃g∈MorL(K,E), g⚬b = v
∴ f⚬g⚬b = u.
Uniqueness was seen before.∎
While this proof only needs f to be a surjective morphism, not precisely a quotient,
in practice this will only be used with quotients. Indeed, f is a quotient for the smallest
structure of F which makes f a morphism; greater structures preserve
existence; to ensure uniqueness we need F to still be a partial algebra, but when
E is left serial this implies that f is a quotient anyway.
Actually the ACV hypothesis is not needed, as infinite equational
systems will be seen equivalent to sets of finite ones, where finite choice applies.
Intersections of congruences
On any L-system E, any intersection of equivalence relations is an equivalence relation ;
and any intersection of congruences is a congruence.
This can be seen as a case of intersection of
stable subsets, as equivalence
relations are the stable relations by some structure, and the same for congruences. But
here is another viewpoint on this:Let ∀i∈I,
fi ∈ MorL(E,Fi), P =
∏i∈I Fi and g = ⊓i∈I
fi ∈ MorL(E, P). Then
∼g = ⋂i∈I
∼fi.
If all Fi are partial algebras then P
is also a partial algebra, which explains that ∼g is a congruence.
Let G = Im g ⊂ P. Like any morphism, g will be a
quotient to its image G if E is serial and P is a partial algebra, but not
necessarily otherwise.
Let ∀i∈I, hi = πi|G
∈ MorL(G,Fi).
Then fi = hi⚬g. As with any composite of
morphisms, if fi is a quotient then hi is also a quotient.
Minimal congruences
The minimal congruence of any L-system E is the intersection of all its
congruences. Let us write it ≃E, or simply ≃ when E is given by context.
It is equality if and only if E is a partial algebra.
Let us call condensate of E its quotient by its minimal congruence, and πE the natural surjection to it:
⤓E = E/≃
πE = ≃⃗E : E ↠ ⤓E
(⤓E, πE) is initial in the category of (X,f) where X is
a partial L-algebra, f ∈ MorL(E,X),
and Mor((X,f),(Y,g)) = {h ∈ MorL(X,Y) |
h ⚬ f = g}
Mor((⤓E, πE),(Y,g)) = {g/πE}
The congruence of E generated by any equivalence relation R is the preimage of
≃E/R by R⃗.
In the category of partial algebras, the coproduct
is given by the condensate of the disjoint union.
Condensates of injective systems
For any injective L-system (E, tGr ψE),
its minimal congruence ≃E is 〈𝛿E〉L,
and ⤓E is injective.
Proof:
ψE = σ × l
F = E2
∴ F = {((σ(x), lx×ly), (x,y)) |
x ∼σ y}
R = 〈𝛿E〉L =
𝛿E ∪ F*(LR) ⊂ F
(*) ∀(x,y)∈R, (x = y ∉ Dom ψE) ∨
(x ∼σ y ∧ Im(lx×ly) ⊂ R).
S = {(x,y)∈F | ∀z∈R⃖(y), zRx}
∀(x ∼σ y),
∀z∈R⃖(y), z ∼σ y ∧
Im(lz×ly) ⊂ R
∴
(Im(lx×ly) ⊂ S ⇒
Im(lz×lx) ⊂ R ⇒ zRx)
𝛿E ⊂ S ∈ SubL F ∴ R ⊂ S
R being reflexive and left Euclidean
is an equivalence relation. So it is ≃E.
Hence by (*) the injectivity of ⤓E.∎
The formula R = 𝛿E ∪ F*(LR)
is equivalent to
R⃗ ∈ MorL(E,℘(E)) ∧ ∀x∈E\Dom ψE,
R⃗(x) = {x}
where φ℘(E)(s,u) =
{x∈OD | σ(x)=s ∧ lx∈∏u}
which determines R when E is a draft.
The formula (*) expresses R
⊂ 𝛿E ∪ F*(LR), and is equivalent
to the conjunction of
- ∀(x,y)∈R, x ∉ Dom ψE ⇒ x = y
- ∀(x,y)∈Dom ψE, xRy ⇒
(x ∼σ y ∧ Im(lx×ly) ⊂ R).
If R is an equivalence relation, 2. says the structure of E/R is injective, while 1.
means it has the same variables as E. Thus in a draft, the equivalence relations
which fit (*) are those of variables-preserving morphisms to some other draft.
Therefore, the minimal congruence
of a draft is its only congruence by which the quotient is a (functional) draft with the same variables ;
it is the preimage of = by any variables-preserving morphism to a functional draft.
4.3. Term algebras
Synonymous terms
In any draft, x ≃ y is the condition for the terms
with roots x and y to be synonymous, i.e. taking the same value
f(x) = f(y) for any interpretation f of the draft in a partial algebra.
Terms in different drafts D, E with the same variables, can be so compared
by interpreting ≃ in their coproduct D ⊔V E.
Equivalently, for any functional draft F,
∀f∈MorL,V(D,F),
∀g∈MorL,V(E,F), ∀x∈D,
∀y∈E, x ≃ y ⇔ f(x) = g(y).
This equivalence comes from !: MorL,V(D
⊔V E, F) ⇌ MorL,V(D,F)
× MorL,V(E,F).
Then, ∀h∈MorL,V(D,E),
∀x,y∈D, h(x) ≃ x ∧ (x ≃ y ⇔ h(x)
≃ h(y))
because ≃⃗E ⚬ h ∈
MorL,V(D, ).
Thus ∀x,y∈D,
the meaning of x ≃ y is the same for any sub-drafts of D which x and
y are respectively considered elements of, such as the terms Tx and
Ty (using h∈MorL,V(Tx
⊔V Ty, D)).
Term algebras
For any set given set V, a V-ary term L-algebra is an L-system
equivalently described as:
- An algebraic L-draft D where VD = V;
-
An injective L-algebra D generated by V where
V ∩ Im φD= ∅ (thus D\Im φD = V);
- A final L-draft in the sense of variables-preserving morphisms MorL,V ;
- An L-algebra with basis V.
As any draft, it is qualified as ground if V = ∅, i.e. (by definition 2.) a
ground term L-algebra is an algebra both minimal
and injective, and thus also bijective; 4. means it is an initial L-algebra.
Intuitively, a V-ary term L-algebra T represents the set of all V-ary
operation symbols possibly defined by L-terms ; any V-ary
s∈L has a copy sT(IdV) ∈ T.
Proof of equivalence. Already 1. ⇔ 2. ⇒ (3. ∧ 4.).
Proof of 3. ⇒ 1. : any final L-draft D is
- Functional : ∃f∈MorL,V(⤓D, D), f
⚬ ≃⃗D ∈ MorL,V(D, D)
∴ f ⚬ ≃⃗D = IdD ∴ Inj ≃⃗D
- Serial : MorL,V(D∪(LD\Dom D), D) ≠∅
- VD = V because V is a draft.
Let us prove 4. ⇒ 2. after reduction to the ground case by replacing variables by
constants (replacing L by L∪V).
Any initial L-algebra (E,φE)
is :
- minimal : ∃f∈MorL(E, MinLE),
f∈MorL(E,E) ∴ f = IdE ∴
MinLE = E
- injective : (F = LE ∧ φF =
LφE) ⇒
φE ⚬ LφE =
φE ⚬ φF
⇔
φE∈MorL(F,E)
∴
∃f∈MorL(E,F),
φE⚬f = IdE
∴ f⚬φE =
φF⚬Lf =
L(φE⚬f) = IdF.∎
Term algebras in injective algebras
For any injective L-algebra (E,φE), any
V ⊂ E \ Im φE, generates a term algebra
〈V〉L ⊂ E. In particular the existence of an injective
algebra implies that of a ground term algebra.
For any f∈MorL(K,E) from a
ground draft K to an injective system E, the system
Im f is both injective and minimal, i.e. a ground draft. As with any morphism,
if K is serial then Im f is serial;
if E is functional then Im f is functional. Thus, if
K is a ground term algebra and E is an injective partial algebra
then Im f is a ground term algebra, and thus f is injective (by essential uniqueness
of initial or final objects).
Precisely to distinguish statements and their respective ranges of validity,
- If K is a serial minimal system and E is an injective partial algebra
then Im f is a ground term algebra;
- If K is a functional ground draft and E is injective then f is injective
(because ∼f ⊂ f*(≃Im f) =
≃K = 𝛿K).
Some more complicated statements can be obtained in non-ground cases by applying the above
to the language L∪V.
Particular languages
If L has no constant then ∅ is a ground term L-algebra.
If L only has constants, then ground term L-algebras are the
copies of L.
If L only has symbols with arity 0 or 1 then every L-term is functional (this may not be
obvious ; we shall not use it later).
Trajectories by transformation sets
For any set of transformations L ⊂ EE, seen as a set of function
symbols interpreted in E, ∀x∈E,
〈{x}〉L = {f(x)|f∈〈L〉{Id,⚬}}
∀X⊂E, 〈X〉L =
⋃f∈〈L〉{Id,⚬} f[X]
= ⋃x∈X 〈{x}〉L
Denoting A = 〈X〉L,
M = 〈L〉{Id,⚬} and K
= {f(x) | (f,x)∈M×X},
the claim is that A = K. As we shall formalize later, both A and K
mean the set of all composites of any number of functions in L, applied to any
x∈X.
Proof of A ⊂ K
IdE∈M ∴ X⊂K
∀g∈L, ∀y∈K, ∃(f,x)∈M×X,
y = f(x) ∧ g⚬f∈M ∴ g(y) =
g⚬f(x)∈K
K
∈ SubL E
Proof of K ⊂ A
L ⊂ {f∈EE| A ∈
Sub{f}E} = End{A}E
∈ Sub{Id,⚬}EE
M ⊂ End{A}E
∀f∈M,
X ⊂ A ∈ Sub{f}E
∴ f[X] ⊂ A. ∎
Free monoids
A free monoid is a monoid with a basis in the category of monoids. Then its basis is
generating (for the same reason as the above case of categories of algebras).
Any unary term L-algebra M (if it exists) is an egg in the category of
L-algebras, thus a monoid acting on all L-algebras with
MorL ⊂ MorM.
If L is a language of function symbols and M is a unary term
L-algebra then L is a basis of M in the
category of monoids.
Proof:
Let us confuse every s∈L with its copy sM(e)
∈ M, so that in any L-algebra E, ∀x∈E,
s⋅x = sE(x).
So L ⊂ M, thus MorL = MorM.
By the above result, 〈L〉{Id,•} = {s•e |
s∈〈L〉{Id,•}} = 〈{e}〉L = M.
By the representation theorem,
any monoid N can be represented as a transformation monoid N ⊂
XX (by taking X = N and reading composition as left action).
Any u∈NL makes X an L-algebra, thus
defines an action f ∈ Mor{e,•}(M,XX)
∧ f|L = u.
Such an f is unique
because 〈L〉{Id,•} = M.
Finally Im f = f[〈L〉{Id,•}] ⊂ 〈Im u〉{Id,⚬}
⊂ N.∎
Conversely, if L is a basis of a monoid M in the category of monoids, then
M is a unary term L-algebra.
Proof: Any L-algebra structure on any set E is uniquely extensible as
an M-set structure. Thus MorM ⊂ MorL, and
(M, e) is an egg in the category with M-morphisms on the class of
L-algebras. Thus
∀x∈E, hx ∈
MorM(M,X) ⊂ MorL(M,X) ∧
hx(e) = x.
To conclude that (M, e) is also an egg in the category of L-algebras, we
deduce uniqueness from
〈{e}〉L = 〈L〉{Id,•} = M.
∎
4.4. Integers and recursion
The set ℕ
An arithmetic is any theory describing the system ℕ of natural numbers.
There are diverse such theories, depending on the choice of a logical framework,
then of a language and axioms. Somehow, the main framework for arithmetic
is second-order logic, but this leads to other versions depending on how second-order
logic itself comes down to other frameworks.
Let us start with the set theoretical arithmetic, most precise arithmetic expressed in
the framework of a strong enough set theory, interpreting second-order quantifiers
by the powerset symbol. This defines "the standard ℕ", or rather its isomorphism class relative to the
given universe (it amounts
to defining the class of "finite sets", which are measurable by their number of elements,
as we shall define in 4.6).
Definition. The set ℕ of natural numbers is a unary term {S}-algebra
where S is a function symbol called the successor ; also expressible as a ground
term {0,S}-algebra where 0 (zero) is a constant symbol.
The existence of such a system is our first expression of the axiom of infinity. Let us formalize
it by inserting arithmetic
into set theory, in the form of 3 constant symbols ℕ, 0, S with the following axioms :
|
0 ∈ ℕ ∧ S : ℕ → ℕ |
: ℕ is a {0,S}-algebra |
(H0) |
∀n∈ℕ, Sn ≠ 0 |
: 0 ∉ Im S |
(Inj) |
∀n,p∈ℕ, Sn = Sp ⇒ n = p |
: S is injective |
(Ind) |
∀A⊂ℕ, (0∈A ∧ ∀n∈A, Sn∈A) ⇒ A = ℕ |
: 0 generates ℕ (induction axiom). |
More constant symbols can be defined from there:
1 = S0
2 = S1 = SS0
and so on.
This completes our axioms of set theory up to the strength
of Mc Lane set theory,
which essentially forms (with the axiom of choice)
the most commonly used foundational theory for mathematics.
It will imply the existence
of term algebras with any language.
This definition of ℕ may be criticized as circular, for involving concepts of algebra
we initially introduced using ℕ as set of possible arities of symbols. But
these concepts may be re-written up to this point without this use, by restriction
to the use of symbols with small arities (0, 1, 2).
Recursively defined sequences
Each n∈ℕ represents a function
symbol Sn defined by a unary S-term and interpreted
in each S-algebra (E,f) as a function f n ∈
EE.
A sequence of elements of a set E, is a family u : ℕ → E.
It is called recursive when it is a morphism for a given
S-algebra structure on E, i.e. defined by choices of
a∈E and f∈EE :
u ∈ Mor{0,S}(ℕ, (E,(a,f)))
⇔ (∀n∈ℕ, un = f n(a))
⇔
(u0 = a ∧ ∀n∈ℕ, uSn =
f(un)).
Intuitively, each f n(a) abbreviates
the term with n occurrences of the function symbol f, and variable a :
f 0(a) = a
f 1(a) = f(a)
f 2(a) = f(f(a)) |
The sequence (f n) is itself recursively defined by
f 0 = IdE ∧
∀n∈ℕ, f Sn = f ⚬ f n.
In particular, f 1 = f and f 2 = f⚬f.
Generally for any f∈EE,
g∈EX, the recursive sequence h : ℕ → EX
defined by h0 = g ∧
∀n∈ℕ, hSn = f⚬hn,
is hn = f n⚬g.
Addition
The basic properties of addition and multiplication in ℕ can be seen as
obvious consequences of the role of natural numbers as measures ("cardinals") of finite sets,
which will be defined in 4.6. But let
us prove them independently here as an exercise.
Like any unary term algebra,
ℕ forms a monoid (ℕ, 0, +), whose actions are the sequences (f n) for
any transformation f.
It is commutative as it is generated
by the singleton {1} (which commutes with itself). Thus the side won't
matter, but conventions basically present it as acting on the right, defining addition as
n+p = Sp(n), or recursively as
n + 0 = n
∀p∈ℕ, n+S(p) = S(n+p)
n+1 = S(n+0) = Sn
The action axiom ∀f∈ EE, ∀n,p∈ℕ,
f n+p = f p⚬f
n is deducible like for any term algebra:
(f n+0 = f n ∧ ∀p∈ℕ,
f n+Sp = f S(n+p)
= f⚬f n+p)
∴
∀a∈E, (p ↦ f
n+p(a)) ∈ Mor(ℕ,(E, f n(a),f)).
The associativity of + (coming as a property of monoids) will be used implicitly, omitting parenthesis.
Inversed recursion and integers
Let -ℕ = {-n | n∈ℕ} (where -n is read "minus n") a copy of ℕ, structured
as a partial {0,S}-algebra where 0-ℕ = -0 and S-ℕ is the
copy of Sℕ-1. As a relational system, it works like ℕ with S
transposed. Thus, it has a unique {0,S}-morphism to any {0,S}-system E
where 0E is a singleton and SE is both
injective and surjective. In particular this holds for {0,S}-algebras (E,(a,f))
where f = SE ∈ 𝔖E, forming the
negative recursion (f n(a))n∈-ℕ
∈ Mor{0,S}(-ℕ, (E,(a,f)))
∀n∈ℕ, f -n = (f -1)n
giving back the notation f -1 for the inverse of f as a particular case.
Let us define the system of integers as ℤ = ℕ ∪ -ℕ assuming ℕ ⋂ -ℕ = {0} = {-0}, with
the union of structures from ℕ and -ℕ. It is easy to see that Sℤ ∈
𝔖ℤ, and that ℤ is a {0,S}-algebra and the coproduct of ℕ and -ℕ in the
category of partial {0,S}-algebras.
Thus, (ℤ,0) is an egg in the category of {S}-algebras (E,f)
where f ∈ 𝔖E. This defines a monoid structure (0,+) on ℤ,
extending the one on ℕ, and acting on each such (E,f) by the unique
(f n)n∈ℤ ∈ Mor{0,S}(ℤ,
(EE, IdE, ⚬⃗(f)). It is also the unique
morphism of monoid from ℤ to EE sending 1 to f.
Moreover,
- +ℤ is commutative.
- ∀n∈ℕ, (-n)+n = 0 = n+(-n), thus
∀f ∈ 𝔖E, {f n | n∈ℤ}
⊂ 𝔖E.
Proof of 1. : ℤ is generated by {-1, 1}, while (-1) + 1 = 1 + (-1) = 0 because f ∈ 𝔖E
⇒ f -1 ⚬ f = f ⚬ f -1 = IdE.
Proof of 2. : from ∀n∈ℕ, n+1 = 1+n comes
∀f∈EE, f Sn =
f⚬f n = f n⚬f
∀f,g∈
EE, g⚬f = IdE
⇒ ∀n∈ℕ, gn⚬f n = IdE
∀f∈𝔖E, ∀n∈ℕ, f n∈𝔖E
∧ (f n)-1 = f -n
Each result can also be deduced from the other:
1.⇒2. : by symmetry between ℕ and -ℕ, (-n)+n ∈ ℕ ⇔
n+(-n) ∈ -ℕ.
2.⇒1. from the commutativity of +ℕ and a
previous result.
Following the French terminology, the subsets ℕ and -ℕ of ℤ will be respectively called the sets of
positive and negative integers.
Multiplication
The following definition and properties of multiplication in ℕ also hold in ℤ (where it is an
extension of the one in ℕ).
Let us define it as n⋅p = (Sp)n(0)
(this choice of sides fits common language, unlike the usual one from the literature,
until the proof of commutativity removes the distinction). Recursively,
∀x∈ℕ, 0⋅x = 0
∀x,y∈ℕ, (Sy)⋅x = (y⋅x)+x
the latter coming as (Sx)Sy(0) =
Sx((Sx)y(0)).
Then generally, ∀f∈EE, f n⋅p
= (f p)n. Proof :
(n⋅p)n∈ℕ ∈ Mor ((ℕ,S),(ℕ,Sp))
(fm)m∈ℕ ∈ Mor ((ℕ,Sp),(EE,f p)) by properties of actions.
(f n⋅p)n∈ℕ ∈ Mor((ℕ,S),(EE,f p))
(f 0⋅p)n∈ℕ = IdE
(f n⋅p)n∈ℕ fits the definition of
((f p)n)n∈ℕ
∎
∀n∈ℕ, n⋅0 = 0 (easy to check by induction).
Right distributivity ∀x,y,z∈ℕ, (x+y)⋅z =
x⋅z + y⋅z comes by induction on y, or as
(Sz)x+y =
(Sz)y⚬(Sz)x.
Left distributivity ∀x,y,z∈ℕ, x⋅(y+z) =
x⋅y + x⋅z comes by induction on x using commutativity of +.
In particular ∀x,y∈ℕ, x⋅Sy = (x⋅y)+x. As
the transpose of ⋅ fits the recursive definition of ⋅, both are equal :
multiplication is commutative.An elegant approach to multiplication is to see (ℕ, 1, ⋅) as the monoid
End{0, +} ℕ. It will be developed later.
4.5. Presburger Arithmetic
First-order theories of arithmetic
After the formalization of ℕ in set theory,
let us review theories of first-order arithmetic, which means first-order theories
describing ℕ as their only type, and interpreting induction as a schema of axioms by second-order universal elimination
(thus with A only ranging over classes
of that theory). First-order theories with 2 types ℕ and ℘(ℕ), called
second-order arithmetic,
will be studied in Part 5.
Unlike the set theoretical framework, first-order logic cannot define recursive sequences for
arithmetic from the mere successor function. In particular it cannot proceed the recursive
definitions of addition and multiplication, as will be clear from the description of non-standard
models (4.9). So, different choices of language give
non-equivalent versions
of first-order arithmetic, with non-equivalent versions of the schema of induction:
-
Bare arithmetic, with the mere symbols 0 and S, is very poor.
- Presburger
arithmetic, detailed below, is arithmetic with addition. It still cannot define multiplication.
- Full arithmetic, named Z1,
admits symbols 0,S, + and ⋅ , and
all axioms of their definitions. By tedious tricks it
can define all recursive sequences, and generally the sequence defined by any condition of the form
un+1 = f(un,n) expressed in its language.
Presburger arithmetic
Its usual presentations have
primitive symbols 0, +, and either 1 or S which are definable from each other (1 =
S0 and Sx =
x+1) ; actually + would suffice to define all.
Among its equivalent
formalizations, a very short one which superficially looks weakest, describes the set ℕ* of
nonzero natural numbers, with symbols 1 (constant) and + (binary operation), and axioms :
∀x,y∈ℕ*,
x + (y+1) = (x+y) + 1 |
(A1) : + is associative on 1 |
∀A⊂ℕ*, (1∈A ∧ ∀x,y∈A,
x+y∈A) ⇒ A=ℕ* |
(Min) |
∀x,y∈ℕ*, x + y
≠ y |
(F) |
The axiom (Min), expressed for simplicity in set theory where it makes 1 a generator of (ℕ*,+),
is then reduced to a first-order axiom schema by second-order universal elimination. It will be
seen equivalent to the axiom of induction, both set theoretically and as axiom schemas.
(A1) is a particular case of the associativity
axiom (As).
Conversely, (A1 ∧ Min) ⇒ (As) :
Let A = {a∈ℕ* | ∀x,y ∈ℕ*,
x+(y+a) = (x+y)+a}. ∀a,b∈A,
∀x,y∈ℕ*, x + (y+(a+b))
= x + ((y+a)+b) = (x + (y+a))+b
= ((x + y)+a)+b = (x+y)+(a+b)
∴ a+b ∈ A.
(A1) ⇔ 1∈A.
(A1 ∧ Min) ⇒ A=ℕ* ∎
(As ∧ Min) ⇒ commutativity, as deduced from 1∈C({1}).
Now take ℕ = ℕ*∪{0} where 0∉ℕ*, to which + is extended as ∀n∈ℕ,
0+n = n+0 = n. This extension is still commutative and associative.
With Sℕ defined as x↦ x+1, obviously
(H0) holds. As x+0 = x ≠ 0,
(F) ⇔ (∀x∈ℕ*, ∀y∈ℕ, x+y ≠ y)
(Inj ∧ Ind ∧ A1) ⇒ (F) because ∀x∈ℕ*,
{y∈ℕ | x+y ≠ y} ∈ Sub{0,S}ℕ .
The proof of the converse (F ⇒ Inj) using other axioms, will come from the study of
the order relation.
Equivalence of axiom schemas
Let (Ind1) be the statement ∀A⊂ℕ*,
(1∈A ∧ (∀x∈A, x+1 ∈A)) ⇒ A=ℕ*.
(Ind1) ⇒ (Ind) and (Ind1) ⇒ (Min) are obvious
(Ind) ⇒ (Ind1) either by using A∪{0},
or {x∈ℕ|x+1∈A} once noted that (Ind) ⇒ Im S = ℕ*.
Proof of (A1 ∧ Min) ⇒ (Ind) in set theory (ignoring our previous definition of ℕ):
∀x∈〈0〉S, +⃗(x) ∈ EndS ℕ
∴ {x+y | y ∈ 〈0〉S} = 〈x〉S
⊂ 〈0〉S
{0,1} ⊂ 〈0〉S ∴
〈0〉S = ℕ ∎
Proof of (As ∧ Min) ⇒ (Ind) directly convertible to first-order logic:
Let A∈Sub{0,S}ℕ, and B
= {y∈ℕ* |∀x∈A, x+y ∈ A}.
∀y,z∈B, (∀x∈A, x+y ∈ A ∴
x+y+z ∈ A) ∴ y+z ∈ B.
(∀x∈A, x+1 ∈ A) ⇔ 1∈B ⇒ ((Min) ⇒
B=ℕ*).
0∈A ⇒ (∀y∈B, 0+y ∈ A) ⇔ B⊂A.∎
So, all possible axiom pairs are equivalent:
(A1 ∧ Min) ⇔ (As ∧ Min) ⇔ (As ∧ Ind) ⇔ (A1 ∧ Ind), and imply commutativity.
The order relation
In Presburger arithmetic,
let us define binary relations ≤ and < as x<y ⇔ ∃z∈ℕ*, y =
x+z
x≤y ⇔ ∃z∈ℕ, y = x+z
(A1 ∧ Ind) implies
- < is transitive
- ∀x,y∈ℕ, x≤y ⇔ (x<y ∨ x=y)
- ∀x∈ℕ*,∃y∈ℕ, x = y + 1
(ℕ is a surjective
{0,S}-algebra)
- ∀x,y∈ℕ, x<y ⇔ x+1≤y
- ∀A⊂ℕ, A≠∅ ⇒ ∃x∈A, ∀y∈A,
x≤y (meaning a schema of formulas)
- ∀x,y∈ℕ, x≤y ∨ y≤x
- ∀x,y,z∈ℕ, x<y ⇒ x+z < y+z
Proofs :
- using (As), x < y < z
⇒ (∃n,p∈ℕ*, z = y+p = x+n+p)
⇒ x < z.
- obvious from definitions;
- by (Ind)
- from 3.
- x≤y ⇒ (x+1≤y ∨ x=y)
0∈{x∈ℕ |∀y∈A, x≤y}=B
∀x∈B, x+1∈B ∨ x∈A
A⋂B = ∅ ⇒ (B = ℕ ∴ A = A⋂B = ∅)
- from 5. with A={x,y}. Or using 3.,
A = {x∈ℕ
| ∀y∈ℕ, x<y ∨ x=y ∨ y<x}
⇒ (0∈A ∧ ∀x∈A, x+1 ∈ A)
- y = x+n ⇒ y+z = x+z+n
∎
(A1 ∧ Ind ∧ F) implies- < is a strict total order, associated
with the total order ≤
- ∀x,y,z∈ℕ, x<y ⇔ x+z
< y+z
- + is cancellative, which gives (Inj) as a particular case.
Proof. (F) gives irreflexivity ; other conditions for a strict total order come from 1. and 6.
There must be one true formula among (x<y),
(x
= y), (y<x), which by 7. respectively imply
(x+z<y+z),
(x+z = y+z), (y+z<x+z).
But only one of the latter can be true, thus each implication must be an equivalence.
Cancellativity on one side extends to the other side by commutativity.
∎The smallest element of any nonempty A ⊂ ℕ is denoted min A.
In the set theoretical ℕ, this order coincides with the order
between terms with either language {S} or {0,S}. (The proof is easy).
Also, for each n ∈ℕ, the set <⃗(n) ⊂ ℕ can be seen as recursively defined by
¬(n<0) ∧ ∀x∈ℕ, n < Sx ⇔ (x = n ∨ n < x)
which can also be written
∀x∈ℕ, x ∈ ≤⃗(n) ⇔ (x = n
∨ ∃y∈ ≤⃗(n), Sy=x).
(Its existence in ℘(ℕ) can be deduced by induction on n,
more simply than other recursions)
Parity
Presburger arithmetic can qualify a number n∈ℕ as even by
∃x∈ℕ, n = x+x; and odd
by ∃x∈ℕ, n=x+x+1.
Obviously, if n is even then n+1 is odd; if n is odd then n+1 is
even thanks to commutativity. Then it can be shown by (Ind) that any n must be either even or odd, and
by also using (Inj) that it cannot be both and that this x is unique.
Taken as primitive with a relevant set of axioms, the order relation can form a variant of first-order arithmetic
more expressive than bare arithmetic, but less than Presburger arithmetic, as it suffices to define S but
cannot suffice to define +
(nor parity) and cannot be defined from S in first-order logic.
Images of recursive sequences
In set theory with axiom of infinity, the image of a recursive sequence u = (f
n(a))n∈ℕ by a transformation f∈XX
is a trajectory Y = 〈a〉f = Im u of an action of (ℕ,0,+) on E.
As + is commutative, Y gets a
monoid structure with identity a and generator f(a), that is a quotient of ℕ.
The injectivity of u (that would make it an isomorphism from ℕ to Y) is equivalent to
the injectivity (H0) ∧ (Inj) of the {0,S}-algebra structure of Y.
If u is not injective, let y = min{z∈ℕ | ∃x<z,
ux = uz}. Then the restriction of u to <⃖(y)
is bijective to Y (its image is Y because it is {0,S}-stable). Then there is a unique
x<y such that ux = uy, and this data of
x and y determine the isomorphism class of Y.
A cycle of a transformation f∈XX, is a non-free trajectory Y
= 〈a〉f for some a∈X, such that,
equivalently (using the above defined x and y),
- x = 0
- Inj f|Y
- f|Y ∈ 𝔖Y
- f y(a) = a
- a∈ Im f|Y
- (f|Y)y = IdY
- ∀b∈Y, Y = 〈b〉f
Then this y, independent of the chosen generator a of Y, is called the period
of Y.
If f is a permutation then every cycle of f is also a cycle of f -1
with the same period.
4.6. Finiteness and countability
To review equivalent forms of the axiom
of infinity, we need to work without assuming it.
Cardinals
A set
A is said to be smaller (or of smaller size or of smaller cardinality) than a set B,
if there exists an injection from A to B. This predicate is a preorder on the class of sets
(as injections form a category on the class of sets).
By the Cantor-Bernstein theorem, the equivalence predicate it
defines on the class of sets coincides with the predicate ≈ of equinumerosity between sets, defined by the
existence of a bijection : A ≈ B ⇔ ∃f∈BA, f : A ↔ B.
Equinumerous sets are said to "have the same cardinal". The cardinal |A| of any set A,
intuitively meant as the equivalence class of A by ≈ in the class of all sets, can be formalized for the modest use
we shall make of that concept, as A itself where any direct or indirect use of "equality" between cardinals
(never mixed with other elements) actually refers to the use of ≈. This practice is a particular case of development
of a theory (here, set theory), as will be detailed in 4.11.
The {0,S}-structure on powersets
Let us give to the class of sets the meta-structure with language Σ = {0,S} first
introduced to define ℕ, defined by 0 = {∅}
S(A,B) ⇔ ∃x∈B, A = B\{x}.
By restriction, this gives a Σ-structure to the powerset ℘(E) of any set E.
The quotient Σ-system
KE = ℘(E)/≈ is functional, surjective and injective, and Dom SKE ∪ {|E|} = KE.
The only unobvious aspect is the injectivity of SKE. It can be checked (simpler than done
elsewhere) by
∀A,B,A',B' ⊂ E, ∀x∈B, ∀x'∈B',
(A = B\{x} ∧ A' = B'\{x'} ∧ f : B ↔ B') ⇒
(A ∋ y ↦ (f(y) = x' ? f(x) : f(y))) : A ↔ A'
because f -1(x') ∈ A ⇔ f(x) ≠ x' ⇔ f(x)∈ A'.
Another presentation of the proof starts with
∀A,B,B' ⊂E, ∀x∈B,
(A = B\{x} ∧ f : B ↔ B' ) ⇒
f|A : A ↔ B'\{f(x)}
then deduces ∀B⊂E, ∀x,y∈B, B\{x} ≈ B\{y} using
the transposition (x y)B if x≠y.∎
If F⊂E then the natural injection from KF to KE is a Σ-embedding
(like any injective morphism from a surjective system to an injective one). Let us abusively treat these natural injections as inclusions
(KF ⊂ KE), which means to let these sets play the role of sets of cardinals. For any family
of considered sets, seeing them as subsets of their union lets us proceed the study of their cardinals by this tool.
Finite cardinals
For any set E, the set MinΣ KE
is called the set of finite cardinals from KE.
This predicate of finiteness for cardinals in KE is independent of E. Proof:
For any sets F⊂E we have KF ⊂ KE thus
MinΣ KF ⊂ KF ⋂ MinΣ KE.
Conversely, KF ⋂ MinΣ KE is surjective (as both KF
and MinΣ KE are surjective subsets of the injective KE),
thus a sub-draft of the ground Σ-draft MinΣ KE, and thus itself a ground Σ-draft :
KF ⋂ MinΣ KE ⊂ MinΣ KF.∎
Any infinite cardinal is greater than any finite one. Proof:
If E is infinite then MinΣ KE ⊂
Dom SKE
thus MinΣ KE is a ground term Σ-algebra, i.e. a copy of ℕ, thus contains all finite cardinals.∎
Axiom of infinity
In set theory with the powerset, the
following statements are equivalent expressions of the axiom of infinity:
- There exists an injective algebra whose language has at least a constant and a
non-constant.
- There exists a non-surjective injection of a set to itself; equivalently an
injective {0,S}-algebra, or a set ℕ.
- There exists an infinite set.
- For any countable algebraic language there is a (countable)
injective algebra
- Any consistent first-order theory with a countable language
has a (countable) model
4. ⇒ 2. ⇒ 1.; for 1. ⇒ 2. : from a non-constant operation s there,
x↦s(k↦x)
is an injective, non-surjective transformation.
We shall prove 2. ⇔ 3. below (finite cardinalities), and 2. ⇒ 4. ⇒ 5. in the section on the
completeness theorem.
We may regard 5. ⇒ 4. as relative to the natural assumption that the concept of injective
algebra forms a consistent theory. This theory is essentially the part of set theory which describes
tuples. So, if a language L is
included in a model of that simple part of set theory, then there
exists an injective L-algebra. If such a model U is standard, this takes the form
LU ⊂ U, making (U, IdLU)
an injective L-algebra. ∎
The infinity of ℕ easily implies the uniqueness of the
number of elements of a finite set, as its non-uniqueness would give that set a strict injection
into itself, and thus an injection of ℕ into it.
2. ⇒ 3. will come from the infinity of ℕ.
For 3. ⇒ 2. we shall pick a different set, as the existence of an injective
{0,S}-algebra structure on any infinite set would need the axiom of choice.
More properties of finite sets
Between finite cardinals, the order as cardinals coincides with the natural order in ℕ.
If E is finite, e∈ Min KE, and KE is a ground term with
root e representing the number of elements of E
The set of finite subsets of E can be equivalently defined as MinΣ ℘(E).
A set is finite if it is a finite subset of itself (otherwise it is infinite).
For any sets E⊂F, (E is finite) ⇔ (E is a finite subset of F).
Equivalently, a set E is finite if it has a bijection to the graph of S in
a ground Σ-term n (which can be seen as a unary {S}-term). This may
as well be seen as a bijection to its domain {0,...,n-1} or to its image {1,...,n}.
Assuming the axiom of infinity, the finiteness of a set E is equivalent to the existence of a
number n∈ℕ and a bijection from
Vn = {x∈ℕ | x<n} to
E; or a surjection from Vn to
E, which is then bijective precisely when n takes its smallest value
(the number of elements of E).
Without ℕ, either ℘(℘(E)) or (with more
work) EE can be used to express the
finiteness of a set E, and to rebuild ℕ from E
if E is infinite.
Any left cancellative element of a finite monoid is invertible.
Thus any finite cancellative monoid is a group.
Proof: it acts as an injective transformation of
a finite set, thus a permutation, by which the preimage of e is a right inverse, thus an inverse.
Countable sets
It is easy to check that any condensed ground
Σ-draft is either ∅ or a ground Σ-term or a ground term
Σ-algebra (only two other languages have this property, apart from those with no constant).
A set is called countable if, equivalently,
- It has an injection to a condensed ground Σ-draft
- It has a bijection with a condensed ground Σ-draft
- A condensed ground Σ-draft has a surjection to it
1.⇒2. by restricting the order, once condensed ground {0,S}-drafts are described
from their order structure; another method is given below.
2.⇒(1.∧ 3.) ; 3.⇒1. without AC, by picking the smallest element of each preimage.
So, countable sets are either finite or in bijection with ℕ. The latter, on which there exists a
possible structure of ground term {0,S}-algebra, will be called countably infinite.
However we still need to prove that they are indeed infinite.
Countability of ℕ×ℕ
y\x
|
0
|
1
|
2
|
3
|
4
|
0
|
0
|
1
|
4
|
9
|
16
|
1
|
2
|
3
|
5
|
10
|
17
|
2
|
6
|
7
|
8
|
11
|
18
|
3
|
12
|
13
|
14
|
15
|
19
|
A natural bijection between ℕ×ℕ and ℕ is defined by
(x,y) ↦ (y < x ? x⋅x + y : y⋅y +
x + y).
y\x
|
0
|
1
|
2
|
3
| 4
|
0
|
0
|
1
|
3
|
6
| 10
|
1
|
2
|
4
|
7
| 11
|
2
|
5
|
8
| 12
|
3
|
9
| 13
|
Another bijection
b2 :
ℕ×ℕ → ℕ
is defined as
b2(
x,
y) =
Tx+y+
y
from the sequence (
Tn) of
triangular numbers
defined by
2⋅Tn = n⋅(n+1) (see
properties of parity) or equivalently by
T0=0 ∧ ∀
n∈ℕ,
Tn+1=
Tn+
n+1.
This makes ℕ×ℕ a ground term (0,
S)-algebra where
0=(0,0)
S(0,y)=(y+1,0)
S(x+1,y)=(x,y+1)
which recursively defines b2-1(z), where
x+y is the only n such that
Tn≤z<Tn+1.
Countability of finite sequences of integers
If there is a surjection f from ℕ to a set E then E is countable, as an inverse
injection can be defined by associating to each x∈E its smallest preimage by f.
Fixing a bijection b2 from ℕ2 to ℕ,
for any integer n>0 we can recursively define a
bijection bn:ℕn→ℕ, for
example:
b1=Idℕ
∀n>0,∀u∈ℕn,
bSn(u) =
b2(bn(u),un)
(which gives back b2 as the particular case of
bn for n=2).
Also the set ℕ(ℕ) = {u∈ ℕℕ |
∃k∈ ℕ, ∀n>k, un=0} is countable.
There are different ways to see it:
- A sequence of bijections hn : ℕ↔
En, gives a surjection ∐n∈ℕ
Ei ≈ ℕ×ℕ →
U=⋃n∈ℕ En
so that taking En=ℕn we find that
U=ℕ(ℕ) is countable.
- With the decomposition in prime numbers
- As the binary representation maps ℕ to the set 2(ℕ) of finite
subsets of ℕ, it also maps ℕ(ℕ)
to the set 2(ℕ×ℕ) of finite subsets of ℕ×ℕ.
But this construction of ℕ(ℕ)) and similarly simple candidate
expressions of H as a countable set, assumes recursion, thus cannot be
used to define recursion by the above method. To actually provide a definition of
recursion that does not assume it, requires a different expression of an enumerated
H. The representation of sequences which was used by Godel
as a basis to define recursion, actually forms a different set.
4.7. The Completeness Theorem
Existence of countable term algebras
For any countable algebraic language L with at least a constant and a
non-constant, finding a countable ground term L-algebra (necessarily infinite)
amonts to defining a ground term L-algebra structure on
ℕ, which is a bijection from Lℕ to ℕ.
Splitting L as C∪D where C is the set of
constants, and D is the rest of symbols, we have Lℕ
= C∪(Dℕ). But C and D
are countable (either finite or infinite), and Dℕ is a union
over D of disjoint sets with explicitly definable bijections with ℕ
(such as the bns we saw).
In any case, a bijection between C∪(Dℕ)
and ℕ can be found without problem.
In practice, such bijections from Lℕ to ℕ
happen to be ground term L-algebra structures because
∀(s,x)∈Dℕ,
∀i<ns,
xi<s(x) which would be contradicted
by the smallest element outside MinL ℕ.
∎
Interpretation of first-order formulas
Trying to extend the formal construction of
the interpretation of terms in algebras, to the case
of formulas interpreted in systems, the difficulty is to cope with the
interpretation of quantifiers (or generally binders, if we wish to still generalize).
A possibility, is to switch to a view over all variables as bound throughout the formula:
from the concept of interpretation hv∈ET of a term T
in an algebra E for every interpretation v of its set V of variables, the
family (hv)v∈EV
is re-curried into a function from T to EEV.
So, a first-order formula interpreted in a system E can be understood as a term
interpreted in an algebra with maybe two base types : the sets OpE and
RelE of all operations and all relations in E; or split as two sequences
of types, OpE(n) and RelE(n).
We might not need the full sets of these, but at least, an algebra of these (a subset stable
by all needed logical operations).
We took unions over ℕ for the case we would need to see "all possible formulas" as terms
interpreted in one same algebra.
The Completeness Theorem
Let us finally prove the Completeness theorem previously
commented in 1.1, 1.9, 1.10, 1.B and 1.C.
Theorem. First-order logic has a proving system both sound and complete
in the following equivalent senses
- Any consistent
first-order theory T with countable language
has a countable model.
- Any formula true in all countable models of such a theory is
deducible from its axioms.
Sketch of proof of the first statement (implicitly suggesting a possible form of a proving
system, but ignoring efficiency issues; the axiom of choice is not used and is anyway out of topic):
Reduce T to a single-type theory T1 simulating the use of types
by relevant unary predicate symbols and axioms, but without requiring all objects to belong to some type.
Thus, the axiom (∃x, 1) can be added to T1 without harm (it still
allows types to be empty).
Then re-write axioms of T1 in prenex
form, that is chains of
quantifiers applied to quantifier-free formulas (using equivalences
from 2.5 and 2.6 simplified for a nonempty model).
Interpret the symbol = in axioms as an ordinary predicate
symbol, with the axioms of
equality [1.11].
Replace each occurrence of ∃ in an axiom by the use of an additional
operator symbol K, for example
(∀x,x', ∃y, ∀z,
A(x,x',y,z))
⇢ (∀x,x',∀z,
A(x,x',K(x,x'), z)).
Let M be the ground term algebra over the language of
operator symbols enriched as just described. Reinterpreting all ∀
in axioms as the use of axiom schemas (one axiom for each replacement of variables by a
tuple of elements of M), gives a propositional
theory T2 (its axioms are composed of logical
connectives over Boolean variables), whose set of Boolean
variables is PM where P is the set of
predicate symbols in the theory plus the "equality" symbol.
Observe that T2 is still consistent, as any contradiction
in T2 (= finite set of axioms not satisfied by any tuple
of Boolean values of their variables) would provide a
contradiction in T1, as follows:
From all subterms occurring in used axioms of T2, list without repetition all those
whose root is a symbol S which was added to the
language when converting some axiom
∀x,∃y,∀z,∃u,
A(x,y,z,u)
of T1, into the axiom ∀x,z, A(x,K(x),z,
S(x,z)) in T2. Successively replace them all by variables
in T1: for terms
S(t0,t1)
and S(t0,t2)
(where t0, t1, t2
∈ M), the reasoning in T1 will say
"Let y such that ∀z,∃u,
A(t0,y,z,u);
let u1 such that
A(t0,y,t1,u1);
let u2 such that
A(t0,y,t2,u2);"
and use the contradiction in T2 where y replaces
K(t0), u1 replaces
S(t0,t1) and u2
replaces S(t0,t2).
|
Recursively by a chosen enumeration of PM as a countable set,
add one by one to the axioms, each of these Boolean variables
if it is consistent with previously accepted axioms, so that all
get values without contradiction.
Then the quotient
of M by the equivalence relation of "equality", forms a model (with the "true
equality"). ∎
As this construction depends on an arbitrary order between formulas, different
choices give different models, which may be non-isomorphic and even have different
properties reflecting the undecidability of the theory's formulas.
Deduction of the second statement from the first :
T⊢ A
⇔ T ∪ {¬A} is inconsistent
⇔ T ∪ {¬A} has no countable model
⇔ A is true in all countable models of T. ∎
Skolem's Paradox
Comparing the results of the completeness theorem with Cantor's theorem gives this paradox :
If set theory is consistent then it has countable models, though
it sees some sets there, such as ℘(ℕ), as uncountable.
- A countable model U interprets the powerset "℘(E)" of an
infinite "set" E∈U as the "set" P of all objects in U which play
the role of subsets of E (but may differ from these when U is non-standard); as P
is meta-countable, it cannot exhaust the uncountable meta-powerset of "all" subsets of
E.
- As P is meta-countable, it externally has a bijection with ℕ or U;
but such bijections "do not exist" (have no representatives) as objects in U.
Rigorously speaking, when the powerset
axiom says that the class of subsets of E is a set written ℘(E), it
can only determine the interpretation of the functor ℘
relatively to (depending on) the universe where its open quantifiers are
interpreted: «all subsets
of E existing in this universe are in ℘(E)».
In the other interpretation
of what it means for a class to be a set, this would mean that the universe
already contains "really all" subsets of E, forming the supposedly
"true" set ℘(E) which will stay fixed in any further expansion
of the universe. However, for the powerset of an infinite set, this wish
cannot be expressed in first-order logic nor any other conceivable
mathematical formalism:
there is no way to talk about meta-sets that "cannot be found" in this universe
but would «exist elsewhere» (in any mysterious bigger universe) to exclude them
from there.
(Only one aspect of this idea can be formalized as the axiom schema
of specification).
4.8. More recursion tools
Rebuilding recursion
Let us work in a first-order theory just assumed to have 3 types ℕ, E and
H⊂Eℕ,
the language and axioms of arithmetic for ℕ with the schema of induction over expressible
subsets of ℕ, and the axiom that H provides "all finite sequences"
∀n∈ℕ, ∀u∈H, ∀y∈E, ∃v∈H,
vn=y ∧ ∀k<n, vk =
uk
The following existential result has a simple proof by induction making no use of uniqueness:
∀R⊂ℕ×E2, ∀n∈ℕ,
(∀i<n, ∀x∈E, ∃y∈E, R(i,x,y))
⇒ ∀a∈E,
∃u∈H, u0=a ∧ ∀i<n,
R(i,ui,ui+1).
Its simplicity only goes for this case, only generalizable to conditions of the form
R(n,(uk)k<n,un), while the case of algebraic terms is also true but more difficult to prove.
As a particular case, comes the finite choice theorem written as
∀R⊂ℕ×E, ∀n∈ℕ,
(∀i<n, ∃y∈E, R(i,y))
⇒ ∃u∈H, ∀i<n,
R(i,ui).
With f∈Eℕ×E the restriction of such u to
numbers ≤ n is also unique by
induction, from which the x∈Eℕ such that
x0=a ∧ ∀n∈ℕ, xn+1 =
f(n,xn), can be defined by its graph
{(n,un) | (n,u)∈ℕ×H,
u0=a ∧ ∀i<n,
ui+1 = f(i,ui)}
As H contains any finite sequence expressible in the theory, this definition
of recursion turns out to be "the recursion which the theory can express"
independently of the particular choice of H.
The construction of ℕ(ℕ) from the last section, and similarly
simple candidate expressions of H as a countable set, assumes recursion, thus
cannot be used to define recursion by the above method. To actually provide a definition of
recursion that does not assume it, requires an independent expression of an enumerated
H, which is harder but finally possible. Such a construction was achieved by Godel
as part of his work to prove the incompleteness theorem.
Another proof of recursion
By seeing morphisms as subalgebras, we can write another
construction of recursive sequences u∈Mor(0,S)(ℕ,(E,a,f)),
as follows.
Let M be the minimal subalgebra of ℕ×Ea,f,
and let A={n∈ℕ | ∃!x∈E, (n,x)∈M}.
As M is a minimal (0,S)-algebra, M = {(0,a)}∪
Im SM.
Substituting this into the definition of A we get
∀p∈ℕ, p∈A ⇔ (∃!y∈E,
(p=0 ∧ y=a)∨∃(n,x)∈M,
(p=Sn ∧ y=f(x))).
From 0 ∉ Im S we get 0∈A, and
∀n∈ℕ, Sn∈A ⇔ ∃!y∈E,
∃(n',x)∈M, (Sn=Sn' ∧ y=f(x)).
From the injectivity of S we get
∀n∈ℕ, Sn∈A ⇔ ∃!y∈E,
∃x∈E, (n,x)∈M∧ y=f(x).
Thus (∀n∈A, Sn∈A), so that A
= ℕ, i.e. M is the graph of a function u ∈ Eℕ.
As M ∈ Sub(ℕ×Ea,f), we
conclude u ∈ Mor(0,S)(ℕ,Ea,f).
Injectivity lemma
If E is a surjective partial L-algebra and
F is an injective L-system then ∀f ∈MorL(E,F),
- A = {x∈E | ∀y∈E, f(x) =
f(y) ⇒ x=y}
∈ SubLE.
- B = {y∈F | !: f •(y)} ∈ SubLF
Thus if more precisely E, F are algebras,
{y∈F | ∃!: f •(y)} ∈ SubLF.
While they are almost the same, let us write separate proofs.
The condition
f ∈MorL(E,F) is read Im f ⊂
Dom ψF ∧
Lf|Dom φE =
ψF⚬f⚬φE.
- ∀x∈LA⋂Dom φE,
∀y∈E, ∃z∈φE•(y),
f(φE(x)) = f(y) ⇒ Lf(x)
= ψF(f(φE(x))) = ψF(f(y))
= ψF(f(φE(z))) = Lf(z)
⇒ x = z ⇒ φE(x) = y.
- ∀y∈F*(LB),
ψF(y) ∈ LB ∴ (!z∈Dom
φE, ψF(y) = Lf(z)
= ψF(f(φE(z))))
∴
!x∈Im φE=E, y = f(x).∎
A more general form of recursion
Some useful sequences need recursive definitions where the term defining
uSn uses not only un but also n
itself. Somehow it would work all the same, but trying to directly adapt to this case
the proof we gave would require to define some special generalizations of previous
concepts, and specify their resulting properties. To simplify, let us proceed another
way, similar to the
argument in Halmos's Naive Set Theory, but generalized.
For any algebraic language L, let us introduce a general concept of "recursive condition"
for functions u : E → F, where, instead of a draft, E is first
assumed to be an L-algebra (then a ground term algebra to conclude).
The version we saw was formalized by giving the term in the recursive definition as an
L-algebra structure on F, φF: L⋆F → F,
then expressing the request for u to satisfy this condition as u∈Mor(E,F),
namely ∀(s,x)∈L⋆E,
u(sE(x)) = φF(s,u০x).
Let us generalize this as u(sE(x))
= φ(s,x,u০x) which by the canonical bijection
Dom φ ≡ ∐s∈L Ens
×Fns ≡ ∐s∈L
(E×F)ns =
L⋆(E×F) can be written using h : L⋆(E×F)
→ F such that ∀(s,x,y)∈ Dom φ,
φ(s,x,y) = h(s,x×y), asu(sE(x))
= h(s,x×(u০x)).
As ∀u∈FE, x×(u০x)
= (IdE×u)০x, this becomes
the second component of the formula IdE×u ∈ Mor(E, E×F)
when giving E×F the structure φE×F =
(φE০πL)×h.
The first component (φE০πL) we give to φE×F,
makes π∈ Mor(E×F, E) and makes tautological the first component
of the formula IdE×u
∈ Mor(E, E×F), namely
IdE(sE(x)) = φE(s,x)
= (φE০πL)(s,x×(u০x)).
It is then possible to conclude by re-using the previous result of existence of interpretations:
If E is a closed term L-algebra then
∃!f ∈ Mor(E, E×F), which is
of the form IdE×u because
π০f ∈ Mor(E, E) ∴ π০f = IdE.
But one can do without it, based on the following property of this L-algebra E×F:
∀u∈FE, IdE×u
∈ MorL(E, E×F) ⇔ Gr u
∈ SubL(E×F)
Indeed the defining formulas of both sides coincide. To see it otherwise,
- ⇒ is a case of image of an algebra by a
morphism, Gr u = Im (IdE×u).
-
For the converse, the inverse of the bijective morphism π|Gr u
∈ MorL(Gr u, E)
is a morphism IdE×u ∈ MorL(E, Gr u)
⊂ MorL(E, E×F).
This reduces the issue to the search of subalgebras of
E×F which are graphs of functions from E to F.
Now if E is a ground term L-algebra then M =
MinL(E×F) is one of them because
π|M∈ MorL(M, E)
from a surjective algebra to a ground term algebra
must be bijective.
Any other subalgebra of E×F must include M, thus to stay functional it must equal M. ∎
4.9. Non-standard models of Arithmetic
Standard and non-standard numbers
Since the induction axiom
needed to determine the isomorphism class of ℕ depends on
℘(ℕ), the relativity of ℘ allows a diversity of non-isomorphic
models of arithmetic respectively defined by the diverse models of any consistent
first-order theory T which contains arithmetic. Many theories may be
picked in the role of T, such as a set theory seen as a first-order theory, thus
involving other notions than numbers (to define more sets of numbers than
arithmetically defined classes). All we shall assume here is that it has a notion N
of "natural number" with a constant 0, a function symbol S, and axioms
making N a model of bare
arithmetic (in any model of T), thus a bijective {0,S}-algebra.
A number in N is called standard if it belongs to the
minimal
subalgebra N0 = Min{0,S}N,
isomorphic to ℕ by the unique {0,S}-morphism
(embedding,
n↦SNn(0N))
from ℕ to N. So, it represents an element of ℕ,
which may be called a meta-number.
A model N of arithmetic is called standard if it
is a minimal {0,S}-algebra, which can be equivalently written in 3 ways:
- It satisfies the full induction axiom (in terms of powerset);
- It is isomorphic to ℕ (also called the standard
model of arithmetic);
- All its elements are standard.
We shall abbreviate some works of
first-order logic by set theoretical notations as follows :
- Each meta-number n∈ℕ will be confused with the ground term
"Sn(0)" and its value as a standard number;
- The quantifier "∀n∈ℕ" at the root of a formula, means to declare a schema of one formula per value of
n in ℕ;
- For each n∈ℕ, the quantifier ∀x<n (resp.
∃x<n) can equivalently be read using x as an object (∀x,
x<n ⇒), or as a string of conjunctions (resp. disjunctions)
of n copies of the sub-formula with each value of x as a
meta-number lower than n.
Existence of non-standard models
The non-standardness of N is expressible by axioms on an additional
structure, forming an extended theory T'. One way is to add -
A constant symbol k with range N (which may need the
axiom k∈N);
- The schema of axioms n ≠ k for each
n∈ℕ, making the number k non-standard.
If T is consistent then T' is also consistent (and cannot prove more),
as any finite list of axioms of T' can be satisfied in any model of T
(by interpreting k as a different number). Thus,
T' also has
a model, which is a model of T where N is non-standard.
This means that no consistent extension of arithmetic can ever force its
model of arithmetic to be standard, i.e. require induction to apply to the full
powerset of N, which it cannot ensure to exhaust by its classes.
Candidate "standardness" predicates may be introduced but still cannot be
forced to coincide with the true standardness (N0),
which may stay a meta-set beyond all classes.
Any model of arithmetic obtained as described in the proof of the completeness
theorem from any consistent extension of arithmetic, will necessarily be non-standard,
as it cannot even be elementarily
equivalent to ℕ according to the truth undefinability theorem.
Skolem's paradox
still holds in two ways:
- the meta-countable set P interpreting "℘(ℕ)", whose elements serve as
subsets of N, still cannot exhaust ℘(N) which is also meta-uncountable
(on the meta level, bijections between sets N and ℕ induce bijections
between their powersets, which preserve uncountability);
- the image of P projected to ℘(ℕ) by taking preimages by the embedding
from ℕ to N,
does not fill ℘(ℕ) either.
Non-standard models behave much like standard ones, as they satisfy
all consequences of induction. Their precise range of diversity depends on
the chosen theory of arithmetic.
Non-standard models of bare arithmetic
The only "addition" definable from S, is the meta-operation adding any
n∈ℕ to any x∈N as
x + n = Sn(x), that is the meta-sequence of functions
(Sn)n∈ℕ
each defined in the theory by a unary S-term n.
The consequences of the induction schema in bare arithmetic can be summed up as
- The bijectivity of N as a {0,S}-algebra, which
implies the bijectivity of S on the set of non-standard numbers
- The schema of formulas
∀n∈ℕ*, ∀x∈N,
x + n ≠ x which is a weak version of (F).
As the restriction of S to the meta-set N\N0
of non-standard numbers of any model is a permutation, N\N0
is a ℤ-set.
The last formula obliges this ℤ-set to be free. Conversely, the disjoint
union of ℕ with any free ℤ-set, forms a model of bare arithmetic.
Applying both ways the definition of the order from the partial meta-operation
of addition, gives two ordering results with non-standard numbers:
- Each ℤ-orbit of non-standard numbers has its own total order.
- Each non-standard number is greater than each standard number, thus
may be called «infinitely big».
Models of arithmetic
with order, are formed as models of bare arithmetic with any
choice of a total order on the partition of non-standard numbers in ℤ-orbits.
Thus, bare arithmetic cannot suffice to define the order. But arithmetic with
order cannot suffice to define addition either, as its non-standard models
may either admit no or many corresponding interpretations of addition.
Non-standard models of Presburger Arithmetic
Such models satisfy all theorems
of Presburger Arithmetic. In particular they have a well-defined total order, by
which any non-standard number is greater than any standard number,
and any non-empty class of numbers has a smallest element
(which the meta-set of non-standard numbers hasn't).
Here are some more details only for the sake
of illustration, which may be skipped.
There is also a meta-operation (sequence of functions) of multiplication of an
x∈N by a standard number n∈ℕ : n⋅x =
x+...+x (with n occurrences of x)
Beyond commutativity, associativity and the seen properties of the order, the last independent
consequence of the axiom schema of induction constraining
non-standard models, is the possibility of Euclidean division by any nonzero standard number
(generalizing the results on parity) :
∀d∈ℕ*, ∀x∈N, ∃q∈N,
∃r<d,
x = d⋅q + r
thanks to ∀d∈ℕ*, ∀q∈N, d⋅(q+1) =
d⋅q+d which is a schema of theorems in Presburger arithmetic.
Moreover this (q,r) is unique; q = x:d∈N
is called the quotient and r is called the rest of the division of an
x∈N by a d∈ℕ*, so that
∀q∈N,
q=x:d ⇔ d⋅q ≤ x< d⋅(q+1) ⇔ ∃
r<d, x = d⋅q + r.
The concept of model of Preburger arithmetic generated by a set
(of non-standard numbers, since standard ones have no generating effect), is defined as
that of subalgebra generated by a subset
for the "algebra" (not exactly an algebra, but...) with language completed, like in the proof of
the completeness theorem, by additional operation symbols reflecting all existence properties
deduced from the axiom schema of induction:
- The subtraction of a number by a lower number (or the absolute value of subtraction),
which was implicit in the definition of the order and
the proof that it is total;
- The sequence of functions (x↦ x:d) of Euclidean division
by each d∈ℕ*.
They can be conceived either in the abstract (by evaluating relations arbitrarily,
like in the proof of the completeness theorem) or as a subset of a given model
(interpreting expressions there).
In particular, in any model of Preburger arithmetic generated by a single element
(non-standard k∈N), the set of non-standard elements is the set of
all values of expressions of the form (a⋅k):d +b
where a,d ∈ℕ* and b∈ℤ (the cases where a,d
are relative primes suffice).
The predicate of divisibility of x∈N by a d∈ℕ*, is defined as the
case when the rest cancels:
d|x ⇔ (∃q∈N, x = d⋅q)
The possible shapes of these models with respect to k (classes of isomorphisms
preserving k, described using it), are classified by the sequence
(rd)d∈ℕ* of rests of the division of k by all
standard numbers d. The possible sequences are those which satisfy not only
rd<d but also the compatibility formulas :
∀n,d ∈ℕ*, ∃h,
rd⋅n = d⋅h + rd
(where in fact h<n). The simplest one is where all rd
are 0, i.e. where k is divisible by every standard number (but the distinguishing property
of this isomorphism class of models with 1 generator, "there exists a number divisible by
every standard number", is inexpressible in Presburger arithmetic).
The non-standard models generated by 2 non-standard numbers
k,k' can be split into the following classification depending
on what may be intuitively described as the (standard) real number which
k/k' is infinitely close to:
- k/k' may be infinitely large or infinitely small;
- It may be close to an irrational number
- But the cases when it is close to a rational number
a/b for standard numbers a, b, are reducible to other cases:
- If the difference between b⋅k and a⋅k'
is standard then the model is actually generated by only one element
(any non-standard element is a generator).
- If this difference is non-standard then, replacing one generator
by this difference, reduces this model to the first case.
4.10. Developing theories : definitions
Given a theory
T (with types, structures and axioms), a development
of T is a extended theory T' = T∪(pack of
extra components) which remains «equivalent» to T.
This concept has 3 aspects:
- In which sense is T' «equivalent» to T :
- For any extension T' of T, any model of T'
defines by restriction a unique model of T ;
- Such a T' is qualified a development of T if
conversely, any model of T has a «unique extension» as a model
of T', in a sense we need to specify.
- A list of developments schemes, where each is the data of a pair
(X,Y) of theoretical components (a sub-theory X of a theory
Y) so that any morphism from X to a theory T allows to add to
T a copy of the rest of Y beyond its sub-theory X
identified as its image in T. A list of developments schemes
is sufficient when any legitimate development is producible by a succession
of steps each described by one of these schemes.
- On the formal side: how, for each such scheme, the use of extra components in
works (expressions, proofs...) with T' can be translated
into (usually longer) works with T, and thus be seen as
abbreviations of them.
There are 3 levels of development with their respective schemes:
- A proof only adds one ground formula to the axioms, as this
formula is shown to be a theorem (deduced from some of the previous axioms).
- A definition adds both a new structure symbol and a
new axiom.
- A construction adds altogether a new type, new
structure symbol(s) and new axiom(s).
The use of a theorem as an available axiom for the proofs of the
next theorems, abbreviates a repetition of its proof as a part of
the next proofs (with possible substitutions of variables by terms
to which the theorem is applied).
When inserting theories in the set theoretical framework,
constructions become set theoretical definitions, which is why the
distinction between definitions and constructions is not usually
clarified in the literature.
The Galois connection (Mod,Tru)
For any language L, the truth relation between ground formulas and systems defines
a Galois connection (Mod,Tru), where Mod gives the class of models of any set A
of ground formulas taken as axioms, while Tru gives the set of true formulas which all given
systems have in common. Replacing
the class of all systems by the set of all L-structures on ℕ, leaves unchanged and
thus gives definite sense to the closure Tru০Mod. All theorems of a theory belong to the
closure Tru০Mod(A) of axioms, which is the condition for a formula to leave
unchanged Mod A when it is added to A.
Schemes of definitions
We have 3 schemes of definitions:
- A formula φ defines a new relation symbol R with
arguments the free variables of φ, with the new axiom
∀(variables), R(variables)⇔φ(variables).
The use of R can be replaced by the use of φ as a
sub-formula of other formulas (replacing the free variables of φ by
the terms used as arguments of R).
- A term t defines an operation symbol S,
with the axiom ∀(variables), S(variables)=t(variables).
This S abbreviates, and can be replaced by, its defining term t.
- Given a formula φ(variables, x), if we have
(∀(variables),∃!x, φ(variables, x)) then it defines
a new operation symbol S, with the new axiom
∀(variables), φ(variables, S(variables)). This
scheme may be split in 2 steps : the definition of a
relation symbol R by φ (by the first scheme); then a reduced
version of the last scheme directly using R instead of a
formula φ. This case was discussed with the uniqueness
quantifiers (2.4).
Extending models by undefined structures
Thanks to the completeness
theorem, if every model of T can be extended as a model of a T' =
(T∪{R,A}) where R is a new symbol
and A is a new (set of) axiom(s), then every ground formula of T
provable in T', is also provable in T.
The converse may be wrong: a theory T may be extensible to a theory T'
unable to prove more of its formulas, but to which not all its models are extensible.
Here are some examples from non-standard
models of arithmetic :
- T is any arithmetic and T' expresses non-standardness by letting
R be a constant (number), and A the axiom schema making it non-standard.
- The other expression of non-standardness by a unary predicate symbol R which may mean the standardness predicate in a non-standard model, as A says the
the induction axiom fails on R: R(0) ∧ (∀x, R(x)
⇒ R(S(x))) ∧ (∃x, ¬R(x))
- T = bare arithmetic; T' = Presburger arithmetic
- T = Presburger arithmetic; T' = full arithmetic (with addition and multiplication)
Definitions extend models
For any theory T with a model E, extensions T' of T
by an additional axiom and/or structure symbol may have different effects:
- If T' only has a new axiom A then E may stay or
not as a model of T', depending whether A is true or false in E;
it always does if A is provable from previous axioms;
- If T' only has a new symbol then E may have several extensions as
models of T' by giving this symbol arbitrary interpretations, except for special cases
- Definitions precisely combine a new symbol R and a new axiom A,
so that any model of T has a unique extension to an interpretation of R
that satisfies A, forming a model of the extended theory
(T∪{R,A}).
The uniqueness of the extension for every E, means that in the doubly extended theory
(T∪{R,A,R1,A1})
where (R1,A1) is a copy of (R,A),
one can prove
∀(variables), R⇔R1
Definitions preserve sets of isomorphisms
For any isomorphism f between models E
and F of T, with respective extensions as models E'
and F' of a theory T' extending T by further structures and axioms,
f may stay or not as an isomorphism between E'
and F', i.e. preserve the interpretation of the new structure(s). It will be an
isomorphism whenever the additional axiom(s) form a definition, determining the new
structure(s) and thus making it invariant by isomorphisms. Thus, adding a defined structure to a theory
preserves the sets of isomorphisms between its models.
In particular, any relation defined by an expression with language L⊂R without parameters
is invariant by automorphisms, so is in its closure
Inv(AutL E).
This means that adding it to L preserves AutL E.
As we saw how the switch from models to isomorphisms provides similarities between development levels
from proofs to definitions, we shall now see it also providing similarities from definitions to constructions,
making sense to how constructions also essentially preserve theories.
4.11. Constructions
Diverse construction schemes let new types play the roles of sets defined in set
theory by operations between sets named by previous types:
Each of the following schemes suffices to produce both others through the use of previous ones:
- From a type E, a binary relation R on E and the theorem
making R an equivalence relation (∀E x,y, xRy
⇔ (∀E z,xRz⇔yRz)), we can construct a
type Q in the role of the quotient E/R
with a function symbol q:E→Q, and axioms
- ∀E x,y, q(x)=q(y)
⇔ xRy
- ∀Q x, ∃E y, q(y)=x
Any structure S(u,v,...,k) with an argument k of type Q,
abbreviates a structure S'(u,v,...,x),
where x is of type E, with the axiom
∀u,v,...∀E x,y, xRy ⇒
S'(u,v,...,x)=S'(u,v,...,y)
(thanks to properties of equivalence
relations and quotient)
- A type of unary relations: from 2 types E,F
and a relation R between them, we construct the image Q
of E in ℘(F) by the curried R, by 2 new
structures
- a function q:E→Q
- a relation R' between Q and F,
and axioms
- ∀Q x,∃E y, q(y)=x
- ∀E x,∀F y, q(x)R'
y ⇔xRy
- ∀Q x,y, (∀F z, xR' z ⇔
yR' z) ⇒ x=y.
Or equivalently,∀E x,y,(∀z∈F,
xRz ⇔ yRz)⇒ q(x)=q(y)
The last 2 axioms are together equivalent to
- ∀E x,∀Q y,
(∀F z,
yR' z ⇔xRz) ⇔ q(x) =y.
- A type of structures K of any kind defined by a
fixed expression with parameters, meaning all structures
defined by this expression with all values of parameters. Its structures are:
- A quotient operation, from parameters to K
- An evaluator, interpreting K as a set of structures.
The last one can be obtained by composing constructions of products
(to collapse bound variables as one, and the same for parameters),
a definition and a type of unary relations or functions
(either as particular relations, or by adapting the form of axioms, replacing ⇔ by = ).
A development scheme at each level looks like a component at the
next level
Despite their differences of nature, the 3 levels of development
have remarkable similarities:
- A construction of a type of structures is given by a variable structure,
that is a structure symbol with a selection of variables to play
the role of parameters. Structures in the language are invariant, i.e.
without parameter.
This construction is worthless if the chosen symbol is without parameter
(or behaving as such: independent of parameters) which would give singleton; or
without other variable, as it would give the Boolean type.
- Definitions look like axioms: a definition of a predicate is given by a formula
with free variables; axioms are chosen from ground formulas. Any formula
merely defining a Boolean constant, i.e. with value depending only
on the model and not on variables, such as ground formulas or provable formulas
(in the sense that any free variables are implicitly
bound by ∀), do not play any effective role of structures of the
described system (giving roles to objects). Predicates defined without using the language,
are not structuring either.
- As systems using axioms, proofs are similar to contradictions. A
contradiction
is a kind of proof with no theorem but 0 (False),
which would be a very bad axiom. We might regard contradictions
as a 4th kind of component of theories after types, symbols and axioms; not
only the list of given contradictions should be empty for a theory to be of interest,
but the set of possible contradictions should be empty as well.
How constructions preserve isomorphisms
From any theory T, consider any extension T' =
T∪X∪R∪A
where X is a type, R is a pack of structures and A
is a pack of axioms. The existence of an extension of
a given model of T as a model of T', is
a matter of A being small enough for these X and R.
Its uniquess would not make direct sense as any identical copy would also fit.
But extensions may still be qualified as
«essentially unique»
(being all copies of each other) giving sense to how
constructions preserve models, depending on the set of isomorphisms
between them. For any models E'
and F' of T' respective extensions of models E
and F of T, and any isomorphism f : E
→ F,
- An extension of f as an isomorphism from E' to F' will have
to exist if A suffices to "determine" (make essentially unique) the
extension of the model (interpretation of X and R) in the following
sense: any two extensions are isomorphic (making unique the isomorphism class
of extended models E'), by some isomorphism whose restriction to old types is
the identity.
Therefore in this case, any formulas in T' without free variable in the new type
are provably equivalent to formulas expressed in T, may they be ground formulas
(as they describe the extended model) or with some free variables in old types
(as they may define structures on previous types, which the isomorphisms must preserve).
- The uniqueness of the extension of f is a matter of T' having enough
structures to prevent E' from having any more symmetry (nontrivial
automorphism) than E.
For example if R is empty, if the interpretations X and X' in E' and
F' are equinumerous then any bijection between them extends f
as an isomorphism between E'
and F'; this is non-unique when X has at least 2 elements.
- The extension T' of T is a
construction of X in T, when it ensures both the existence
and uniqueness of an extension of f to X as an isomorphism
from E' to F'. This can be formally verified in the doubly extended
theory T''=
(T∪X∪R∪A∪X'∪R'∪A'),
by defining without parameter, a function from X to X',
and proving that it is a bijection extending IdE as
an isomorphism j
between both models E' , E'' of T'
obtained by restricting a model of T'' to copies
(T∪X∪R∪A)
and (T∪X'∪R'∪A') of T'.
Indeed in this case, any 2 isomorphisms g,h from E' to F'
with the same restriction to E, will be equal because
The automorphism h−1০g of E'
extended by IdX' forms an automorphism of the model
(E'∪E'') of T''. As j is invariant, it stays equal
to its image j০(h−1০g)−1
by this automorphism. Therefore h−1০g
= IdE'
, i.e. g=h.
What we can see here strictly preserved by constructions, is the class of isomorphisms as an
abstract category. In particular, constructions
leave unchanged the automorphism groups seen as abstract groups, providing more types on which they act.