4. Arithmetic and first-order foundations

4.1. Algebraic terms

Algebraic drafts

The concept of algebraic term with a purely algebraic language L and a set V of variable symbols (no predicate, logical symbols nor binders), which was first intuitively introduced in 1.5), is going to be formalized as systems in a set theoretical framework. For convenience let us work with one type (the generalization to many types is easy), and start with a wider class of systems.
Let us call L-draft any L'-system (D,D) where D⊂ (LDD, such that:

Let us denote ∀xOD, ΨD(x) = (σ(x), lx) ∈ LD where σ∈LOD and lxDnσ(x).
Equivalent formulations of well-foundedness are

AOD, (∀xOD, Im lxAVxA) ⇒ A=OD
AD, (VDAD*(LA) ⊂A) ⇒ A = D
AD, VDAD ⇒ ∃xOD\A, Im lxA
AOD, A≠∅ ⇒ ∃xA, A∩ Im lx = ∅

A ground draft is a draft with no variable, i.e. VD=∅. Thus, ΨD: DLD and SubLD = {D}.
Variables in a draft may be reinterpreted as constants: extending ΨD by IdVD : VDV, forms a ground (LV)-draft.

Sub-drafts and terms

Drafts do not have interesting stable subsets (by well-foundedness), but another stability concept: a subset AD is a sub-draft of D (or a co-stable subset of D) if, denoting OA = AOD and ΨA = ΨD|OA, we have (Im ΨALA), i.e. ∀xOA, Im lxA.
Indeed, it remains well-founded, as can be seen on the last formulation of well-foundedness.

Like with stable subsets, any intersection of sub-drafts is a sub-draft. Moreover, any union of sub-drafts is also a sub-draft (unlike for sub-algebras with an operation with arity >1, which from arguments in different sub-algebras may give a result escaping their union).

The sub-draft co-generated by a subset of a draft, is the intersection of all sub-drafts that include it. A term is a draft co-generated by a single element that is its root. Each x in a draft D defines a term Tx with root x, sub-draft of D co-generated by {x}.
Each draft D is ordered by xyxTy. It is obviously a preorder. Proof of antisymmetry (uniqueness of the root):
xOD, VDA={yD|xTy} which is a sub-draft by transitivity of ≤.
xA ∴ ∃zOD\A, Im lzA.
A∪{z} is a sub-draft, thus TzA∪{z} by definition of Tz.
zOD\AxTzx=z. Thus x is determined by A. ∎
More properties of this order will be seen for natural numbers in 3.6, and in the general case with well-founded relations in the study of Galois connections.

Categories of drafts

As particular relational systems, classes of L-drafts form concrete categories. Between two L-drafts D,E,

f ∈MorL(D,E) ⇔ (f[OD]⊂OE ∧ ΨEf|OD= Lf০ΨD)

where the equality condition can be split as

σEf|OD = σD
xOD, lf(x)=flx

Another concrete category is that of drafts with variables-preserving morphisms, where V is fixed and morphisms f from a draft D are subject to f|VD = IdVD. This is equivalently expressed reinterpreting variables as constants, as the category of ground (LV)-drafts.

Intepretations of drafts in algebras

For any L-draft D and any L-algebra E, an interpretation of D in E is a morphism f∈MorL(D,E), i.e. f|OD= φELf০ΨD, also expressible as

xOD, f(x) = σ(x)E(flx)

Any interpretation vEV of variables in an algebra E determines an interpretation of any draft D in E. To simplify formulations, restricting v to VD reduces the problem to the case VD=V.

Theorem. For any L-draft D with VD=V and any L-algebra E, any vEV is uniquely extensible to an interpretation of D:
∃!h∈MorL(D,E), h|V = v, equivalently ∃!hEOD, vh ∈ MorL(D,E).

Uniqueness is deduced from well-foundedness : ∀g,h∈MorL(D,E), g|V = v = h|VV⊂ {xD|g(x) = h(x)} ∈ SubLDg=h.
Let us now prove existence (using conditional operator).
S = {AD | VA ∧ Im ΨALA}
vK = ⋃AS {f∈MorL(A,E) | f|V =v}
f,gK, B = Dom f ∩ Dom g ⇒ (f|BKg|BK) ⇒ f|B=g|B
fK Gr f = Gr h
C= Dom h = ⋃fK Dom fS
hK
(CD*(LC) ∋ x↦ (xC ? h(x) : φE(LhD(x))))) ∈ K
D*(LC) ⊂ C
C=D

Operations defined by terms

Any element t of an L-draft D defines a V-ary operation symbol, interpreted in each L-algebra E by ∀vEV, tE(v) = h(t) for the unique h∈MorL(D,E) such that h|VD = v|VD. This formalizes the operation defined by a term, namely the L-term with root t in D (which can replace D here without modifying the interpretations of t).

This interpreted operation symbol being the structure defined by (IdV,t), is preserved by all L-morphisms, thus can be added to L without changing the category of L-algebras.
Symbols sL come back as the particular cases of the terms they form themselves where Ψ(t) = (s, IdV).
For the case of "small" (concretely written) terms, this preservation also has a schema of one proof for each term: re-expressing the term as a formula defining a relation (graph of the operation) using symbols ∃ and ∧, we can use the preservation of structures defined by such formulas.

4.2. Term algebras

Condensed drafts

A draft D is condensed if, equivalently,
  1. D is functional, i.e. ΨD is injective;
  2. D has an injective interpretation in some algebra;
  3. For any two distinct elements of D there is an algebra interpreting them differently.
1.⇒2. if D≠∅ (otherwise replace D by a singleton), ∃φ∈DLD, φ০ΨD = IdOD, i.e. IdD interprets D in (D,φ).
2.⇒3.
3.⇒1. ∀x,yOD, if ΨD(x) = ΨD(y) then x,y have the same interpretation in every algebra.

Any draft D can be reduced to a condensed draft as follows.

Give ℘(D) the L-algebra structure (s,u) ↦ {xOD | σ(x)=slx∈∏u}.
Then the interpretation of D in ℘(D) which sends any variable x to {x}, is the curried form of the only equivalence relation on D which quotients it into a condensed draft.
Let us call condensation of D this projection of D to a condensed draft. This is the (not typographically convenient) way to confuse the repeated sub-terms of a given term (or draft), which will have the same meaning in all algebras.
This equivalence relation is included in that of any interpretation of D in any algebra, thus quotienting interpretations of D.

We can also compare separately given terms by reducing them to this case as any disjoint union of drafts (only keeping variables in common) is a draft.

If L only has symbols with arity 0 or 1 then every L-term is condensed.

Term algebras

An L-algebra (EE) is called a term algebra if it is injective and 〈E\Im φEL = E. Thus it is also a condensed L-draft with ΨE = φE-1. Another usual assumption is that V=VD=E\Im φE, i.e. used variables of a term algebra are usually not regarded as a subset of any larger set of available variables. This term algebra is ground if V=∅, i.e. E=Im φE. So, a ground term L-algebra is an algebra both minimal and injective, and thus also bijective.

If L has no constant then ∅ is a ground term L-algebra.
If L only has constants, then ground term L-algebras are the copies of L.
From any injective L-algebra (EE) and VE \ Im φE one can form the term algebra 〈VL. In particular the existence of an injective algebra implies that of a ground term algebra.

Whenever present as object, any ground term L-algebra is an initial object in any category of L-algebras, and a final object in any category of ground L-drafts. It is thus essentially unique for the given L.
Conversely any initial L-algebra (EE) is a ground term algebra: Similarly, term L-algebras (EE) with E\Im φE=V are the final objects in any variables-preserving category of L-drafts for a given V (this is deducible from the above by replacing variables by constants).

Proposition. For any ground term L-algebra K and any injective L-algebra M, the unique f∈MorL(K,M) is injective.

Proof 1. By the injectivity lemma, {xK | ∀yK, f(x) = f(y) ⇒ x=y} ∈ SubLK, thus = K.
Proof 2. Im f ∈ SubLM is both injective and minimal, thus a ground term L-algebra, so the morphism f between initial L-algebras K and Im f is an isomorphism.

Role of term algebras as sets of all terms

As any draft can be seen as a family of terms, any term algebra (final draft) F precisely plays the more role of the "set of all terms" (with the given list V of variable symbols), as it contains exactly one representative image of each term (operation symbol defined by a term), i.e. any two "equivalent terms" (defining the same operation) have the same image. Namely, any L-term T with root t and VTV, is represented in F by the image of the f∈Mor(T,F) such that f|VT = IdVT, with root f(t).
This f plays the role of condensation (so is injective if and only if T is condensed), respecting the interpretation in any L-algebra E extending any vEV, as the unique g∈MorL(F,E) and h∈MorL(T,E) extending v, are related by h=gf, thus h(t)=g(f(t)).

For any subset A of an L-algebra E and any term algebra whose set of variables is a copy of A, the image of its interpretation in E is 〈AL.

Free monoids

If it exists, any unary term L-algebra M (essentially determined by L), is a monoid acting on all L-algebras, whose actions are preserved by all L-morphisms (MorL ⊂ MorM as they form a category of acts). It serves as set of all (function symbols defined by) unary L-terms. The set of function symbols from L is canonically injected there by j(s) = sM(e) so that in any L-algebra E, ∀xE, j(s)⋅x = sE(x).
Conversely if L is made of function symbols then essentially LM, thus MorL = MorM.
The monoid structure of M was defined from its L-structure; now let us take the monoid structure as primitive and review alternative descriptions from it, of the situation when L was made of function symbols. These function symbols can be replaced by (reinterpreted as) constant symbols, as these 2 interpretations can be defined from each other by terms using the monoid structure: the function defines the constant as s = s(e), while the constant defines the function as ∀xM, s(x) = sx.
For any set X, let us call X-monoid any (X∪{e,•})-algebra M seeing X as a set of constant symbols by some j:XM, such that (e,•) is a monoid structure on M.
Denoting X1 the copy of X seen a set of function symbols, the following statements on M are equivalent; such an M is called a free monoid on X.
  1. M is a unary term X1-algebra with variable e, interpreting the copy x'∈X1 of each xX as ∀yM, x'M(y) = j(x)•y
  2. For any X1-algebra E there is a unique left action ⋅ of M on E such that ∀xX, ∀yE, j(x)⋅y = xE(y)
  3. M is an initial object in the category of X-monoids.
(The proof of this equivalence remains to be completed, using the property of trajectories and the representation theorem)
1. is equivalent to: for any X1-algebra E and any xE there is a unique hx ∈ MorX1(M,E) such that hx(e)=x. That is the curried form of the action ⋅ : M×EE.
The uniqueness of the morphism to other X-monoids is expressed by 〈X{e,•} = M.

When writing terms with multiple uses of an associative operation symbol, all parenthesis may be removed. For monoids, this removal of parenthesis and also of occurrences of e seen as the empty chain of symbols, is operated by the interpretation of any V-ary {e,•}-term in the free monoid on V.
The image of M by any morphism of monoid is the sub-monoid generated by the image of L.

Examples of varieties

The above gives two examples of varieties

4.3. Integers and recursion

The set ℕ

An arithmetic is any theory describing the system ℕ of natural numbers. There are diverse such theories, depending on the choice of a logical framework, then of a language and axioms. First is the set theoretical one, which is the most precise as it determines the isomorphism class of ℕ in the given universe :

Definition. ℕ is a ground term algebra with two symbols: a constant symbol 0 called zero, and a unary symbol S called the successor.

The essential uniqueness of such systems removes any uncertainty attached to fixing a choice of ℕ among them, at least once any irrelevant questions are made inexpressible by seeing ℕ as a set of pure elements.
The existence of a ground term {0,S}-algebra is our first expression of the axiom of infinity, which completes the set theory we progressively introduced from the beginning to the powerset (with optionally the axiom of choice), to form what is essentially the standard foundation of mathematics as practiced by most mathematicians. It will imply the existence of term algebras with any language.
The above use of algebraic concepts in the definition of ℕ may make it look circular, as our study of algebras used natural numbers as arities of operation symbols. But these definitions may also be written without reference to numbers when used symbols have small arities only (for arithmetic, arities are first just 0 and 1, then later 2).
Namely, the most convenient expression of the axiom of infinity consists in inserting arithmetic into set theory, by first inserting the language of {0,S}-algebra in the form of 3 constant symbols: ℕ (as a set) and the images of both algebraic symbols 0 and S, with axioms 0∈ℕ and S:ℕ→ℕ; then making this a ground term algebra by the following 3 axioms :
(H0) n∈ℕ, Sn ≠ 0 : 0 ∉ Im S
(Inj) n,p∈ℕ, Sn = Sp n = p : S is injective
(Ind) A⊂ℕ, (0∈A ∧ ∀nA,SnA) ⇒ A=ℕ (induction) : ℕ is minimal.

More constant symbols can be defined from there: 1=S0, 2=S1=SS0, 3=S2=SSS0, ...

Recursively defined sequences

A sequence of elements of a set E, is a function from ℕ to E (a family of elements of E indexed by ℕ).
In particular, a recursive sequence in E is a sequence defined as the only uE such that u ∈ Mor(ℕ,(E,a,f)), where (E,a,f) is the {0,S}-algebra E interpreting 0 as aE and S as fEE :

u0=a
n∈ℕ, uSn = f(un).

As un also depends on a and f, let us write it as f n(a). This notation is motivated as follows.
As an element of a ground term {0,S}-algebra, each number n represents a term with symbols 0 and S respectively interpreted as a and f in E. So, f n(a) abbreviates the term with shape n using symbols a and f:
f 0(a) = a
f 1(a) = f(a)
f 2(a) = f(f(a))
Re-interpreting the constant 0 as a variable, makes ℕ a unary term {S}-algebra, where each number n is a unary term Sn with n occurrences of S, interpreted in each {S}-algebra (E,f) as the function f nEE, recursively defined by

f 0 = IdE
n∈ℕ, f Sn = ff n

In particular, f 1=f and f 2 = ff.
Generally for any fEE, gEX, the sequence (hn)n∈ℕ in EX recursively defined by (h0=g) and (∀n∈ℕ, hSn = fhn) is hn=f ng.

Addition

Like any unary term algebra, ℕ forms a monoid (ℕ, 0, +), whose actions are the sequences (f n) for any transformation f.
It is commutative as it is generated by a singleton, {1} (which commutes with itself). Thus the side won't matter, but conventions basically present it as acting on the right, defining addition as n+p = Sp(n), or recursively as

n + 0 = n
p∈ℕ, n+S(p) = S(n+p).

Thus, n+1 = S(n+0) = Sn.
Like in the general case, the action formula ∀n,p∈ℕ, f n+p = f pf n is deduced from
(fn+0=fn ∧ ∀p∈ℕ, fn+Sp = fS(n+p) = ffn+p) ∴ ∀aE, ∀fEE, (pf n+p(a))∈Mor(ℕ,(E,fn(a),f)),
In this view, associativity appears as (a+b)+n = Sn(Sb(a)) = Sb+n(a) = a+(b+n). From now on, use of associativity will be implicit, omitting parenthesis.

Multiplication

By the concept of cardinals of finite sets (counting their elements) that will be defined in 4.1., multiplication in ℕ may be defined as the cardinal of a product, making obvious its basic properties, including commutativity. Without this, multiplication can be defined by the following recursion, which needs to treat sides differently until commutativity is deduced. Let us choose the side that fits common language, though it is opposite (transpose) to the usual one from the literature on the axioms of arithmetic:
x∈ℕ, 0⋅x = 0
x,y∈ℕ, (Sx)⋅y = (xy)+y
which can be summed up as xy = (Sy)x(0). Then generally, ∀fEE, f xy = (f y)x.
x∈ℕ, x⋅0 = 0 is deduced by induction.
Right distributivity ∀x,y,z∈ℕ, (x+y)⋅z = xz + yz comes by induction on y, or as (Sz)x+y = (Sz)y০(Sz)x.
Left distributivity ∀x,y,z∈ℕ, x⋅(y+z) = xy + xz comes by induction on x using commutativity of +. In particular this gives ∀x,y∈ℕ, xSy = (xy)+x, so that the transpose of multiplication, satisfying the same recursive definition, coincides with it : multiplication is commutative.

Inversed recursion and relative integers

By induction using commutativity (n+1 = 1+n),

f,gEE, gf = IdE ⇒ ∀n∈ℕ, gnf n = IdE.

Thus if f is a permutation then f n is also a permutation, with inverse (f n)-1 = (f -1)n.
Commutativity was just here to show that composing n times is insensitive to sides reversal, as (f n)-1 has the more direct recursive definition

(f Sn)-1 = (fn)-1f.

The system of (relative) integers ℤ is defined as the {0,S}-algebra where Proposition. ℤ is a commutative group, and for any permutation f of a set E, there exists a unique (f n)n∈ℤ which is, equivalently, a {0,S}-morphism from ℤ to (EE, IdE, f), or an action of ℤ on E interpreting 1 as f.

Proof: the {0,S}-morphism condition requires on ℕ the same nf n as above; on -ℕ, it recursively defines f -n = (f -1)n, namely This makes (ℤ,0,S) an initial object in the category of {0,S}-algebras (E,a,f) where f is a permutation. This category is derived as described for eggs from that of {S}-algebras (E,f) where f is a permutation. Therefore, ℤ is a monoid acting by (f n)n∈ℤ on every set E with a permutation f.
This monoid is a commutative group because it is generated by {-1, 1}, which commute and are the inverse of each other : (-1)+1=0=1+(-1).
The formula of its inverse, (-n)+n = 0 = n+(-n) can be deduced either from symmetry ((-n)+n∈ℕ ⇔ n+(-n)∈-ℕ) and commutativity, or from the above result.

4.4. Presburger Arithmetic

First-order theories of arithmetic

Our first formalization of ℕ was based on the framework of set theory, using the powerset to determine the isomorphism class of ℕ. This allowed recursion, from which addition and multiplication could be defined.

Let us now review first-order theories describing ℕ as their only type, called theories of first-order arithmetic. As first-order logic cannot fully express the powerset, the (∀A⊂ℕ) in the induction axiom must be replaced by a weaker version : it can only be expressed with A ranging over all classes of the theory, that is, the only subsets of ℕ defined by first-order formulas in the given language, with bound variables and parameters in ℕ. For this, it must take the form of a schema of axioms, one for each formula that can define a class.

But as the set theoretical framework was involved to justify recursive definitions, the successor function no more suffices to define addition and multiplication in first-order logic. This leaves us with several non-equivalent versions of first-order arithmetic depending on the choice of the primitive language, thus non-equivalent versions of the axiom schema of induction whose range of expressible classes depends on this language:

Presburger arithmetic

A minimal formalization describes the set ℕ* of nonzero natural numbers, with symbols 1 (constant) and + (binary operation), and axioms :
x,y∈ℕ*, x + (y+1) = (x+y)+1 (A1) : + is associative on 1
A⊂ℕ*,(1∈A ∧ ∀x,yA, x+yA) ⇒A=ℕ* (Min)
x,y∈ℕ*, x + y y (F)

In set theory, (Min) would make ℕ* a minimal {1,+}-algebra. But we shall use set theoretical notations in such ways that they can be read as abbreviations of works in first-order logic: as long as we only consider subsets of ℕ* defined by first-order formulas in this arithmetical language, (Ind) and (Min) can be read as abbreviations of schemas of statements, A ranging over all classes in this theory.
(A1) is a particular case of
x,y,z∈ℕ*, x + (y+z) = (x+y)+z (As) : + is associative

Conversely, (A1 ∧ Min) ⇒ (As) :
Let A={a∈ℕ* |∀x,y ∈ℕ*, x+(y+a) = (x+y)+a}. ∀a,bA,
x,y ∈ℕ*, x + (y+(a+b)) = x + ((y+a)+b) = (x + (y+a))+b = ((x + y)+a)+b = (x+y)+(a+b)
a+b A.
(A1) ⇔ 1∈A.
(A1 ∧ Min) ⇒ A=ℕ* ∎
(As ∧ Min) ⇒ (+ is commutative), as deduced from 1∈C({1}).

Now take ℕ = ℕ*∪{0} where 0∉ℕ*, to which + is extended as ∀n∈ℕ, 0+n = n+0 = n. This extension preserves its properties of commutativity and associativity.
Define S as ℕ∋xx+1.
These definitions directly imply (H0).

Let (Ind1) be the statement ∀A⊂ℕ*, (1∈A ∧ (∀xA, x+1 ∈A)) ⇒ A=ℕ*.
(Ind1) ⇒ (Ind) ; (Ind) ⇒ (Ind1) using A∪{0}.
One may instead use {x∈ℕ|x+1∈A} once noted that (Ind) ⇒ Im S = ℕ*.
(Ind1) ⇒ (Min)
(As ∧ Min) ⇒ (Ind) in set theory (ignoring our previous definition of ℕ):
Let M = Min{0,S}ℕ.
xM, M ∈ Sub(ℕ,x,S) ∧ fx = (Myx+y) ∈ Mor{S}(M,ℕ).
Im fx = fx [〈0〉{S}] = 〈x{S}M.
As M is stable by + and contains 1, it equals ℕ.∎
(As ∧ Min) ⇒ (Ind) as directly convertible to first-order logic:
Let A∈Sub{0,S}ℕ, and B = {y∈ℕ* |∀xA, x+yA}.
y,zB, (∀xA, x+yx+y+zA) ∴ y+zB.
(∀xA, x+1 ∈A) ⇔ 1∈B ⇒ ((Min)⇒ B=ℕ*).
0∈A ⇒ (∀yB, 0+yA) ⇒ BA.∎
So, all possible axiom pairs are equivalent: (A1 ∧ Min) ⇔ (As ∧ Min) ⇔ (As ∧ Ind) ⇔ (A1 ∧ Ind), and imply commutativity.

Parity

A number n∈ℕ is even if ∃x∈ℕ, n=x+x; it is odd if ∃x∈ℕ, n=x+x+1.
Obviously, if n is even then n+1 is odd; if n is odd then n+1 is even thanks to commutativity. Thus by induction, any number is either even or odd. But to show that it cannot be both and that this x is unique, also requires the use of (Inj). These are left as a possible exercise. But instead of (Inj) we proposed axiom (F). This raises the question of their equivalence.
(F) ⇔ (∀x∈ℕ*, ∀y∈ℕ, x+y y) because x+0 = x ≠ 0.
(Inj ∧ Ind ∧ A1) ⇒ (F) because ∀x∈ℕ*, {y∈ℕ | x+yy} ∈ Sub{0,S}ℕ .
For the converse, we need to use the order relation.

The order relation

In any model of Presburger arithmetic, let us define binary relations ≤ and < as

x<y ⇔ ∃z∈ℕ*, y = x+z
xy ⇔ ∃z∈ℕ, y = x+z

(A1 ∧ Ind) implies that
  1. < is transitive
  2. xy ⇔ (x<yx=y)
  3. x<yx+1≤y
  4. A⊂ℕ, A≠∅ ⇒ ∃xA, ∀yA, xy (meaning a schema of formulas in Presburger arithmetic)
  5. x,y∈ℕ, xyyx
  6. x<yx+z < y+z
Proofs :
  1. using (As), x < y < z ⇒ (∃n,p∈ℕ*, z = y+p = x+n+p) ⇒ x < z.
  2. obvious from definitions;
  3. thanks to (Ind), ℕ is a bijective {0,S}-algebra;
  4. xy ⇒ (x+1≤yx=y)
    0∈{x∈ℕ |∀yA, xy}=B
    xB, x+1∈BxA
    AB=∅ ⇒ (B=ℕ ∴ A=∅)
  5. from 4. with A={x,y}. Or using 3, A={x∈ℕ |∀y∈ℕ, x<yx=y y<x} ⇒ (0∈A ∧ ∀xA, x+1∈A)
  6. y = x+ny+z = x+z+n
Now the full system (A1 ∧ Ind ∧ F) implies that Proof. (F) means that < is irreflexive, which with transitivity (1.) implies that it is a strict order, which is total by 5.
There must be one true formula among (x<y), (x = y), (y<x), which by 6. respectively imply (x+z<y+z), (x+z = y+z), (y+z<x+z). But only one of the latter can be true, thus each implication must be an equivalence. Cancellativity on one side extends to the other side by commutativity. ∎
Finally, by 4., every nonempty subset A of ℕ has a smallest element (unique by antisymmetry), written min A.
This order coincides with the order between terms in the common particular case of the set theoretical ℕ, as will be clear from the properties of generated preorders.

Arithmetic with order

It is possible to express a first-order arithmetic with language {0,S, ≤}, more expressive than {0,S} but less than Presburger arithmetic, in the sense that addition cannot be defined from ≤.
There, ≤ is related to S by the following property (which determines it in set theory, but no more in bare arithmetic due to the poverty of its interpretation of induction by an axiom schema):
For all n ∈ℕ, the set {x∈ℕ | nx} is the unique A⊂ℕ such that
x∈ℕ, xA ⇔ (x = n ∨ ∃yA, Sy=x).
Its existence in ℘(ℕ) can be deduced in set theory (not first-order arithmetic) by induction on n; its uniqueness for a fixed n is deduced by induction on x.

Trajectories of recursive sequences

For any recursive sequence u∈Mor(ℕ,(E,a,f)), the trajectory K = Im u = {f n(a) | n∈ℕ} of an action of ℕ on E is generated by a, which is a free element for the image of ℕ as a transformation monoid of K thanks to the commutativity of +. Therefore K can be seen as a commutative monoid, whose description coincides with the above arithmetic without axiom (F), where the roles of the neutral element 0 and the generator 1 are respectively played by a and f(a). However for convenience, let us focus on the set theoretical viewpoint on the remaining case, when u is not injective so that K is not a copy of ℕ. Then K must be a non-injective {0,S}-algebra: there must be a pair in {0,S}⋆K mapped to a singleton, but we shall see that such a pair is unique.
Let y the minimal number such that ∃x<y, ux = uy. This x is unique because y is minimal.
{un | n<y} ∈ Sub{0,S}K, thus =K.
As Inj f|Kx=0 ⇔ a∈ Im f|Kfy(a)=a ⇔ (f|K)y = IdK, a trajectory K where these are true is called a cycle of f with period y; the restriction of f to K is then a permutation of K. Then replacing a by another element of K would leave both K and y unaffected.
Now if f is a permutation then every cycle of f is also a cycle of f -1 with the same period.

4.5. Finiteness and countability

To review equivalent forms of the axiom of infinity, we need to work without assuming it. Let us start with a technical result.

Covering of stable subsets by terms

For any L'-system (E,E), any XE and any x∈〈XL in E there exists a draft (D,D) where xDE, DE and VDX.
It makes no difference to fix VD=X, or X=∅ by adding it as a set of constants to the language.

Proof, fixing X=∅ to simplify.
Let us show the stability of the union B of the D such that there exists a ground draft (D,D) with DE.
Let (s,x,y)∈E such that Im xB. By finite choice there exists a tuple of ground drafts (Di,Di)i<ns such that ∀i<ns, xiDiDiE.
Let D = ⋃i<ns Di.
Let us check that Ψ = (Dz ↦ Ψi(z) for the smallest i such that zDi), defines a ground draft.
For any nonempty AD, take the smallest i such that ADi ≠ ∅.
zADi, ∅ = ADi∩ Im li,z = A ∩ Im li,z because Im li,zDi.
Ψ(z) = Ψi(z) because i is also smallest such that zDi, thus lz = li,z.
We conclude ∃zA, A ∩ Im lz = ∅, thus Ψ defines a ground draft on D.
Then either yD, or adding it there with (s,x,y) gives a ground draft that fits.∎

Thus independently of the axiom of infinity (existence of term algebras) anyway all elements of a minimal algebra are values of ground terms.

Finiteness

Let K={0,S}.
For any set E, let us give its powerset P=℘(E) the K'-structure defined by 0P = {∅} and SP = {(A,B)∈P2| ∃xB, A=B\{x}}.

We define the set of finite subsets of E as MinK℘(E). A set is finite if it is a finite subset of itself (otherwise it is infinite).
For any sets EF, (E is finite) ⇔ (E is a finite subset of F). Proof :

Equivalently, a set E is finite if it has a bijection to the graph of S in a ground K-term n (which can be seen as a unary {S}-term). This may as well be seen as a bijection to its domain {0,...,n-1} or to its image {1,...,n}.
Proof. From a finite set E, the above theorem gives a term with root E included in ℘(E)...

Countable sets

It is easy to check that any condensed ground K-draft is either ∅ or a ground K-term or a ground term K-algebra (only two other languages have this property, apart from those with no constant).
A set is called countable if, equivalently,

  1. It has an injection to a condensed ground K-draft
  2. It has a bijection with a condensed ground K-draft
  3. A condensed ground K-draft has a surjection to it
1.⇒2. by restricting the order, once condensed ground {0,S}-drafts are described from their order structure; another method is given below.
2.⇒(1.∧ 3.) ; 3.⇒1. without AC, by picking the smallest element of each preimage.

So, countable sets are either finite or in bijection with ℕ. The latter, on which there exists a possible structure of ground term {0,S}-algebra, will be called countably infinite. However we still need to prove that they are indeed infinite.

Finite cardinalities

Let ≈ be the equivalence relation of equinumerosity on P, defined as existence of a bijection: AB ⇔ ∃f : AB.
The quotient structure on Q=P/≈, seen as a graph G⊂({0,S}QQ is functional both ways (both it and its transpose are functional), and Im G = Q.
Indeed:

Axiom of infinity

In set theory with the powerset, the following statements are equivalent expressions of the axiom of infinity:
  1. There exists an injective algebra whose language has at least a constant and a non-constant.
  2. There exists a non-surjective injection of a set to itself; equivalently an injective {0,S}-algebra, or a set ℕ.
  3. There exists an infinite set.
  4. For any countable algebraic language there is a (countable) injective algebra
  5. Any consistent first-order theory with a countable language has a (countable) model

4. ⇒ 2. ⇒ 1.; for 1. ⇒ 2. : from a non-constant operation s there, xs(kx) is an injective, non-surjective transformation.

We shall prove 2. ⇔ 3. below (finite cardinalities), and 2. ⇒ 4. ⇒ 5. in the section on the completeness theorem.

We may regard 5. ⇒ 4. as relative to the natural assumption that the concept of injective algebra forms a consistent theory. This theory is essentially the part of set theory which describes tuples. So, if a language L is included in a model of that simple part of set theory, then there exists an injective L-algebra. If such a model U is standard, this takes the form LUU, making (U, IdLU) an injective L-algebra. ∎
The infinity of ℕ easily implies the uniqueness of the number of elements of a finite set, as its non-uniqueness would give that set a strict injection into itself, and thus an injection of ℕ into it.
2. ⇒ 3. will come from the infinity of ℕ.
For 3. ⇒ 2. we shall pick a different set, as the existence of an injective {0,S}-algebra structure on any infinite set would need the axiom of choice.

Finite cardinalities

Now denoting e the image of EP in Q, these definitions allow to "measure the size" of E (finding whether it is finite or infinite, and what is its number of elements if it is finite), by making it easy (but still non-trivial), through the fact that (Dom K)∪{(S,e)} = ({0,S}Q) to prove that things fall into either case:

Assuming the axiom of infinity, the finiteness of a set E is equivalent to the existence of a number n∈ℕ and a bijection from Vn = {x∈ℕ | x<n} to E; or a surjection from Vn to E, which is then bijective precisely when n takes its smallest value (the number of elements of E).

Without ℕ, either ℘(℘(E)) or (with more work) EE can be used to express the finiteness of a set E, and to rebuild ℕ from E if E is infinite.

Any left cancellative element of a finite monoid is invertible. Thus any finite cancellative monoid is a group.
Proof: it acts as an injective transformation of a finite set, thus a permutation, by which the preimage of e is a right inverse, thus an inverse.

Countability of ℕ×ℕ
y\x
0
1
2
3
4
0
0
1
4
9
16
1
2
3
5
10
17
2
6
7
8
11
18
3
12
13
14
15
19

A natural bijection between ℕ×ℕ and ℕ is defined by (x,y) ↦ (y < x ? xx + y : yy + x + y).
y\x
0
1
2
3
4
0
0
1
3
6
10
1
2
4
7
11
2
5
8
12
3
9
13
Another bijection b2 : ℕ×ℕ → ℕ is defined as b2(x,y) = Tx+y+y from the sequence (Tn) of triangular numbers defined by 2⋅Tn = n⋅(n+1) (see properties of parity) or equivalently by T0=0 ∧ ∀n∈ℕ, Tn+1=Tn+n+1. This makes ℕ×ℕ a ground term (0,S)-algebra where
0=(0,0)
S(0,y)=(y+1,0)
S(x+1,y)=(x,y+1)
which recursively defines b2-1(z), where x+y is the only n such that Tnz<Tn+1.

Countability of finite sequences of integers

If there is a surjection f from ℕ to a set E then E is countable, as an inverse injection can be defined by associating to each xE its smallest preimage by f.

Fixing a bijection b2 from ℕ2 to ℕ, for any integer n>0 we can recursively define a bijection bn:ℕn→ℕ, for example:
b1=Id
n>0,∀u∈ℕn, bSn(u) = b2(bn(u),un)
(which gives back b2 as the particular case of bn for n=2).
Also the set ℕ(ℕ) = {u∈ ℕ | ∃k∈ ℕ, ∀n>k, un=0} is countable. There are different ways to see it:

But this construction of ℕ(ℕ)) and similarly simple candidate expressions of H as a countable set, assumes recursion, thus cannot be used to define recursion by the above method. To actually provide a definition of recursion that does not assume it, requires a different expression of an enumerated H. The representation of sequences which was used by Godel as a basis to define recursion, actually forms a different set.

4.6. The Completeness Theorem

Existence of countable term algebras

For any countable algebraic language L with at least a constant and a non-constant, finding a countable ground term L-algebra (necessarily infinite) amonts to defining a ground term L-algebra structure on ℕ, which is a bijection from Lℕ to ℕ.
Splitting L as CD where C is the set of constants, and D is the rest of symbols, we have Lℕ = C∪(Dℕ). But C and D are countable (either finite or infinite), and Dℕ is a union over D of disjoint sets with explicitly definable bijections with ℕ (such as the bns we saw). In any case, a bijection between C∪(Dℕ) and ℕ can be found without problem.
In practice, such bijections from Lℕ to ℕ happen to be ground term L-algebra structures because ∀(s,x)∈Dℕ, ∀i<ns, xi<s(x) which would be contradicted by the smallest element outside MinL ℕ. ∎

Interpretation of first-order formulas

Trying to extend the formal construction of the interpretation of terms in algebras, to the case of formulas interpreted in systems, the difficulty is to cope with the interpretation of quantifiers (or generally binders, if we wish to still generalize).
A possibility, is to switch to a view over all variables as bound throughout the formula: from the concept of interpretation hvET of a term T in an algebra E for every interpretation v of its set V of variables, the family (hv)vEV is re-curried into a function from T to EEV. So, a first-order formula interpreted in a system E can be understood as a term interpreted in an algebra with maybe two base types : the sets OpE and RelE of all operations and all relations in E; or split as two sequences of types, OpE(n) and RelE(n). We might not need the full sets of these, but at least, an algebra of these (a subset stable by all needed logical operations).
We took unions over ℕ for the case we would need to see "all possible formulas" as terms interpreted in one same algebra.

The Completeness Theorem

Any good formal proving system (a theory describing proof syntax, verifiable by an algorithm) needs of course to be sound, which means only "proving" what is always true (the provability of a formula implies its truth in every model). More remarkable is the converse property, that is completeness :

Theorem. First-order logic has a proving system both sound and complete in the following equivalent senses

This quality of perfection, established for a certain formal proving system for first-order logic, ensures that any other formal proving system with the same quality will be necessarily equivalent : any formal proof according to one is automatically convertible into a proof according to any other.

Sketch of proof of the first statement (implicitly suggesting how such a proving system can be made; not using the axiom of choice, anyway out of topic):
Express axioms in prenex form, that is chains of quantifiers applied to quantifier-free formulas (using equivalences from 2.1 and 2.3 simplified for a nonempty model).
Interpret the symbol = in axioms as an ordinary predicate symbol, with the axioms of equality [1.6].
Replace each occurrence of ∃ in an axiom by the use of an additional operator symbol K, for example

(∀x,x', ∃y, ∀z, A(x,x',y,z)) ⇢ (∀x,x',∀z, A(x,x',K(x,x'), z)).

Let M be the ground term algebra over the language of operator symbols enriched as just described. Reinterpreting all ∀ in axioms as the use of axiom schemas (one axiom for each replacement of variables by a tuple of elements of M), gives a propositional theory T' (its axioms are composed of logical connectives over Boolean variables), whose set of Boolean variables is PM where P is the set of predicate symbols in the theory plus the "equality" symbol.
Observe that T' is still consistent, as any contradiction in T' (= finite set of axioms not satisfied by any tuple of Boolean values of their variables) would provide a contradiction in T, as follows:
From all subterms occurring in used axioms of T', list without repetition all those whose root is a symbol S which was added to the language when converting some axiom ∀x,∃y,∀z,∃u, A(x,y,z,u) of T, into the axiom ∀x,z, A(x,K(x),z, S(x,z)) in T'. Successively replace them all by variables in T: for terms S(t0,t1) and S(t0,t2) (where t0, t1, t2M), the reasoning in T will say
"Let y such that ∀z,∃u, A(t0,y,z,u);
let u1 such that A(t0,y,t1,u1);
let u2 such that A(t0,y,t2,u2);"
and use the contradiction in T' where y replaces K(t0), u1 replaces S(t0,t1) and u2 replaces S(t0,t2).
Recursively by a chosen enumeration of PM as a countable set, add one by one to the axioms, each of these Boolean variables if it is consistent with previously accepted axioms, so that all get values without contradiction.
Then the quotient of M by the equivalence relation of "equality", forms a model (with the "true equality"). ∎

As this construction depends on an arbitrary order between formulas, different choices give different models, which may be non-isomorphic and even have different properties reflecting the undecidability of the theory's formulas.

Deduction of the second statement from the first :
TA
T ∪ {¬A} is inconsistent
T ∪ {¬A} has no model
A is true in all models of T.  ∎
So, the provability function giving the set of theorems from each set of axioms, is the closure of the Galois connection defined by the truth relation RX×Y between However, as already announced in truth in mathematics and will be shown by the incompleteness theorem, unclarities remain about provability and existence of models:

Skolem's Paradox

Comparing the results of the completeness theorem with Cantor's theorem gives this paradox :
If set theory is consistent then it has countable models, though it sees some sets there, such as ℘(ℕ), as uncountable. Rigorously speaking, when the powerset axiom says that the class of subsets of E is a set written ℘(E), it can only determine the interpretation of the functor ℘ relatively to (depending on) the universe where its open quantifiers are interpreted: «all subsets of E existing in this universe are in ℘(E)». In the other interpretation of what it means for a class to be a set, this would mean that the universe already contains "really all" subsets of E, forming the supposedly "true" set ℘(E) which will stay fixed in any further expansion of the universe. However, for the powerset of an infinite set, this wish cannot be expressed in first-order logic nor any other conceivable mathematical formalism: there is no way to talk about meta-sets that "cannot be found" in this universe but would «exist elsewhere» (in any mysterious bigger universe) to exclude them from there. (Only one aspect of this idea can be formalized as the axiom schema of specification).

4.7. Non-standard models of Arithmetic

Standard and non-standard numbers

Since the induction axiom needed to determine the isomorphism class of ℕ depends on ℘(ℕ), the relativity of ℘ allows a diversity of non-isomorphic models of arithmetic respectively defined by the diverse models of any consistent first-order theory T which contains arithmetic. Many theories may be picked in the role of T, such as a set theory seen as a first-order theory, thus involving other notions than numbers (to define more sets of numbers than arithmetically defined classes). All we shall assume here is that it has a notion N of "natural number" with a constant 0, a function symbol S, and axioms making N a model of bare arithmetic (in any model of T), thus a bijective {0,S}-algebra.

A number in N is called standard if it belongs to the minimal subalgebra N0 = Min{0,S}N, isomorphic to ℕ by the unique {0,S}-morphism (embedding, nSNn(0N)) from ℕ to N. So, it represents an element of ℕ, which may be called a meta-number.
A model N of arithmetic is called standard if it is a minimal {0,S}-algebra, which can be equivalently written in 3 ways:

We shall abbreviate some works of first-order logic by set theoretical notations as follows :

Existence of non-standard models

The non-standardness of N is expressible by axioms on an additional structure, forming an extended theory T'. One way is to add If T is consistent then T' is also consistent (and cannot prove more), as any finite list of axioms of T' can be satisfied in any model of T (by interpreting k as a different number). Thus, T' also has a model, which is a model of T where N is non-standard.

This means that no consistent extension of arithmetic can ever force its model of arithmetic to be standard, i.e. require induction to apply to the full powerset of N, which it cannot ensure to exhaust by its classes. Candidate "standardness" predicates may be introduced but still cannot be forced to coincide with the true standardness (N0), which may stay a meta-set beyond all classes.

Any model of arithmetic obtained as described in the proof of the completeness theorem from any consistent extension of arithmetic, will necessarily be non-standard, as it cannot even be elementarily equivalent to ℕ according to the truth undefinability theorem. Skolem's paradox still holds in two ways: Non-standard models behave much like standard ones, as they satisfy all consequences of induction. Their precise range of diversity depends on the chosen theory of arithmetic.

Non-standard models of bare arithmetic

The only "addition" definable from S, is the meta-operation adding any n∈ℕ to any xN as x + n = Sn(x), that is the meta-sequence of functions (Sn)n∈ℕ each defined in the theory by a unary S-term n. The consequences of the induction schema in bare arithmetic can be summed up as As the restriction of S to the meta-set N\N0 of non-standard numbers of any model is a permutation, N\N0 is a -set. The last formula obliges this ℤ-set to be free. Conversely, the disjoint union of ℕ with any free ℤ-set, forms a model of bare arithmetic.
Applying both ways the definition of the order from the partial meta-operation of addition, gives two ordering results with non-standard numbers: Models of arithmetic with order, are formed as models of bare arithmetic with any choice of a total order on the partition of non-standard numbers in ℤ-orbits. Thus, bare arithmetic cannot suffice to define the order. But arithmetic with order cannot suffice to define addition either, as its non-standard models may either admit no or many corresponding interpretations of addition.

Non-standard models of Presburger Arithmetic

Such models satisfy all theorems of Presburger Arithmetic. In particular they have a well-defined total order, by which any non-standard number is greater than any standard number, and any non-empty class of numbers has a smallest element (which the meta-set of non-standard numbers hasn't).

Here are some more details only for the sake of illustration, which may be skipped.

There is also a meta-operation (sequence of functions) of multiplication of an xN by a standard number n∈ℕ : nx = x+...+x (with n occurrences of x)
Beyond commutativity, associativity and the seen properties of the order, the last independent consequence of the axiom schema of induction constraining non-standard models, is the possibility of Euclidean division by any nonzero standard number (generalizing the results on parity) :

d∈ℕ*, ∀xN, ∃qN, ∃r<d, x = dq + r

thanks to ∀d∈ℕ*, ∀qN, d⋅(q+1) = dq+d which is a schema of theorems in Presburger arithmetic.
Moreover this (q,r) is unique; q = x:dN is called the quotient and r is called the rest of the division of an xN by a d∈ℕ*, so that

qN, q=x:ddqx< d⋅(q+1) ⇔ ∃ r<d, x = dq + r.

The concept of model of Preburger arithmetic generated by a set (of non-standard numbers, since standard ones have no generating effect), is defined as that of subalgebra generated by a subset for the "algebra" (not exactly an algebra, but...) with language completed, like in the proof of the completeness theorem, by additional operation symbols reflecting all existence properties deduced from the axiom schema of induction:

They can be conceived either in the abstract (by evaluating relations arbitrarily, like in the proof of the completeness theorem) or as a subset of a given model (interpreting expressions there).
In particular, in any model of Preburger arithmetic generated by a single element (non-standard kN), the set of non-standard elements is the set of all values of expressions of the form
(ak):d +b
where a,d ∈ℕ* and b∈ℤ (the cases where a,d are relative primes suffice).
The predicate of divisibility of xN by a d∈ℕ*, is defined as the case when the rest cancels: d|x ⇔ (∃qN, x = dq)

The possible shapes of these models with respect to k (classes of isomorphisms preserving k, described using it), are classified by the sequence (rd)d∈ℕ* of rests of the division of k by all standard numbers d. The possible sequences are those which satisfy not only rd<d but also the compatibility formulas : n,d ∈ℕ*, ∃h, rdn = dh + rd (where in fact h<n). The simplest one is where all rd are 0, i.e. where k is divisible by every standard number (but the distinguishing property of this isomorphism class of models with 1 generator, "there exists a number divisible by every standard number", is inexpressible in Presburger arithmetic).

The non-standard models generated by 2 non-standard numbers k,k' can be split into the following classification depending on what may be intuitively described as the (standard) real number which k/k' is infinitely close to:

4.8. Developing theories : definitions

Given a theory T (with types, structures and axioms), a development of T is a extended theory T' = T∪(pack of extra components) which remains «equivalent» to T.  This concept has 3 aspects:
There are 3 levels of development with their respective schemes:

The use of a theorem as an available axiom for the proofs of the next theorems, abbreviates a repetition of its proof as a part of the next proofs (with possible substitutions of variables by terms to which the theorem is applied).
When inserting theories in the set theoretical framework, constructions become set theoretical definitions, which is why the distinction between definitions and constructions is not usually clarified in the literature.

The Galois connection (Mod,Tru)

For any language L, the truth relation between ground formulas and systems defines a Galois connection (Mod,Tru), where Mod gives the class of models of any set A of ground formulas taken as axioms, while Tru gives the set of true formulas which all given systems have in common. Replacing the class of all systems by the set of all L-structures on ℕ, leaves unchanged and thus gives definite sense to the closure Tru০Mod. All theorems of a theory belong to the closure Tru০Mod(A) of axioms, which is the condition for a formula to leave unchanged Mod A when it is added to A.

Schemes of definitions

We have 3 schemes of definitions:

Extending models by undefined structures

Thanks to the completeness theorem, if every model of T can be extended as a model of a T' = (T∪{R,A}) where R is a new symbol and A is a new (set of) axiom(s), then every ground formula of T provable in T', is also provable in T.
The converse may be wrong: a theory T may be extensible to a theory T' unable to prove more of its formulas, but to which not all its models are extensible. Here are some examples from non-standard models of arithmetic :

Definitions extend models

For any theory T with a model E, extensions T' of T by an additional axiom and/or structure symbol may have different effects: The uniqueness of the extension for every E, means that in the doubly extended theory (T∪{R,A,R1,A1}) where (R1,A1) is a copy of (R,A), one can prove

∀(variables), RR1

Definitions preserve sets of isomorphisms

For any isomorphism f between models E and F of T, with respective extensions as models E' and F' of a theory T' extending T by further structures and axioms, f may stay or not as an isomorphism between E' and F', i.e. preserve the interpretation of the new structure(s). It will be an isomorphism whenever the additional axiom(s) form a definition, determining the new structure(s) and thus making it invariant by isomorphisms. Thus, adding a defined structure to a theory preserves the sets of isomorphisms between its models.
In particular, any relation defined by an expression with language LR without parameters is invariant by automorphisms, so is in its closure Inv(AutL E). This means that adding it to L preserves AutL E.
As we saw how the switch from models to isomorphisms provides similarities between development levels from proofs to definitions, we shall now see it also providing similarities from definitions to constructions, making sense to how constructions also essentially preserve theories.

4.9. Constructions

Diverse construction schemes let new types play the roles of sets defined in set theory by operations between sets named by previous types: Each of the following schemes suffices to produce both others through the use of previous ones: The last one can be obtained by composing constructions of products (to collapse bound variables as one, and the same for parameters), a definition and a type of unary relations or functions (either as particular relations, or by adapting the form of axioms, replacing ⇔ by = ).

A development scheme at each level looks like a component at the next level

Despite their differences of nature, the 3 levels of development have remarkable similarities:

How constructions preserve isomorphisms

From any theory T, consider any extension T' = TXRA where X is a type, R is a pack of structures and A is a pack of axioms. The existence of an extension of a given model of T as a model of T', is a matter of A being small enough for these X and R. Its uniquess would not make direct sense as any identical copy would also fit. But extensions may still be qualified as «essentially unique» (being all copies of each other) giving sense to how constructions preserve models, depending on the set of isomorphisms between them. For any models E' and F' of T' respective extensions of models E and F of T, and any isomorphism f : EF, Indeed in this case, any 2 isomorphisms g,h from E' to F' with the same restriction to E, will be equal because
The automorphism h−1g of E' extended by IdX' forms an automorphism of the model (E'E'') of T''. As j is invariant, it stays equal to its image j০(h−1g)−1 by this automorphism. Therefore h−1g = IdE' , i.e. g=h.
What we can see here strictly preserved by constructions, is the class of isomorphisms as an abstract category. In particular, constructions leave unchanged the automorphism groups seen as abstract groups, providing more types on which they act.