4. Arithmetic and first-order foundations

4.1. Algebraic terms

Introduction

The concept of algebraic term, namely a term with a purely algebraic language L and a set V of variable symbols (no predicate, logical symbols nor binders), was first intuitively introduced in 1.5. We shall now formalize it as a kind of systems in a set theoretical framework. For convenience let us work with only one type (the generalization to many types is easy), and start with a wider class of systems we shall call L-drafts.
The formalization of this concept has a few possible variants.

In a first approach, an L-draft is an L-equational system (D, b) such that (Gr bD) is injective (where b: VDD and Gr bD = ∅). This injectivity means that

The set D means the set of occurrences of symbols, split between the set Im b of occurrences of variable symbols, and the set Im D of occurrences of symbols from L. But this definition for drafts would make each variable occur exactly once, thus cannot directly include algebraic terms, where a variable may have multiple occurrences or none, as particular cases. The solution is to first replace the use of b by that of its inverse b', function from a set of occurrences VDD to the set V of available variables, then remove from it the bijectivity requirement.
We may keep the injectivity requirement on b', meaning the uniqueness of occurrence of any variable, as any multiplicity of occurrences of one variable in a term can be replaced by the use of a single occurrence of it referenced multiple times inside the term (this replacement gives a synonymous "term" abstractly conceived, ignoring any care for visible notations).
But when multiple variables are available (∃≥2: V), we need to allow for non-surjective b' (terms leaving some available variables unused), especially to allow as a draft any term made of a single variable symbol.
Hence the relevance of a formalization where VD = DV and b' = IdVD.

Drafts

Finally, let us formalize an L-draft as an injective L-system (D, tGr ψD) where: Let us denote ψD = σ ⊓ l, i.e. ∀xOD, ψD(x) = (σ(x), lx) ∈ LD where σ∈LOD and lxDnσ(x).
Equivalent formulations of well-foundedness are

AOD, (∀xOD, Im lxAVxA) ⇒ A=OD
AD, (VDAD*(LA) ⊂A) ⇒ A = D
AD, VDAD ⇒ ∃xOD\A, Im lxA
AOD, A≠∅ ⇒ ∃xA, A∩ Im lx = ∅

A ground draft is a draft with no variable, i.e. VD = ∅. Thus, ψD: DLD and SubLD = {D}.
Variables in a draft may be reinterpreted as constants: extending ψD by IdVD : VDV, forms a ground (LV)-draft.

Sub-drafts and terms

Well-foundedness makes uninteresting the range of stable subsets of drafts, to be replaced by another stability concept : a subset AD will be called a sub-draft of D (or a co-stable subset of D) if, denoting OA = AOD and ψA = ψD|OA, we have (Im ψALA), i.e. ∀xOA, Im lxA.
Any sub-draft A of a draft D is a draft, i.e. fits the above axioms with fixed V, where the well-foundedness of A is obviously deduced from that of D by its last formulation.

Like with stable subsets, any intersection of sub-drafts is a sub-draft. Moreover, any union of sub-drafts is also a sub-draft (unlike for sub-algebras with an operation with arity >1, which from arguments in different sub-algebras may give a result escaping their union).

The sub-draft co-generated by a subset of a draft, is the intersection of all sub-drafts that include it. A term is a draft co-generated by a single element that is its root. Each element x of a draft D defines a term Tx with root x, sub-draft of D co-generated by {x}.

Each draft D is ordered by xyxTy.
This is obviously a preorder. Its antisymmetry, that is the uniqueness of the root of any term, can be proven as follows (a more elegant proof from more general concepts is in the study of Galois connections):

Categories of drafts

As particular relational systems, classes of L-drafts form concrete categories. Between two L-drafts D,E,

f ∈ MorL(D,E) ⇔ (f[OD]⊂OE ∧ ψEf|OD= Lf⚬ψD)

where the equality condition can be split as

σEf|OD = σD
xOD, lf(x) = flx.

There, coproducts are given by disjoint union, like with relational systems.

Another concrete category is that of drafts with variables-preserving morphisms, where V is fixed, and

MorL,V(D,E) = {f ∈ MorL(D,E) | f|VD = IdVD}.

This is roughly the category of ground (LV)-drafts (reinterpreting variables as constants, except that a multiply used variable may be figured in the latter as multiple elements, but only as a multiply targeted single element in the former).
The definition of coproducts is modified there as

L,ViI Di = (⋃iI VDi) ∪ ∐iI ODi

with the structure defined like before. In the binary case this coproduct between drafts D,E will be written DV E.
Products in these categories cannot be directly made of the products of underlying sets. The description of products in the variables-preserving case is left as an exercise once understood draft condensation (4.3).

Images of drafts

If D,E are L-drafts and f ∈ MorL(D,E) then

Intepretations of drafts in algebras

An interpretation of an L-draft D in an L-algebra E, is a morphism f∈MorL(D,E). Equivalently,

f|OD= φELf⚬ψD
xOD, f(x) = σ(x)E(flx)

Theorem. For any L-draft D and any L-algebra E,

uEV, ∃!h∈MorL(D,E), h|VD = u|VD.

Proof. For simplicity, let V = VD by restricting u.
Uniqueness was seen with equational systems, and will be used to prove existence :In other words, every L-algebra E is a module for every L-draft.

This formally justifies the concept of operation symbol defined by a term and interpreted as an operation in every L-algebra, by the concepts explained in 3.11 : the role of V-ary operation symbol played by any element s of an L-draft D (interpreted in each L-algebra E by ∀uEV, sE(u) = h(s) for the unique h∈MorL(D,E) such that h|VD = u|VD), preserved by all L-morphisms, is that of the operation symbol defined by the L-term with root s co-generated by s in D.

4.2. Quotient systems

Quotients of relational systems

Given a relational language L and two L-systems E and F, a quotient from E to F is a surjective morphism f : EF such that Lf[E] = F. We also say that the system F is a quotient of E by ∼f.
This condition defines F from E and f, as the smallest L-structure on F such that f ∈ MorL(E, F).

For any equivalence relation R on E, the quotient set E/R receives the L-structure E/R defined as LR[E], making (R) a quotient from E to E/R.
Then for any h : EX,

R ⊂ ∼h ⇒ (h ∈ MorL(E,X) ⇔ h/R ∈ MorL(E/R,X)).

Now if L is an algebraic language, the condition for an f : EF between L-systems to be a quotient, is written

{(Lf(x),f(y)) | (x,y) ∈ E} = F.

It implies ∀BF, B ∈ SubLFf *(B) ∈ SubLE.
Any morphism from a serial system to a partial algebra is a quotient to its image, which is a stable algebra.

Quotients in concrete categories

The concept of quotient has an equivalent definition expressible for any concrete category C, similar to the concept of embedding with sides reversed. This can be explained by being the same concept as embedding in the opposite category, once the concreteness of the category is re-expressed as a link with the category of sets, independently of the details of the category of sets, which can thus be replaced by any other category, namely the opposite category to the category of sets.

So, a quotient from E to F in C is a surjective morphism f : EF such that, ∀CX, equivalently

Mor(F,X) = {gXF | gf ∈ Mor(E,X)}
Mor(F,X) = {h/f | h ∈ Mor(E,X) ∧ ∼f ⊂ ∼h}
{h ∈ Mor(E,X) | ∼f ⊂ ∼h} = {gf | g∈Mor(F,X)}

Any retraction is a quotient.

Congruences

Given an algebraic language L, a congruence on an L-system E is an equivalence relation R on E such that, equivalently, Proof of equivalence. Thus, the quotient of any algebra or other serial system by a congruence, is an algebra. In this case, a congruence can be essentially described as a binary relation satisfying the axioms of equality.

Quotients of modules

Not all quotients of modules are modules. But for any algebraic language L, any L-system E, and any L-equational system (K, b), where b : VK and ACV holds, if E is a b-module (where V is structureless) then its quotient by any congruence is also a b-module.
Proof. While this proof only needs f to be a surjective morphism, not precisely a quotient, in practice this will only be used with quotients. Indeed, f is a quotient for the smallest structure of F which makes f a morphism; greater structures preserve existence; to ensure uniqueness we need F to still be a partial algebra, but when E is left serial this implies that f is a quotient anyway.
Actually the ACV hypothesis is not needed, as infinite equational systems will be seen equivalent to sets of finite ones, where finite choice applies.

Intersections of congruences

On any L-system E, any intersection of equivalence relations is an equivalence relation ; and any intersection of congruences is a congruence.
This can be seen as a case of intersection of stable subsets, as equivalence relations are the stable relations by some structure, and the same for congruences. But here is another viewpoint on this:

Let ∀iI, fi ∈ MorL(E,Fi), P = ∏iI Fi and g = ⊓iI fi ∈ MorL(E, P). Then

g = ⋂iIfi.

If all Fi are partial algebras then P is also a partial algebra, which explains that ∼g is a congruence.
Let G = Im gP. Like any morphism, g will be a quotient to its image G if E is serial and P is a partial algebra, but not necessarily otherwise.
Let ∀iI, hi = πi|G ∈ MorL(G,Fi). Then fi = hig. As with any composite of morphisms, if fi is a quotient then hi is also a quotient.

Minimal congruences

The minimal congruence of any L-system E is the intersection of all its congruences. Let us write it ≃E, or simply ≃ when E is given by context. It is equality if and only if E is a partial algebra.
Let us call condensate of E its quotient by its minimal congruence, and πE the natural surjection to it:

E = E/≃
πE = ≃E : E ↠ ⤓E

(⤓E, πE) is initial in the category of (X,f) where X is a partial L-algebra, f ∈ MorL(E,X), and

Mor((X,f),(Y,g)) = {h ∈ MorL(X,Y) | hf = g}
Mor((⤓E, πE),(Y,g)) = {gE}

The congruence of E generated by any equivalence relation R is the preimage of ≃E/R by R.

In the category of partial algebras, the coproduct is given by the condensate of the disjoint union.

Condensates of injective systems

For any injective L-system (E, tGr ψE), its minimal congruence ≃E is 〈𝛿EL, and ⤓E is injective.

Proof:

The formula R = 𝛿EF*(LR) is equivalent to

R ∈ MorL(E,℘(E)) ∧ ∀xE\Dom ψE, R(x) = {x}
where φ℘(E)(s,u) = {xOD | σ(x)=slx∈∏u}

which determines R when E is a draft.
The formula (*) expresses R ⊂ 𝛿EF*(LR), and is equivalent to the conjunction of
  1. ∀(x,y)∈R, x ∉ Dom ψEx = y
  2. ∀(x,y)∈Dom ψE, xRy ⇒ (xσ y ∧ Im(lx×ly) ⊂ R).
If R is an equivalence relation, 2. says the structure of E/R is injective, while 1. means it has the same variables as E. Thus in a draft, the equivalence relations which fit (*) are those of variables-preserving morphisms to some other draft.
Therefore, the minimal congruence of a draft is its only congruence by which the quotient is a (functional) draft with the same variables ; it is the preimage of = by any variables-preserving morphism to a functional draft.

4.3. Term algebras

Synonymous terms

In any draft, xy is the condition for the terms with roots x and y to be synonymous, i.e. taking the same value f(x) = f(y) for any interpretation f of the draft in a partial algebra.

Terms in different drafts D, E with the same variables, can be so compared by interpreting ≃ in their coproduct DV E. Equivalently, for any functional draft F,

f∈MorL,V(D,F), ∀g∈MorL,V(E,F), ∀xD, ∀yE, xyf(x) = g(y).

This equivalence comes from !: MorL,V(DV E, F) ⇌ MorL,V(D,F) × MorL,V(E,F). Then,

h∈MorL,V(D,E), ∀x,yD, h(x) ≃ x ∧ (xyh(x) ≃ h(y))

because ≃Eh ∈ MorL,V(D, ).
Thus ∀x,yD, the meaning of xy is the same for any sub-drafts of D which x and y are respectively considered elements of, such as the terms Tx and Ty (using h∈MorL,V(TxV Ty, D)).

Term algebras

For any set given set V, a V-ary term L-algebra is an L-system equivalently described as:
  1. An algebraic L-draft D where VD = V;
  2. An injective L-algebra D generated by V where V ∩ Im φD= ∅ (thus D\Im φD = V);
  3. A final L-draft in the sense of variables-preserving morphisms MorL,V ;
  4. An L-algebra with basis V.
As any draft, it is qualified as ground if V = ∅, i.e. (by definition 2.) a ground term L-algebra is an algebra both minimal and injective, and thus also bijective; 4. means it is an initial L-algebra.
Intuitively, a V-ary term L-algebra T represents the set of all V-ary operation symbols possibly defined by L-terms ; any V-ary sL has a copy sT(IdV) ∈ T.

Proof of equivalence. Already 1. ⇔ 2. ⇒ (3. ∧ 4.).
Proof of 3. ⇒ 1. : any final L-draft D is

Let us prove 4. ⇒ 2. after reduction to the ground case by replacing variables by constants (replacing L by LV).
Any initial L-algebra (EE) is :

Term algebras in injective algebras

For any injective L-algebra (EE), any VE \ Im φE, generates a term algebra 〈VLE. In particular the existence of an injective algebra implies that of a ground term algebra.

For any f∈MorL(K,E) from a ground draft K to an injective system E, the system Im f is both injective and minimal, i.e. a ground draft. As with any morphism, if K is serial then Im f is serial; if E is functional then Im f is functional. Thus, if K is a ground term algebra and E is an injective partial algebra then Im f is a ground term algebra, and thus f is injective (by essential uniqueness of initial or final objects).

Precisely to distinguish statements and their respective ranges of validity, Some more complicated statements can be obtained in non-ground cases by applying the above to the language LV.

Particular languages

If L has no constant then ∅ is a ground term L-algebra.
If L only has constants, then ground term L-algebras are the copies of L.
If L only has symbols with arity 0 or 1 then every L-term is functional (this may not be obvious ; we shall not use it later).

Trajectories by transformation sets

For any set of transformations LEE, seen as a set of function symbols interpreted in E,

xE, 〈{x}〉L = {f(x)|f∈〈L{Id,⚬}}
XE, 〈XL = ⋃f∈〈L{Id,⚬} f[X] = ⋃xX 〈{x}〉L

Denoting A = 〈XL, M = 〈L{Id,⚬} and K = {f(x) | (f,x)∈M×X}, the claim is that A = K. As we shall formalize later, both A and K mean the set of all composites of any number of functions in L, applied to any xX.
Proof of AK Proof of KA

Free monoids

A free monoid is a monoid with a basis in the category of monoids. Then its basis is generating (for the same reason as the above case of categories of algebras).

Any unary term L-algebra M (if it exists) is an egg in the category of L-algebras, thus a monoid acting on all L-algebras with MorL ⊂ MorM.

If L is a language of function symbols and M is a unary term L-algebra then L is a basis of M in the category of monoids.
Proof:

Conversely, if L is a basis of a monoid M in the category of monoids, then M is a unary term L-algebra.
Proof:

4.4. Integers and recursion

The set ℕ

An arithmetic is any theory describing the system ℕ of natural numbers. There are diverse such theories, depending on the choice of a logical framework, then of a language and axioms. Somehow, the main framework for arithmetic is second-order logic, but this leads to other versions depending on how second-order logic itself comes down to other frameworks.
Let us start with the set theoretical arithmetic, most precise arithmetic expressed in the framework of a strong enough set theory, interpreting second-order quantifiers by the powerset symbol. This defines "the standard ℕ", or rather its isomorphism class relative to the given universe (it amounts to defining the class of "finite sets", which are measurable by their number of elements, as we shall define in 4.6).

Definition. The set ℕ of natural numbers is a unary term {S}-algebra where S is a function symbol called the successor ; also expressible as a ground term {0,S}-algebra where 0 (zero) is a constant symbol.

The existence of such a system is our first expression of the axiom of infinity. Let us formalize it by inserting arithmetic into set theory, in the form of 3 constant symbols ℕ, 0, S with the following axioms :

0 ∈ ℕ ∧ S : ℕ → ℕ : ℕ is a {0,S}-algebra
(H0) n∈ℕ, Sn ≠ 0 : 0 ∉ Im S
(Inj) n,p∈ℕ, Sn = Sp n = p : S is injective
(Ind) A⊂ℕ, (0∈A ∧ ∀nA, SnA) ⇒ A = ℕ : 0 generates ℕ (induction axiom).

More constant symbols can be defined from there:
1 = S0
2 = S1 = SS0
and so on.

This completes our axioms of set theory up to the strength of Mc Lane set theory, which essentially forms (with the axiom of choice) the most commonly used foundational theory for mathematics. It will imply the existence of term algebras with any language.
This definition of ℕ may be criticized as circular, for involving concepts of algebra we initially introduced using ℕ as set of possible arities of symbols. But these concepts may be re-written up to this point without this use, by restriction to the use of symbols with small arities (0, 1, 2).

Recursively defined sequences

Each n∈ℕ represents a function symbol Sn defined by a unary S-term and interpreted in each S-algebra (E,f) as a function f nEE.

A sequence of elements of a set E, is a family u : ℕ → E. It is called recursive when it is a morphism for a given S-algebra structure on E, i.e. defined by choices of aE and fEE :

u ∈ Mor{0,S}(ℕ, (E,(a,f))) ⇔ (∀n∈ℕ, un = f n(a))
⇔ (u0 = a ∧ ∀n∈ℕ, uSn = f(un)).

Intuitively, each f n(a) abbreviates the term with n occurrences of the function symbol f, and variable a :
f 0(a) = a
f 1(a) = f(a)
f 2(a) = f(f(a))

The sequence (f n) is itself recursively defined by f 0 = IdE ∧ ∀n∈ℕ, f Sn = ff n.
In particular, f 1 = f and f 2 = ff.
Generally for any fEE, gEX, the recursive sequence h : ℕ → EX defined by h0 = g ∧ ∀n∈ℕ, hSn = fhn, is hn = f ng.

Addition

The basic properties of addition and multiplication in ℕ can be seen as obvious consequences of the role of natural numbers as measures ("cardinals") of finite sets, which will be defined in 4.6. But let us prove them independently here as an exercise.
Like any unary term algebra, ℕ forms a monoid (ℕ, 0, +), whose actions are the sequences (f n) for any transformation f.
It is commutative as it is generated by the singleton {1} (which commutes with itself). Thus the side won't matter, but conventions basically present it as acting on the right, defining addition as n+p = Sp(n), or recursively as

n + 0 = n
p∈ℕ, n+S(p) = S(n+p)
n+1 = S(n+0) = Sn

The action axiom ∀fEE, ∀n,p∈ℕ, f n+p = f pf n is deducible like for any term algebra:
(f n+0 = f n ∧ ∀p∈ℕ, f n+Sp = f S(n+p) = ff n+p) ∴ ∀aE, (pf n+p(a)) ∈ Mor(ℕ,(E, f n(a),f)).
The associativity of + (coming as a property of monoids) will be used implicitly, omitting parenthesis.

Inversed recursion and integers

Let -ℕ = {-n | n∈ℕ} (where -n is read "minus n") a copy of ℕ, structured as a partial {0,S}-algebra where 0-ℕ = -0 and S-ℕ is the copy of S-1. As a relational system, it works like ℕ with S transposed. Thus, it has a unique {0,S}-morphism to any {0,S}-system E where 0E is a singleton and SE is both injective and surjective. In particular this holds for {0,S}-algebras (E,(a,f)) where f = SE ∈ 𝔖E, forming the negative recursion

(f n(a))n∈-ℕ ∈ Mor{0,S}(-ℕ, (E,(a,f)))
n∈ℕ, f -n = (f -1)n

giving back the notation f -1 for the inverse of f as a particular case.
Let us define the system of integers as ℤ = ℕ ∪ -ℕ assuming ℕ ⋂ -ℕ = {0} = {-0}, with the union of structures from ℕ and -ℕ. It is easy to see that S ∈ 𝔖, and that ℤ is a {0,S}-algebra and the coproduct of ℕ and -ℕ in the category of partial {0,S}-algebras.
Thus, (ℤ,0) is an egg in the category of {S}-algebras (E,f) where f ∈ 𝔖E. This defines a monoid structure (0,+) on ℤ, extending the one on ℕ, and acting on each such (E,f) by the unique (f n)n∈ℤ ∈ Mor{0,S}(ℤ, (EE, IdE, ⚬(f)). It is also the unique morphism of monoid from ℤ to EE sending 1 to f.
Moreover,
  1. + is commutative.
  2. n∈ℕ, (-n)+n = 0 = n+(-n), thus ∀f ∈ 𝔖E, {f n | n∈ℤ} ⊂ 𝔖E.
Proof of 1. : ℤ is generated by {-1, 1}, while (-1) + 1 = 1 + (-1) = 0 because f ∈ 𝔖Ef -1f = ff -1 = IdE.

Proof of 2. : from ∀n∈ℕ, n+1 = 1+n comes

fEE, f Sn = ff n = f nf
f,gEE, gf = IdE ⇒ ∀n∈ℕ, gnf n = IdE
f∈𝔖E, ∀n∈ℕ, f n∈𝔖E ∧ (f n)-1 = f -n

Each result can also be deduced from the other:
1.⇒2. : by symmetry between ℕ and -ℕ, (-n)+n ∈ ℕ ⇔ n+(-n) ∈ -ℕ.
2.⇒1. from the commutativity of + and a previous result.

Following the French terminology, the subsets ℕ and -ℕ of ℤ will be respectively called the sets of positive and negative integers.

Multiplication

The following definition and properties of multiplication in ℕ also hold in ℤ (where it is an extension of the one in ℕ).
Let us define it as np = (Sp)n(0) (this choice of sides fits common language, unlike the usual one from the literature, until the proof of commutativity removes the distinction). Recursively, the latter coming as (Sx)Sy(0) = Sx((Sx)y(0)).
Then generally, ∀fEE, f np = (f p)n. Proof : n∈ℕ, n⋅0 = 0 (easy to check by induction).
Right distributivity ∀x,y,z∈ℕ, (x+y)⋅z = xz + yz comes by induction on y, or as (Sz)x+y = (Sz)y⚬(Sz)x.
Left distributivity ∀x,y,z∈ℕ, x⋅(y+z) = xy + xz comes by induction on x using commutativity of +.
In particular ∀x,y∈ℕ, xSy = (xy)+x. As the transpose of ⋅ fits the recursive definition of ⋅, both are equal : multiplication is commutative.

An elegant approach to multiplication is to see (ℕ, 1, ⋅) as the monoid End{0, +} ℕ. It will be developed later.

4.5. Presburger Arithmetic

First-order theories of arithmetic

After the formalization of ℕ in set theory, let us review theories of first-order arithmetic, which means first-order theories describing ℕ as their only type, and interpreting induction as a schema of axioms by second-order universal elimination (thus with A only ranging over classes of that theory). First-order theories with 2 types ℕ and ℘(ℕ), called second-order arithmetic, will be studied in Part 5.

Unlike the set theoretical framework, first-order logic cannot define recursive sequences for arithmetic from the mere successor function. In particular it cannot proceed the recursive definitions of addition and multiplication, as will be clear from the description of non-standard models (4.9). So, different choices of language give non-equivalent versions of first-order arithmetic, with non-equivalent versions of the schema of induction:

Presburger arithmetic

Its usual presentations have primitive symbols 0, +, and either 1 or S which are definable from each other (1 = S0 and Sx = x+1) ; actually + would suffice to define all.
Among its equivalent formalizations, a very short one which superficially looks weakest, describes the set ℕ* of nonzero natural numbers, with symbols 1 (constant) and + (binary operation), and axioms :
x,y∈ℕ*, x + (y+1) = (x+y) + 1 (A1) : + is associative on 1
A⊂ℕ*, (1∈A ∧ ∀x,yA, x+yA) ⇒ A=ℕ* (Min)
x,y∈ℕ*, x + y y (F)

The axiom (Min), expressed for simplicity in set theory where it makes 1 a generator of (ℕ*,+), is then reduced to a first-order axiom schema by second-order universal elimination. It will be seen equivalent to the axiom of induction, both set theoretically and as axiom schemas.

(A1) is a particular case of the associativity axiom (As).

Conversely, (A1 ∧ Min) ⇒ (As) : (As ∧ Min) ⇒ commutativity, as deduced from 1∈C({1}).

Now take ℕ = ℕ*∪{0} where 0∉ℕ*, to which + is extended as ∀n∈ℕ, 0+n = n+0 = n. This extension is still commutative and associative.
With S defined as xx+1, obviously (H0) holds. As x+0 = x ≠ 0,

(F) ⇔ (∀x∈ℕ*, ∀y∈ℕ, x+y y)

(Inj ∧ Ind ∧ A1) ⇒ (F) because ∀x∈ℕ*, {y∈ℕ | x+yy} ∈ Sub{0,S}ℕ .
The proof of the converse (F ⇒ Inj) using other axioms, will come from the study of the order relation.

Equivalence of axiom schemas

Let (Ind1) be the statement ∀A⊂ℕ*, (1∈A ∧ (∀xA, x+1 ∈A)) ⇒ A=ℕ*.
(Ind1) ⇒ (Ind) and (Ind1) ⇒ (Min) are obvious
(Ind) ⇒ (Ind1) either by using A∪{0}, or {x∈ℕ|x+1∈A} once noted that (Ind) ⇒ Im S = ℕ*.
Proof of (A1 ∧ Min) ⇒ (Ind) in set theory (ignoring our previous definition of ℕ): Proof of (As ∧ Min) ⇒ (Ind) directly convertible to first-order logic: So, all possible axiom pairs are equivalent: (A1 ∧ Min) ⇔ (As ∧ Min) ⇔ (As ∧ Ind) ⇔ (A1 ∧ Ind), and imply commutativity.

The order relation

In Presburger arithmetic, let us define binary relations ≤ and < as

x<y ⇔ ∃z∈ℕ*, y = x+z
xy ⇔ ∃z∈ℕ, y = x+z

(A1 ∧ Ind) implies
  1. < is transitive
  2. x,y∈ℕ, xy ⇔ (x<yx=y)
  3. x∈ℕ*,∃y∈ℕ, x = y + 1 (ℕ is a surjective {0,S}-algebra)
  4. x,y∈ℕ, x<yx+1≤y
  5. A⊂ℕ, A≠∅ ⇒ ∃xA, ∀yA, xy (meaning a schema of formulas)
  6. x,y∈ℕ, xyyx
  7. x,y,z∈ℕ, x<yx+z < y+z
Proofs :
  1. using (As), x < y < z ⇒ (∃n,p∈ℕ*, z = y+p = x+n+p) ⇒ x < z.
  2. obvious from definitions;
  3. by (Ind)
  4. from 3.
  5. xy ⇒ (x+1≤yx=y)
    0∈{x∈ℕ |∀yA, xy}=B
    xB, x+1∈BxA
    AB = ∅ ⇒ (B = ℕ ∴ A = AB = ∅)
  6. from 5. with A={x,y}. Or using 3.,
    A = {x∈ℕ | ∀y∈ℕ, x<yx=y y<x} ⇒ (0∈A ∧ ∀xA, x+1 ∈ A)
  7. y = x+ny+z = x+z+n
(A1 ∧ Ind ∧ F) implies Proof. (F) gives irreflexivity ; other conditions for a strict total order come from 1. and 6.
There must be one true formula among (x<y), (x = y), (y<x), which by 7. respectively imply (x+z<y+z), (x+z = y+z), (y+z<x+z). But only one of the latter can be true, thus each implication must be an equivalence. Cancellativity on one side extends to the other side by commutativity. ∎

The smallest element of any nonempty A ⊂ ℕ is denoted min A.

In the set theoretical ℕ, this order coincides with the order between terms with either language {S} or {0,S}. (The proof is easy).
Also, for each n ∈ℕ, the set <(n) ⊂ ℕ can be seen as recursively defined by
¬(n<0) ∧ ∀x∈ℕ, n < Sx ⇔ (x = nn < x)
which can also be written
x∈ℕ, x ∈ ≤(n) ⇔ (x = n ∨ ∃y∈ ≤(n), Sy=x).
(Its existence in ℘(ℕ) can be deduced by induction on n, more simply than other recursions)

Parity

Presburger arithmetic can qualify a number n∈ℕ as even by ∃x∈ℕ, n = x+x; and odd by ∃x∈ℕ, n=x+x+1.
Obviously, if n is even then n+1 is odd; if n is odd then n+1 is even thanks to commutativity. Then it can be shown by (Ind) that any n must be either even or odd, and by also using (Inj) that it cannot be both and that this x is unique.

Taken as primitive with a relevant set of axioms, the order relation can form a variant of first-order arithmetic more expressive than bare arithmetic, but less than Presburger arithmetic, as it suffices to define S but cannot suffice to define + (nor parity) and cannot be defined from S in first-order logic.

Images of recursive sequences

In set theory with axiom of infinity, the image of a recursive sequence u = (f n(a))n∈ℕ by a transformation fXX is a trajectory Y = 〈af = Im u of an action of (ℕ,0,+) on E.
As + is commutative, Y gets a monoid structure with identity a and generator f(a), that is a quotient of ℕ.

The injectivity of u (that would make it an isomorphism from ℕ to Y) is equivalent to the injectivity (H0) ∧ (Inj) of the {0,S}-algebra structure of Y.
If u is not injective, let y = min{z∈ℕ | ∃x<z, ux = uz}. Then the restriction of u to <(y) is bijective to Y (its image is Y because it is {0,S}-stable). Then there is a unique x<y such that ux = uy, and this data of x and y determine the isomorphism class of Y.

A cycle of a transformation fXX, is a non-free trajectory Y = 〈af for some aX, such that, equivalently (using the above defined x and y),

Then this y, independent of the chosen generator a of Y, is called the period of Y.
If f is a permutation then every cycle of f is also a cycle of f -1 with the same period.

4.6. Finiteness and countability

To review equivalent forms of the axiom of infinity, we need to work without assuming it.

Cardinals

A set A is said to be smaller (or of smaller size or of smaller cardinality) than a set B, if there exists an injection from A to B. This predicate is a preorder on the class of sets (as injections form a category on the class of sets).
By the Cantor-Bernstein theorem, the equivalence predicate it defines on the class of sets coincides with the predicate ≈ of equinumerosity between sets, defined by the existence of a bijection : AB ⇔ ∃fBA, f : AB.
Equinumerous sets are said to "have the same cardinal". The cardinal |A| of any set A, intuitively meant as the equivalence class of A by ≈ in the class of all sets, can be formalized for the modest use we shall make of that concept, as A itself where any direct or indirect use of "equality" between cardinals (never mixed with other elements) actually refers to the use of ≈. This practice is a particular case of development of a theory (here, set theory), as will be detailed in 4.11.

The {0,S}-structure on powersets

Let us give to the class of sets the meta-structure with language Σ = {0,S} first introduced to define ℕ, defined by

0 = {∅}
S(A,B) ⇔ ∃xB, A = B\{x}.

By restriction, this gives a Σ-structure to the powerset ℘(E) of any set E.

The quotient Σ-system KE = ℘(E)/≈ is functional, surjective and injective, and Dom SKE ∪ {|E|} = KE.

If FE then the natural injection from KF to KE is a Σ-embedding (like any injective morphism from a surjective system to an injective one). Let us abusively treat these natural injections as inclusions (KFKE), which means to let these sets play the role of sets of cardinals. For any family of considered sets, seeing them as subsets of their union lets us proceed the study of their cardinals by this tool.

Finite cardinals

For any set E, the set MinΣ KE is called the set of finite cardinals from KE.
This predicate of finiteness for cardinals in KE is independent of E. Proof:

Any infinite cardinal is greater than any finite one. Proof:

Axiom of infinity

In set theory with the powerset, the following statements are equivalent expressions of the axiom of infinity:
  1. There exists an injective algebra whose language has at least a constant and a non-constant.
  2. There exists a non-surjective injection of a set to itself; equivalently an injective {0,S}-algebra, or a set ℕ.
  3. There exists an infinite set.
  4. For any countable algebraic language there is a (countable) injective algebra
  5. Any consistent first-order theory with a countable language has a (countable) model

4. ⇒ 2. ⇒ 1.; for 1. ⇒ 2. : from a non-constant operation s there, xs(kx) is an injective, non-surjective transformation.

We shall prove 2. ⇔ 3. below (finite cardinalities), and 2. ⇒ 4. ⇒ 5. in the section on the completeness theorem.

We may regard 5. ⇒ 4. as relative to the natural assumption that the concept of injective algebra forms a consistent theory. This theory is essentially the part of set theory which describes tuples. So, if a language L is included in a model of that simple part of set theory, then there exists an injective L-algebra. If such a model U is standard, this takes the form LUU, making (U, IdLU) an injective L-algebra. ∎
The infinity of ℕ easily implies the uniqueness of the number of elements of a finite set, as its non-uniqueness would give that set a strict injection into itself, and thus an injection of ℕ into it.
2. ⇒ 3. will come from the infinity of ℕ.
For 3. ⇒ 2. we shall pick a different set, as the existence of an injective {0,S}-algebra structure on any infinite set would need the axiom of choice.

More properties of finite sets

Between finite cardinals, the order as cardinals coincides with the natural order in ℕ.

If E is finite, e∈ Min KE, and KE is a ground term with root e representing the number of elements of E
The set of finite subsets of E can be equivalently defined as MinΣ ℘(E). A set is finite if it is a finite subset of itself (otherwise it is infinite).
For any sets EF, (E is finite) ⇔ (E is a finite subset of F).

Equivalently, a set E is finite if it has a bijection to the graph of S in a ground Σ-term n (which can be seen as a unary {S}-term). This may as well be seen as a bijection to its domain {0,...,n-1} or to its image {1,...,n}.

Assuming the axiom of infinity, the finiteness of a set E is equivalent to the existence of a number n∈ℕ and a bijection from Vn = {x∈ℕ | x<n} to E; or a surjection from Vn to E, which is then bijective precisely when n takes its smallest value (the number of elements of E).

Without ℕ, either ℘(℘(E)) or (with more work) EE can be used to express the finiteness of a set E, and to rebuild ℕ from E if E is infinite.

Any left cancellative element of a finite monoid is invertible. Thus any finite cancellative monoid is a group.
Proof: it acts as an injective transformation of a finite set, thus a permutation, by which the preimage of e is a right inverse, thus an inverse.

Countable sets

It is easy to check that any condensed ground Σ-draft is either ∅ or a ground Σ-term or a ground term Σ-algebra (only two other languages have this property, apart from those with no constant).
A set is called countable if, equivalently,

  1. It has an injection to a condensed ground Σ-draft
  2. It has a bijection with a condensed ground Σ-draft
  3. A condensed ground Σ-draft has a surjection to it
1.⇒2. by restricting the order, once condensed ground {0,S}-drafts are described from their order structure; another method is given below.
2.⇒(1.∧ 3.) ; 3.⇒1. without AC, by picking the smallest element of each preimage.

So, countable sets are either finite or in bijection with ℕ. The latter, on which there exists a possible structure of ground term {0,S}-algebra, will be called countably infinite. However we still need to prove that they are indeed infinite.

Countability of ℕ×ℕ
y\x
0
1
2
3
4
0
0
1
4
9
16
1
2
3
5
10
17
2
6
7
8
11
18
3
12
13
14
15
19

A natural bijection between ℕ×ℕ and ℕ is defined by (x,y) ↦ (y < x ? xx + y : yy + x + y).
y\x
0
1
2
3
4
0
0
1
3
6
10
1
2
4
7
11
2
5
8
12
3
9
13
Another bijection b2 : ℕ×ℕ → ℕ is defined as b2(x,y) = Tx+y+y from the sequence (Tn) of triangular numbers defined by 2⋅Tn = n⋅(n+1) (see properties of parity) or equivalently by T0=0 ∧ ∀n∈ℕ, Tn+1=Tn+n+1. This makes ℕ×ℕ a ground term (0,S)-algebra where
0=(0,0)
S(0,y)=(y+1,0)
S(x+1,y)=(x,y+1)
which recursively defines b2-1(z), where x+y is the only n such that Tnz<Tn+1.

Countability of finite sequences of integers

If there is a surjection f from ℕ to a set E then E is countable, as an inverse injection can be defined by associating to each xE its smallest preimage by f.

Fixing a bijection b2 from ℕ2 to ℕ, for any integer n>0 we can recursively define a bijection bn:ℕn→ℕ, for example:
b1=Id
n>0,∀u∈ℕn, bSn(u) = b2(bn(u),un)
(which gives back b2 as the particular case of bn for n=2).
Also the set ℕ(ℕ) = {u∈ ℕ | ∃k∈ ℕ, ∀n>k, un=0} is countable. There are different ways to see it:

But this construction of ℕ(ℕ)) and similarly simple candidate expressions of H as a countable set, assumes recursion, thus cannot be used to define recursion by the above method. To actually provide a definition of recursion that does not assume it, requires a different expression of an enumerated H. The representation of sequences which was used by Godel as a basis to define recursion, actually forms a different set.

4.7. The Completeness Theorem

Existence of countable term algebras

For any countable algebraic language L with at least a constant and a non-constant, finding a countable ground term L-algebra (necessarily infinite) amonts to defining a ground term L-algebra structure on ℕ, which is a bijection from Lℕ to ℕ.
Splitting L as CD where C is the set of constants, and D is the rest of symbols, we have Lℕ = C∪(Dℕ). But C and D are countable (either finite or infinite), and Dℕ is a union over D of disjoint sets with explicitly definable bijections with ℕ (such as the bns we saw). In any case, a bijection between C∪(Dℕ) and ℕ can be found without problem.
In practice, such bijections from Lℕ to ℕ happen to be ground term L-algebra structures because ∀(s,x)∈Dℕ, ∀i<ns, xi<s(x) which would be contradicted by the smallest element outside MinL ℕ. ∎

Interpretation of first-order formulas

Trying to extend the formal construction of the interpretation of terms in algebras, to the case of formulas interpreted in systems, the difficulty is to cope with the interpretation of quantifiers (or generally binders, if we wish to still generalize).
A possibility, is to switch to a view over all variables as bound throughout the formula: from the concept of interpretation hvET of a term T in an algebra E for every interpretation v of its set V of variables, the family (hv)vEV is re-curried into a function from T to EEV. So, a first-order formula interpreted in a system E can be understood as a term interpreted in an algebra with maybe two base types : the sets OpE and RelE of all operations and all relations in E; or split as two sequences of types, OpE(n) and RelE(n). We might not need the full sets of these, but at least, an algebra of these (a subset stable by all needed logical operations).
We took unions over ℕ for the case we would need to see "all possible formulas" as terms interpreted in one same algebra.

The Completeness Theorem

Let us finally prove the Completeness theorem previously commented in 1.1, 1.9, 1.10, 1.B and 1.C.

Theorem. First-order logic has a proving system both sound and complete in the following equivalent senses

Sketch of proof of the first statement (implicitly suggesting a possible form of a proving system, but ignoring efficiency issues; the axiom of choice is not used and is anyway out of topic):

Reduce T to a single-type theory T1 simulating the use of types by relevant unary predicate symbols and axioms, but without requiring all objects to belong to some type. Thus, the axiom (∃x, 1) can be added to T1 without harm (it still allows types to be empty).
Then re-write axioms of T1 in prenex form, that is chains of quantifiers applied to quantifier-free formulas (using equivalences from 2.5 and 2.6 simplified for a nonempty model).
Interpret the symbol = in axioms as an ordinary predicate symbol, with the axioms of equality [1.11].

Replace each occurrence of ∃ in an axiom by the use of an additional operator symbol K, for example

(∀x,x', ∃y, ∀z, A(x,x',y,z)) ⇢ (∀x,x',∀z, A(x,x',K(x,x'), z)).

Let M be the ground term algebra over the language of operator symbols enriched as just described. Reinterpreting all ∀ in axioms as the use of axiom schemas (one axiom for each replacement of variables by a tuple of elements of M), gives a propositional theory T2 (its axioms are composed of logical connectives over Boolean variables), whose set of Boolean variables is PM where P is the set of predicate symbols in the theory plus the "equality" symbol.
Observe that T2 is still consistent, as any contradiction in T2 (= finite set of axioms not satisfied by any tuple of Boolean values of their variables) would provide a contradiction in T1, as follows:
From all subterms occurring in used axioms of T2, list without repetition all those whose root is a symbol S which was added to the language when converting some axiom ∀x,∃y,∀z,∃u, A(x,y,z,u) of T1, into the axiom ∀x,z, A(x,K(x),z, S(x,z)) in T2. Successively replace them all by variables in T1: for terms S(t0,t1) and S(t0,t2) (where t0, t1, t2M), the reasoning in T1 will say
"Let y such that ∀z,∃u, A(t0,y,z,u);
let u1 such that A(t0,y,t1,u1);
let u2 such that A(t0,y,t2,u2);"
and use the contradiction in T2 where y replaces K(t0), u1 replaces S(t0,t1) and u2 replaces S(t0,t2).
Recursively by a chosen enumeration of PM as a countable set, add one by one to the axioms, each of these Boolean variables if it is consistent with previously accepted axioms, so that all get values without contradiction.
Then the quotient of M by the equivalence relation of "equality", forms a model (with the "true equality"). ∎

As this construction depends on an arbitrary order between formulas, different choices give different models, which may be non-isomorphic and even have different properties reflecting the undecidability of the theory's formulas.

Deduction of the second statement from the first :
TA
T ∪ {¬A} is inconsistent
T ∪ {¬A} has no countable model
A is true in all countable models of T.  ∎

Skolem's Paradox

Comparing the results of the completeness theorem with Cantor's theorem gives this paradox :
If set theory is consistent then it has countable models, though it sees some sets there, such as ℘(ℕ), as uncountable. Rigorously speaking, when the powerset axiom says that the class of subsets of E is a set written ℘(E), it can only determine the interpretation of the functor ℘ relatively to (depending on) the universe where its open quantifiers are interpreted: «all subsets of E existing in this universe are in ℘(E)». In the other interpretation of what it means for a class to be a set, this would mean that the universe already contains "really all" subsets of E, forming the supposedly "true" set ℘(E) which will stay fixed in any further expansion of the universe. However, for the powerset of an infinite set, this wish cannot be expressed in first-order logic nor any other conceivable mathematical formalism: there is no way to talk about meta-sets that "cannot be found" in this universe but would «exist elsewhere» (in any mysterious bigger universe) to exclude them from there. (Only one aspect of this idea can be formalized as the axiom schema of specification).

4.8. More recursion tools

Rebuilding recursion

Let us work in a first-order theory just assumed to have 3 types ℕ, E and HE, the language and axioms of arithmetic for ℕ with the schema of induction over expressible subsets of ℕ, and the axiom that H provides "all finite sequences"

n∈ℕ, ∀uH, ∀yE, ∃vH, vn=y ∧ ∀k<n, vk = uk

The following existential result has a simple proof by induction making no use of uniqueness:

R⊂ℕ×E2, ∀n∈ℕ, (∀i<n, ∀xE, ∃yE, R(i,x,y)) ⇒ ∀aE, ∃uH, u0=a ∧ ∀i<n, R(i,ui,ui+1).

Its simplicity only goes for this case, only generalizable to conditions of the form R(n,(uk)k<n,un), while the case of algebraic terms is also true but more difficult to prove.
As a particular case, comes the finite choice theorem written as

R⊂ℕ×E, ∀n∈ℕ, (∀i<n, ∃yE, R(i,y)) ⇒ ∃uH, ∀i<n, R(i,ui).

With fEℕ×E the restriction of such u to numbers ≤ n is also unique by induction, from which the xE such that x0=a ∧ ∀n∈ℕ, xn+1 = f(n,xn), can be defined by its graph

{(n,un) | (n,u)∈ℕ×H, u0=a ∧ ∀i<n, ui+1 = f(i,ui)}

As H contains any finite sequence expressible in the theory, this definition of recursion turns out to be "the recursion which the theory can express" independently of the particular choice of H.
The construction of ℕ(ℕ) from the last section, and similarly simple candidate expressions of H as a countable set, assumes recursion, thus cannot be used to define recursion by the above method. To actually provide a definition of recursion that does not assume it, requires an independent expression of an enumerated H, which is harder but finally possible. Such a construction was achieved by Godel as part of his work to prove the incompleteness theorem.

Another proof of recursion

By seeing morphisms as subalgebras, we can write another construction of recursive sequences u∈Mor(0,S)(ℕ,(E,a,f)), as follows.

Let M be the minimal subalgebra of ℕ×Ea,f, and let A={n∈ℕ | ∃!xE, (n,x)∈M}.
As M is a minimal (0,S)-algebra, M = {(0,a)}∪ Im SM.
Substituting this into the definition of A we get
p∈ℕ, pA ⇔ (∃!yE, (p=0 ∧ y=a)∨∃(n,x)∈M, (p=Sny=f(x))).
From 0 ∉ Im S we get 0∈A, and
n∈ℕ, SnA ⇔ ∃!yE, ∃(n',x)∈M, (Sn=Sn'y=f(x)).
From the injectivity of S we get
n∈ℕ, SnA ⇔ ∃!yE, ∃xE, (n,x)∈My=f(x).
Thus (∀nA, SnA), so that A = ℕ, i.e. M is the graph of a function uE. As M ∈ Sub(ℕ×Ea,f), we conclude u ∈ Mor(0,S)(ℕ,Ea,f).

Injectivity lemma

If E is a surjective partial L-algebra and F is an injective L-system then ∀f ∈MorL(E,F),
  1. A = {xE | ∀yE, f(x) = f(y) ⇒ x=y} ∈ SubLE.
  2. B = {yF | !: f(y)} ∈ SubLF
    Thus if more precisely E, F are algebras, {yF | ∃!: f(y)} ∈ SubLF.
While they are almost the same, let us write separate proofs.
The condition f ∈MorL(E,F) is read Im f ⊂ Dom ψFLf|Dom φE = ψFf⚬φE.
  1. xLA⋂Dom φE, ∀yE, ∃z∈φE(y),
    fE(x)) = f(y) ⇒ Lf(x) = ψF(fE(x))) = ψF(f(y)) = ψF(fE(z))) = Lf(z)
    x = z ⇒ φE(x) = y.
  2. yF*(LB), ψF(y) ∈ LB ∴ (!z∈Dom φE, ψF(y) = Lf(z) = ψF(fE(z))))
    ∴ !x∈Im φE=E, y = f(x).∎

A more general form of recursion

Some useful sequences need recursive definitions where the term defining uSn uses not only un but also n itself. Somehow it would work all the same, but trying to directly adapt to this case the proof we gave would require to define some special generalizations of previous concepts, and specify their resulting properties. To simplify, let us proceed another way, similar to the argument in Halmos's Naive Set Theory, but generalized.
For any algebraic language L, let us introduce a general concept of "recursive condition" for functions u : EF, where, instead of a draft, E is first assumed to be an L-algebra (then a ground term algebra to conclude).
The version we saw was formalized by giving the term in the recursive definition as an L-algebra structure on F, φF: LFF, then expressing the request for u to satisfy this condition as u∈Mor(E,F), namely

∀(s,x)∈LE, u(sE(x)) = φF(s,ux).

Let us generalize this as u(sE(x)) = φ(s,x,ux) which by the canonical bijection Dom φ ≡ ∐sL Ens ×Fns ≡ ∐sL (E×F)ns = L⋆(E×F) can be written using h : L⋆(E×F) → F such that ∀(s,x,y)∈ Dom φ, φ(s,x,y) = h(s,x×y), as

u(sE(x)) = h(s,x×(ux)).

As ∀uFE, x×(ux) = (IdE×u)০x, this becomes the second component of the formula IdE×u ∈ Mor(E, E×F) when giving E×F the structure φE×F = (φE০πLh.
The first component (φE০πL) we give to φE×F, makes π∈ Mor(E×F, E) and makes tautological the first component of the formula IdE×u ∈ Mor(E, E×F), namely
IdE(sE(x)) = φE(s,x) = (φE০πL)(s,x×(ux)).
It is then possible to conclude by re-using the previous result of existence of interpretations:
If E is a closed term L-algebra then ∃!f ∈ Mor(E, E×F), which is of the form IdE×u because π০f ∈ Mor(E, E) ∴ π০f = IdE.
But one can do without it, based on the following property of this L-algebra E×F:

uFE, IdE×u ∈ MorL(E, E×F) ⇔ Gr u ∈ SubL(E×F)

Indeed the defining formulas of both sides coincide. To see it otherwise, This reduces the issue to the search of subalgebras of E×F which are graphs of functions from E to F.
Now if E is a ground term L-algebra then M = MinL(E×F) is one of them because π|M∈ MorL(M, E) from a surjective algebra to a ground term algebra must be bijective.
Any other subalgebra of E×F must include M, thus to stay functional it must equal M. ∎

4.9. Non-standard models of Arithmetic

Standard and non-standard numbers

Since the induction axiom needed to determine the isomorphism class of ℕ depends on ℘(ℕ), the relativity of ℘ allows a diversity of non-isomorphic models of arithmetic respectively defined by the diverse models of any consistent first-order theory T which contains arithmetic. Many theories may be picked in the role of T, such as a set theory seen as a first-order theory, thus involving other notions than numbers (to define more sets of numbers than arithmetically defined classes). All we shall assume here is that it has a notion N of "natural number" with a constant 0, a function symbol S, and axioms making N a model of bare arithmetic (in any model of T), thus a bijective {0,S}-algebra.

A number in N is called standard if it belongs to the minimal subalgebra N0 = Min{0,S}N, isomorphic to ℕ by the unique {0,S}-morphism (embedding, nSNn(0N)) from ℕ to N. So, it represents an element of ℕ, which may be called a meta-number.
A model N of arithmetic is called standard if it is a minimal {0,S}-algebra, which can be equivalently written in 3 ways:

We shall abbreviate some works of first-order logic by set theoretical notations as follows :

Existence of non-standard models

The non-standardness of N is expressible by axioms on an additional structure, forming an extended theory T'. One way is to add If T is consistent then T' is also consistent (and cannot prove more), as any finite list of axioms of T' can be satisfied in any model of T (by interpreting k as a different number). Thus, T' also has a model, which is a model of T where N is non-standard.

This means that no consistent extension of arithmetic can ever force its model of arithmetic to be standard, i.e. require induction to apply to the full powerset of N, which it cannot ensure to exhaust by its classes. Candidate "standardness" predicates may be introduced but still cannot be forced to coincide with the true standardness (N0), which may stay a meta-set beyond all classes.

Any model of arithmetic obtained as described in the proof of the completeness theorem from any consistent extension of arithmetic, will necessarily be non-standard, as it cannot even be elementarily equivalent to ℕ according to the truth undefinability theorem. Skolem's paradox still holds in two ways: Non-standard models behave much like standard ones, as they satisfy all consequences of induction. Their precise range of diversity depends on the chosen theory of arithmetic.

Non-standard models of bare arithmetic

The only "addition" definable from S, is the meta-operation adding any n∈ℕ to any xN as x + n = Sn(x), that is the meta-sequence of functions (Sn)n∈ℕ each defined in the theory by a unary S-term n. The consequences of the induction schema in bare arithmetic can be summed up as As the restriction of S to the meta-set N\N0 of non-standard numbers of any model is a permutation, N\N0 is a -set. The last formula obliges this ℤ-set to be free. Conversely, the disjoint union of ℕ with any free ℤ-set, forms a model of bare arithmetic.
Applying both ways the definition of the order from the partial meta-operation of addition, gives two ordering results with non-standard numbers: Models of arithmetic with order, are formed as models of bare arithmetic with any choice of a total order on the partition of non-standard numbers in ℤ-orbits. Thus, bare arithmetic cannot suffice to define the order. But arithmetic with order cannot suffice to define addition either, as its non-standard models may either admit no or many corresponding interpretations of addition.

Non-standard models of Presburger Arithmetic

Such models satisfy all theorems of Presburger Arithmetic. In particular they have a well-defined total order, by which any non-standard number is greater than any standard number, and any non-empty class of numbers has a smallest element (which the meta-set of non-standard numbers hasn't).

Here are some more details only for the sake of illustration, which may be skipped.

There is also a meta-operation (sequence of functions) of multiplication of an xN by a standard number n∈ℕ : nx = x+...+x (with n occurrences of x)
Beyond commutativity, associativity and the seen properties of the order, the last independent consequence of the axiom schema of induction constraining non-standard models, is the possibility of Euclidean division by any nonzero standard number (generalizing the results on parity) :

d∈ℕ*, ∀xN, ∃qN, ∃r<d, x = dq + r

thanks to ∀d∈ℕ*, ∀qN, d⋅(q+1) = dq+d which is a schema of theorems in Presburger arithmetic.
Moreover this (q,r) is unique; q = x:dN is called the quotient and r is called the rest of the division of an xN by a d∈ℕ*, so that

qN, q=x:ddqx< d⋅(q+1) ⇔ ∃ r<d, x = dq + r.

The concept of model of Preburger arithmetic generated by a set (of non-standard numbers, since standard ones have no generating effect), is defined as that of subalgebra generated by a subset for the "algebra" (not exactly an algebra, but...) with language completed, like in the proof of the completeness theorem, by additional operation symbols reflecting all existence properties deduced from the axiom schema of induction:

They can be conceived either in the abstract (by evaluating relations arbitrarily, like in the proof of the completeness theorem) or as a subset of a given model (interpreting expressions there).
In particular, in any model of Preburger arithmetic generated by a single element (non-standard kN), the set of non-standard elements is the set of all values of expressions of the form
(ak):d +b
where a,d ∈ℕ* and b∈ℤ (the cases where a,d are relative primes suffice).
The predicate of divisibility of xN by a d∈ℕ*, is defined as the case when the rest cancels: d|x ⇔ (∃qN, x = dq)

The possible shapes of these models with respect to k (classes of isomorphisms preserving k, described using it), are classified by the sequence (rd)d∈ℕ* of rests of the division of k by all standard numbers d. The possible sequences are those which satisfy not only rd<d but also the compatibility formulas : n,d ∈ℕ*, ∃h, rdn = dh + rd (where in fact h<n). The simplest one is where all rd are 0, i.e. where k is divisible by every standard number (but the distinguishing property of this isomorphism class of models with 1 generator, "there exists a number divisible by every standard number", is inexpressible in Presburger arithmetic).

The non-standard models generated by 2 non-standard numbers k,k' can be split into the following classification depending on what may be intuitively described as the (standard) real number which k/k' is infinitely close to:

4.10. Developing theories : definitions

Given a theory T (with types, structures and axioms), a development of T is a extended theory T' = T∪(pack of extra components) which remains «equivalent» to T.  This concept has 3 aspects:
There are 3 levels of development with their respective schemes:

The use of a theorem as an available axiom for the proofs of the next theorems, abbreviates a repetition of its proof as a part of the next proofs (with possible substitutions of variables by terms to which the theorem is applied).
When inserting theories in the set theoretical framework, constructions become set theoretical definitions, which is why the distinction between definitions and constructions is not usually clarified in the literature.

The Galois connection (Mod,Tru)

For any language L, the truth relation between ground formulas and systems defines a Galois connection (Mod,Tru), where Mod gives the class of models of any set A of ground formulas taken as axioms, while Tru gives the set of true formulas which all given systems have in common. Replacing the class of all systems by the set of all L-structures on ℕ, leaves unchanged and thus gives definite sense to the closure Tru০Mod. All theorems of a theory belong to the closure Tru০Mod(A) of axioms, which is the condition for a formula to leave unchanged Mod A when it is added to A.

Schemes of definitions

We have 3 schemes of definitions:

Extending models by undefined structures

Thanks to the completeness theorem, if every model of T can be extended as a model of a T' = (T∪{R,A}) where R is a new symbol and A is a new (set of) axiom(s), then every ground formula of T provable in T', is also provable in T.
The converse may be wrong: a theory T may be extensible to a theory T' unable to prove more of its formulas, but to which not all its models are extensible. Here are some examples from non-standard models of arithmetic :

Definitions extend models

For any theory T with a model E, extensions T' of T by an additional axiom and/or structure symbol may have different effects: The uniqueness of the extension for every E, means that in the doubly extended theory (T∪{R,A,R1,A1}) where (R1,A1) is a copy of (R,A), one can prove

∀(variables), RR1

Definitions preserve sets of isomorphisms

For any isomorphism f between models E and F of T, with respective extensions as models E' and F' of a theory T' extending T by further structures and axioms, f may stay or not as an isomorphism between E' and F', i.e. preserve the interpretation of the new structure(s). It will be an isomorphism whenever the additional axiom(s) form a definition, determining the new structure(s) and thus making it invariant by isomorphisms. Thus, adding a defined structure to a theory preserves the sets of isomorphisms between its models.
In particular, any relation defined by an expression with language LR without parameters is invariant by automorphisms, so is in its closure Inv(AutL E). This means that adding it to L preserves AutL E.
As we saw how the switch from models to isomorphisms provides similarities between development levels from proofs to definitions, we shall now see it also providing similarities from definitions to constructions, making sense to how constructions also essentially preserve theories.

4.11. Constructions

Diverse construction schemes let new types play the roles of sets defined in set theory by operations between sets named by previous types: Each of the following schemes suffices to produce both others through the use of previous ones: The last one can be obtained by composing constructions of products (to collapse bound variables as one, and the same for parameters), a definition and a type of unary relations or functions (either as particular relations, or by adapting the form of axioms, replacing ⇔ by = ).

A development scheme at each level looks like a component at the next level

Despite their differences of nature, the 3 levels of development have remarkable similarities:

How constructions preserve isomorphisms

From any theory T, consider any extension T' = TXRA where X is a type, R is a pack of structures and A is a pack of axioms. The existence of an extension of a given model of T as a model of T', is a matter of A being small enough for these X and R. Its uniquess would not make direct sense as any identical copy would also fit. But extensions may still be qualified as «essentially unique» (being all copies of each other) giving sense to how constructions preserve models, depending on the set of isomorphisms between them. For any models E' and F' of T' respective extensions of models E and F of T, and any isomorphism f : EF, Indeed in this case, any 2 isomorphisms g,h from E' to F' with the same restriction to E, will be equal because
The automorphism h−1g of E' extended by IdX' forms an automorphism of the model (E'E'') of T''. As j is invariant, it stays equal to its image j০(h−1g)−1 by this automorphism. Therefore h−1g = IdE' , i.e. g=h.
What we can see here strictly preserved by constructions, is the class of isomorphisms as an abstract category. In particular, constructions leave unchanged the automorphism groups seen as abstract groups, providing more types on which they act.