Set Theory and Foundations of Mathematics
2. Set theory
2.1. First axioms of set theory
The inclusion predicate
The binary predicate ⊂ of inclusion
between sets is defined by : for all sets E and F,
E ⊂ F ⇔
∀x∈E, x ∈ F
and read as
"E is included in F", or "E is a subset of F", or "F
includes E". Properties of inclusion between classes apply.
E ⊂ E is logically valid.
Implications chains
also appear as inclusion chains:
(E ⊂ F ⊂ G) ⇔ (E ⊂ F
∧ F ⊂ G) ⇒ E ⊂ G.
We may use the same notation E ⊂ F as abbreviation for the inclusion
of a set E in a class F (and similarly for other formulas defined from this):
E ⊂ F ⇔ ∀x∈E, F(x)
Formulas vs statements
Most set theories (except mainly FST
and some strong versions) will only accept
bounded formulas as
sub-formulas of terms (by the set-builder, and later by the conditional operator) and as
possible definitions of predicates (what predicate symbols abbreviate).
Open quantifiers will be only allowed in statements (declarared true as axioms
or theorems).
In set theory, a statement is a ground formula which can
combine the symbols of first-order logic with the regular ones of set theory as follows:
it must be made of a chain of open quantifiers, usually all ∀ and often written in words
("for all"), followed by a bounded formula. Proofs will naturally use the deduction rules
for open quantifiers (introduction and elimination) by common language articulations.
The above definition of ⊂, basically claiming the predicate symbol ⊂ to abbreviate the
bounded formula (∀x∈E, x ∈ F), can also be seen
as an example of statement with open quantifiers (∀SetE,F, )
taken as an axiom.
The role of axioms
As explained in 1.9,
the role of the axioms of a generic theory, is to express its intended range of models
as a selection (class) from a
wider notion of "all systems" with structures named by the same
language. The logical framework, which holds (describes) this wider notion, can
interpret the axioms in each such system, then exclude those where it finds some
axiom to be false. Excluded systems remain possible models of different
theories described by the same framework. Any statement true in
all systems (a logically valid statement) is useless (redundant) as an axiom.
But expressing set theory in its special logical framework not used for other theories,
leaves a priori undefined the distinction between its logically valid statements, and its
other basic accepted truths which need to be declared as axioms due to their falsity
in some "non-universe". Now this distinction can be given sense by converting set theory
into a generic theory: let us call axioms of a set theory a list of its basic "true"
statements whose translated versions in first-order logic are not logically valid, but
form a proper list of axioms for that generic theory to be equivalent to the intended
version of set theory. From there, provability in a set theory can be defined as the
one given by first-order logic with these axioms.
Converting the binders
When converting set theoretical expressions into first-order logic, the only modified
symbols are the binders,
as their format of use differs between both frameworks. Let us describe the rules of
this conversion.
The function definer
(1.8) becomes an infinity of operator symbols: for each term t with one
argument and any list of parameters, the whole term
(E ∋ x ↦ t(x))
is seen as the big name of a distinct operator symbol, whose arguments are E
and the parameters of t. (Those where every subexpression of t without any
occurrence of x is the only occurrence of a parameter, would suffice to define others).
The same goes for the set-builder, which will
come as a particular case in 2.2.
The conversion of quantifiers comes by expressing
their domains as classes :
(∃x∈E,
A(x)) → (∃x, x ∈ E ∧ A(x))
(∀x∈E, A(x)) → (∀x, x
∈ E ⇒ A(x))
Classification of axioms
Axioms of set theories (sometimes with other primitive components) can be classified as
follows according to their roles, ordered from the more "primitive" (necessary) components,
to the more optional and debatable ones (opening a diversity of acceptable set theories).
- The technical axioms give to the primitive notions and symbols their correct
roles :
- The notions of sets and functions,
symbolized in 1.7, are axiomatized
here in 2.1 : axioms for notions ; axiom of extensionality ; axioms for functions.
- Unique element (2.4) ;
- Axioms from the set generation principle (2.2) ;
- Strengthening axioms, introduced in 1.A ;
- More optional technical axioms will come later:
- Axiom of choice (2.10) might be seen as a further specification
for the powerset ;
- Axiom of foundation (evoked in 2.A and expressed in 5.3).
Axioms for notions
The formalization of primitive notions by class symbols following 1.7, needs to be completed by
the following axioms.
∀x, ¬(Set(x) ∧ Fnc(x)) |
: sets are not functions (though it does not matter) |
∀Fnc f, Set(Dom f) |
: the domain of every function is a set |
(Fc) ∀(t,E), Fnc(E ∋ x ↦ t(x))
|
: any definite (E ∋ x ↦ t(x)) is a function
|
Here and in the below axioms for functions, ∀(t,E) is meant as declaring
an axiom schema by second-order universal elimination over the variable functor
t with E included in the definiteness class of the term defining t
for the given values of parameters (this is the definiteness condition for
(E ∋ x ↦ t(x))). Thus for each
term defining t we have an axiom where ∀(t,E) is replaced by
∀parameters, ∀SetE,
(∀x∈E, dt(x, parameters)) ⇒
Axiom of Extensionality
This axiom lets sets be determined by their role (either as ranges for quantifiers, or
as the classes they define by ∈), saying that any two sets playing the same
role are equal :
∀set E,F, E ⊂ F ⊂ E ⇒ E = F.
Indeed, E ⊂ F ⊂ E means that E and F have the
same elements (∀x, x ∈ E ⇔ x ∈ F),
and for any predicate R, (∀x∈F,
R(x)) ⇔ (∀x∈E, R(x))
and similarly for ∃. Informally, the elements of a set are given in bulk.
Axioms for functions
Functions are also made determined by their role, by the following axiom
(=Fnc) : ∀Fnc f,g,
(Dom f = Dom g ∧ ∀x∈Dom f, f(x)
= g(x)) ⇒ f = g.
The function definers are related with
the function evaluator by an axiom which can be written in either way:
- ∀(t,E), ∀Fnc f, f = (E ∋
x ↦ t(x)) ⇔ (Dom f = E ∧ ∀x∈E,
f(x) = t(x))
- ∀(t,E), Dom(E ∋ x ↦ t(x)) = E
∧ ∀x∈E, (E ∋ y ↦ t(y))(x)
= t(x)
Indeed (Fc) ⇒ (1.⇔(2.∧(=Fnc))).
The proof can be taken as an exercise (click to show)
Assume (Fc) for 2. to be definite.
Abbreviate (Dom f = E ∧ ∀x∈E, f(x) =
t(x)) as [f : E, t], to shorten the above as
- ∀(t,E), ∀Fnc f, f = (E ∋
x ↦ t(x)) ⇔ [f : E, t]
- ∀(t,E), [E ∋ x ↦ t(x) : E, t]
1. ⇔ (2. ∧ 1b.) where
1b. ∀(t,E), ∀Fnc f, [f : E, t] ⇒
f = (E ∋ x ↦ t(x))
Proof of 1b. ⇒ (=Fnc)
∀Fnc g,
applying 1b. to (g, E) where E = Dom g as
∀x∈Dom g, dg(x) :
∀Fnc f, [f : E, g] ⇒
f = (E ∋ x ↦ g(x))
[g : E, g] ∴ g = (E ∋ x ↦ g(x))
∀Fnc f, [f : E, g] ⇒ f = g
Proof of (2.∧(=Fnc)) ⇒ 1b.
∀(t,E), ∀Fnc f, ([f : E,
t] ∧ [E ∋ x ↦ t(x) : E, t]) ⇒
f = (E ∋ x ↦ t(x)).
∎
Note: (Fc) ∧ 1. can also be written
1'. ∀(t,E), ∀f,
f = (E ∋ x ↦ t(x)) ⇔ (Fnc f ∧ Dom f =
E ∧ ∀x∈E, f(x) = t(x)).
2.2. Set generation principle
Formalizing diverse notions in set theory
Set theory will be extended by additional components to represent diverse kinds of
meta-objects (roles played by symbolic systems) directly as objects (which can play
them). This can only apply to meta-objects which are "concrete" i.e. object-like enough,
in a sense to be clarified : as seen in 1.8 and 1.D, not all classes can be sets,
while only the functors whose domains are sets are represented by functions.
Some of those extensions will consist in primitive "definer" symbols (with the relevant
axioms) serving to convert into an existing kind of objects some
meta-objects (roles) of the same kind :
- Diverse "set-like" classes will be converted into sets (2.2) (thus giving the
existence of sets coinciding with these classes);
- Indirectly specified elements will become directly designated (2.4).
Other kinds of meta-objects beyond the normal roles of elements,
sets and functions, will form new notions : those of operation, relation and tuple
will be formalized in 2.3. The new symbols they need are both definers (converting
meta-objects into the new objects) and evaluators (interpreting these objects
back to their roles as meta-objects).
But, like for unary relations (1.8), these new notions with their role-giving symbols,
need not be primitive, as they can be naturally constructed as (represented by)
classes of existing objects (sets or functions), which can offer their tools (expressible features)
to the new objects. So, these extensions of set theory can be obtained by giving
definitions of their definer and evaluator symbols.
As the notions of operation and relation can be represented in several useful ways
by classes of old objects, which do not represent each new object by the same old
object (2.3), functors will be used for translations between the classes playing the
roles of these notions in the different ways (2.6, 2.8).
Sets as domains of bounded quantifiers
Bounded quantifiers give sets their fundamental role as ranges of
bound variables, unknown by the predicate ∈ which only gives them
their role of classes. Technically, the bounded quantifier (∃ ∈ , )
suffices to define the predicate ∈ as
x ∈ E ⇔ (∃y∈E, x = y)
but is not generally definable from it in return by any bounded formula, as
its definition
involves an open quantifier.
Philosophically speaking, the perception of a set as a class (ability to classify
each object as belonging to it or not) does not suffice to provide its full perception
as a set, that is the perspective over all its elements as coexisting. This intuitive view of
a set as a coexistence of its elements can be seen as supported by the above definition
of membership as a form of existence with respect to this set.
This way, Russell's paradox means that mathematical objects cannot all coexist
as a ultimate totality.
Statement of the principle
For any data of a class C and a quantifier Q
defined by given
bounded formulas with the same parameters, if the equivalence between Q
and ∀C, namely∀A, (∀x,
C(x) ⇒ A(x)) ⇔ (Qx, A(x))
is proven by second-order
universal introduction (*) under a given condition on parameters, then
C is a set that can be named by a new operator symbol K added to
the language of set theory, with these parameters as arguments, and the
following axiom which expresses K = C
by double inclusion:
for all values of parameters satisfying the condition,
Set(K) ∧ (∀x∈K, C(x)) ∧
(Qx, x ∈ K).
(*) we shall focus on cases whose proofs do not use any axiom obtained from a second
order axiom by using A in the definition of its variable structure, though I am
not sure if any such use is possible.
This principle will be justified in 2.B.
The equivalence condition between Q and ∀C is equivalently
expressible as the following list of 3 statements, where the quantifier Q*
defined by (Q*x, A(x)) ⇎ (Qx,¬A(x))
will be equivalent to ∃C:
(1) ∀x, C(x) ⇔ Q*y, x=y
(in fact we just need ∀x, C(x) ⇒ Q*y,
x=y)
(2) Qx, C(x)
(3) ∀A,B, (∀x, A(x) ⇒ B(x))
⇒ ((Qx, A(x)) ⇒ (Qx, B(x))).
Proof of equivalence (click to show)
These 3 properties are already known consequences of «Q = ∀C
». Conversely,
((2) ∧ (3)) ⇒ ((∀C x, A(x))
⇒ Qx,A(x))
((1) ∧ ∃C x, A(x)) ⇒ ∃y,
(Q*x, x=y) ∧ (∀x, x=y
⇒ A(x)) ∴ ((3) ⇒ Q*x,A(x))
∎
Main examples
Our first and main use cases of this principle are listed in the following table.
To justify all of these, (1) comes by finding C to be the formula so defined
from Q, while (3) is ensured by the absence of any "negative" occurrence
of A in Q (inside a negation, an equivalence, or left of a ⇒).
This leaves (2) as the easy remaining condition to check.
The first column recalls the above generic abbreviations ; each other
column gives an effective example. The K line gives the notations for the
set theoretical symbols so introduced.
K |
{y∈E | B(y)} |
⋃E |
Im f |
∅ |
{y} |
{y,z} |
C(x) |
x∈E ∧ B(x) |
∃y∈E, x∈ y |
∃y∈Dom f, f(y)=x |
0
|
x=y |
x=y ∨ x=z |
dK |
∀x∈E, dB(x) |
Set(E)∧∀y∈E, Set(y)
|
Fnc(f) |
1
|
1
|
1
|
Qx, A(x) |
∀x∈E, B(x)⇒A(x) |
∀y∈E, ∀x∈y, A(x) |
∀x∈Dom f, A(f(x))
|
1 |
A(y) |
A(y) ∧ A(z) |
Q*x,A(x) |
∃x∈E, B(x) ∧ A(x) |
∃y∈E, ∃x∈y, A(x) |
∃x∈Dom f, A(f(x)) |
0
|
A(y) |
A(y) ∨ A(z) |
The value of the set-builder K={x∈E | B(x)} first defined in 1.8
as a class, thus with an open quantifier (∀x, x∈K ⇔
(x∈E ∧ B(x))) is now re-defined by
axioms without open quantifier beyond parameters:
Set(K) ∧ (∀x∈K,
x∈E ∧ B(x)) ∧ (∀x∈E,
B(x) ⇒ x ∈ K)
or more shortly Set(K) ∧ K ⊂ E ∧ (∀x∈E,
x ∈ K ⇔ B(x)). Like the axioms for functions, this is
an axiom schema by second-order universal elimination on ∀B, so with
all unary predicates B defined by bounded formulas with parameters.
This way, the proof of Russell's paradox would be written
∃F, (F={x∈E | Set(x)
∧ x∉x} ∴ Set(F) ∴ (F∈F
⇎ (Set(F) ∧ F∉F))) ∴ F ∉ E
The functor ⋃ is the union symbol, and its axioms form the axiom
of union.
The set Im f of values of f(x) when x ranges over Dom f,
is called the image of f.
We define the predicate (f : E → F)
that reads «f is a function from E to F », while E and F are sets (definiteness condition which
may also be integrated in this definition), as
(f : E → F) ⇔ (Fnc(f) ∧ Dom f = E
∧ Im f ⊂ F)
A set F such that Im f ⊂ F (i.e. ∀x ∈ Dom f,
f(x) ∈ F), is called a target set of f.
The more precise (Fnc(f) ∧
Dom f = E ∧ Im f
= F) will be denoted f : E ↠ F (f
is a surjection, or surjective function from E
to F, or function from E onto F).
The empty set ∅ is the only set without element, and is
included in any set (∀SetE, ∅ ⊂ E).
Thus, (E=∅ ⇔ E ⊂ ∅ ⇔ ∀x∈E, 0), and
thus (E ≠ ∅ ⇔ ∃x∈E, 1).
The presence of this constant symbol ∅ ensures the existence of a set; for any set
E we also get ∅ = {x∈E | 0}. We have ⋃∅ =∅.
As (Dom f = ∅ ⇔ Im f = ∅) and (Dom f = Dom g = ∅
⇒ f = g), the only function with domain ∅ is called the empty function ∅↦.
That a variable may have empty range, is no obstacle for fixing it. In particular,
developing an inconsistent theory means studying a fixed system whose range
of possibilities is empty. We may actually need to do so in order to discover a
contradiction there, which means to prove that no such system exists.
We can redefine ∃ from the above in two ways: (∃x∈E,
A(x)) ⇔ {x∈E | A(x)} ≠ ∅
⇔ (1 ∈ Im(E∋x ↦ A(x))).
For all x, {x,x} = {x}. Such a set
with only one element is called a singleton.
For all x and y we have {x, y} = {y,
x}. If x ≠ y, the set {x, y}
with 2 elements x and y is called a pair.
For any set E, the identity function on E is
defined by IdE = (E
∋ x ↦ x).
Thus,
Dom IdE = E = Im IdE
∀x∈E, IdE(x) = x.
2.3. Currying and tuples
Formalizing operations and relations
To formalize in set theory every notion of operation
(resp. relation) with any fixed
arity n>1, acting as n-ary structures between n sets, would involve the
following symbols (generalizing the case n = 1 of functions which we took as primitive):
- A list of n domain functors (sometimes not available but not needed since
domains are usually fixed by the context);
- Its evaluator is an (n+1)-ary operator (resp. predicate) written in filled form as
f(x0,…,xn−1),
with arguments one n-ary operation (resp. relation) f, and its
n arguments x0,…,xn−1;
- Its definer binds n variables to their respective ranges on a term (resp. a formula).
Let us give to the operation definers, the following notation which will
be later (2.7) justified as reflecting their definitions: in the binary case (n = 2), operations
defined by a binary operator A with domains E and F will be written
E×F ∋ (x,y) ↦ A(x,y)
After seeing what operations and relations are (which role they play) let us discuss how
they are reducible to something else.
One aspect is that relations are reducible to operations, and vice-versa:
- The notion of n-ary relation can be constructed as a class of n-ary
operations, by taking the Boolean pair as target set, as we did in 1.4 (for predicates) and will
formalize in 2.4.
- Inversely, the notion of n-ary operation can be constructed as a
class of (n+1)-ary relations (2.7).
These constructions can be useful despite their circularity (they are not decreasing
the arity). But for foundational purposes, we need a construction which reduces to lower arity
cases, down to sets and functions. There are actually 2 such constructions.
Currying
The simplest constructions of the notions of n-ary operation from our primitive tools for functions,
are their curried representations, defined as classes of functions by expressing the operation definer
as n successive uses of the function definer (one for each bound variable); and thus the
evaluator as n uses of the function evaluator. For n=2, this means a
binary operation in curried form b with domains E and F, is defined from a binary
operator B, and evaluated back as B, by the formulas
b = (E ∋ x ↦ (F ∋ y ↦ B(x,y)))
Dom b = E ∧ ∀x∈E, Dom b(x) = F
∧ ∀y∈F, B(x,y) = b(x)(y)
whose equivalence, similar to the axiom
1. for functions, is deducible from it.
The first domain functor comes
as Dom b = E, but the second one, given as common value of all
Dom b(x), is only definite for E ≠ ∅.
The intermediate function b(x) = (F ∋ y
↦ B(x,y)) represents the functor defined by
B with parameter x (free) and argument y (bound).
This somehow makes each binary operation work as a function evaluator,
by giving x the role of a function with argument y ; the curried
function evaluator just gives back each function it takes : f ↦
((Dom f) ∋ y ↦ f(y)).
Similarly, the notion of n-ary relation can be defined in curried form as a class,
giving the role of definer to a succession of n−1 uses of the function
definer and 1 use of the set-builder.
The curried form r of a binary relation with domains E and F, is so defined
from a binary predicate R, and evaluated back as R,
by the equivalent formulas
r = (E ∋ x ↦
{y ∈ F | R(x,y)})
Dom r = E ∧ (∀x∈E, r(x) ⊂ F
∧ ∀y∈F, R(x,y) ⇔ y ∈ r(x))
This leaves no way to define second domain functor (F cannot be restored
from r).
These constructions have the defect of breaking the symmetry
of role between arguments. The next construction method, by tuples, will preserve
this symmetry, and will be usually preferred for this reason.
Tuples
For any theory and any natural number n, a notion of n-tuple is a notion
constructed for its variables to serve as abbreviations for a pack of n variables
from previous notions.
A 2-tuple is called an ordered pair, a 3-tuple is a triple, a 4-tuple is a
quadruple and so on. So each tuple x (so constructed object) represents
a meta-function of "interpretation" from this list (meta-set) of n variables, into
the model. The definer and evaluator symbols for tuples take a
special form because, while tuples are meta-functions, the role they play in the
theory differs from those of functors or functions.
An n-tuple definer is not a binder but an n-ary operator, packing its
n variables into one by taking them as its arguments in a
parenthesis and separated by commas: ( ,…, ).
Now, each structure
(resp. operation, relation) can be re-formalized as a unary structure (resp. a function,
a unary relation) with domain a type (resp. a class which is actually a set) of tuples, to be
used with the relevant tuples definer. This can be implicitly done by re-interpreting the notation:
S(x,y) can be read seeing S either as a binary
structure (resp. operation, relation), or as a unary one taking as argument the
ordered pair given by the term (x,y) formed by the ordered pair definer
with arguments x and y.
The role of n-tuple evaluator is played by a list of n functors
called projections which extract each variable from the given pack.
This list plays the role of the function evaluator curried in the reverse way, by fixing its
meta-argument whose range (the domain of tuples as meta-functions) is not seen by
the theory as a set of objects but as a list of n symbols to be taken separately.
The formal details of this construction depend whether we take a generic theory or a
set theory. Tuples axiom
In generic theories, it must be a distinct constructed type for not only each n
but also each choice of the type of each of these n variables (if starting from
a theory with multiple types - this choice of a list of variables with their types amounts
to the choice of a predicate
symbol type), but this directly forms a
legitimate construction (4.11).
For n=2 (other cases are similar), and a type P of ordered pairs
(x,y) where x,y have
types E and F, the projections π0 and π1
are related to the ordered pair definer, by the axiom
∀E x, ∀F y,
∀P z, (x = π0(z) ∧ y =
π1(z)) ⇔ (x,y) = z
which implies
∀P z, (π0(z),
π1(z)) = z
∀E x,
∀F y, π0(x,y) = x
∀E x, ∀F y,
π1(x,y) = y
∀E x,x', ∀F y,y',
(x,y) = (x',y') ⇔ (x = x' ∧ y = y')
Tuples in set theory
Set theory having no types only needs one notion of n-tuple for each n.
But it faces one difficulty : in order to rigorously justify that all tools of set theory remain
applicable to tuples in the same way as to the rest of elements, tuples need to be
integrated among elements. Namely, set theory needs to develop its notions of tuples
by defining them as classes of already existing elements, instead of simply accepting
them by the above constructions, which would leave them as separate types.
For this, since tuples are meta-functions, they are naturally represented by functions.
This involves copying their meta-domain, which is a list of symbols, into a set (of
elements). Namely, let us represent the domain of considered n-tuples by the set
Vn of n numbers from 0 to n−1 all named by constant
symbols we shall simply conceive as digits. To justify this as a development
of set theory, these digits can be defined by any ground terms with distinct
values, such as ∅, {∅}, {∅, {∅}} and endlessly many more sets and
functions expressed from there. The notion of n-tuple is then defined
as the class of all functions with domain Vn.
Then, the projections are defined by the function evaluator applied to the respective
digits, which can be read as disguised in their notations : for all n-tuples x,
∀i∈Vn, πi(x) =
x(i).
This index notation, making a pun of the equivalence between reading
the index i as an argument or a meta-argument that means to take the i-th
symbol from a list (here πi is the i-th projection), is also commonly
used for tuples : xi can be either read as the i-th variable in
the tuple x = (x0,…,xn−1), or as a third
notation for the same evaluator after x(i) and πi(x).
More tools need to be introduced before defining in set theory the
tuple definers (2.4) and the operation definers working with tuples (2.7).
2.4. Uniqueness quantifiers and conditional operator
Uniqueness quantifiers
For all sets F ⊂ E, all unary predicate A
definite on E, and all x ∈ E,
x ∈ F |
⇔ {x} ⊂ F ⇔ (∃y∈E, x=y
∧ y∈F) ⇔ (∀y∈E, x=y ⇒ y∈F) |
x ∈ F |
⇒ ((∀y∈F, A(y)) ⇒ A(x) ⇒
∃y∈F, A(y)) |
F ⊂ {x} |
⇔ (∀y∈F, x=y)
⇒ ((∃y∈F, A(y)) ⇒ A(x) ⇒
(∀y∈F, A(y))) |
F={x} |
⇔ (x∈F ∧ ∀y∈F,
x=y) ⇔ (∀y∈E, y∈F ⇔ x=y) |
Here are 3 new quantifiers: ∃≥2 (plurality), ! (uniqueness), and ∃!
(existence and uniqueness), whose results when applied to A
in E only depend on F={x∈E | A(x)}
(like ∃ and unlike ∀) :
(∃x∈E,
A(x)) ⇔ (F ≠ ∅) |
⇔ (∃x∈F,
1) ⇔ (∃x∈E, {x} ⊂ F) |
(∃≥2x∈E,
A(x)) ⇔ (∃≥2: F) |
⇔ (∃x,y∈F,
x ≠ y) ⇔ (∃x≠y∈E, A(x) ∧
A(y)) |
(!x∈E,
A(x)) ⇔ (!:F) |
⇔ ¬(∃≥2: F) ⇔
(∀x,y∈F, x=y) ⇔ ∀x∈F,
F ⊂ {x} |
(∃!x∈E,
A(x)) ⇔ (∃!:F) |
⇔ (∃x∈F, F⊂{x})
⇔ (∃x∈E, F={x}) |
F ⊂ {x} |
⇒ (∀y∈F, F⊂{y}) ⇔ (!:F) |
(∃!:F) |
⇔ (F≠∅ ∧ !: F) |
F≠∅ |
⇒ ((∀y∈F, B(y))
⇒ (∃y∈F, B(y))) |
( !: F) |
⇒ ((∃y∈F, B(y))
⇒ (∀y∈F, B(y))) |
F={x} |
⇒ ((∃y∈F, B(y))
⇔ B(x) ⇔ ∀y∈F, B(y)) |
Single element axiom
Let us introduce a new primitive symbol ℩ to serve as inverse of the singleton functor { } : only
definite on the class of singletons d℩E ⇔ (Set(E) ∧ ∃!:E),
it gives their unique element, as defined by the single element axiom
which can be equivalently written
∀x, ℩{x} = x
∀SetE,
∃!:E ⇒ ℩E ∈ E
Then for every unary predicate A and every singleton E,
A(℩E) ⇔ (∃x∈E,
A(x)) ⇔ (∀x∈E, A(x)).
Conditional connective
Following conventions from computer science, let us denote the ternary
connective «If A then B otherwise C» as (A ? B : C) ⇔
(¬C⇒A⇒B) ⇔ ((A⇒B)∧(¬A⇒C))
⇔ (¬A ? C : B)
⇔ ((A∧B)∨(¬A∧C)) ⇔((C⇒A)⇒(A∧B))
⇎ (A ? ¬B : ¬C)
Any n+1-ary connective K amounts to two n-ary
ones: K(A) ⇔ (A ? K(1): K(0)).
Thus, ¬A ⇔ (A ? 0 : 1)
(A⇒B) ⇔ (A
? B : 1)
(A∧B) ⇔ (A ? B : 0)
(A∨B) ⇔ (A ? 1: B)
(A ⇔ B) ⇔ (A ? B : ¬B).
Conditional operator
Like the conditional connective, it chooses between two objects x,y
depending on the boolean B:
(B ? x : y) = ℩{z∈{x,y}| B
? z=x : z=y} = ℩{z∈{x,y}| z=x
⇔ B}
so that for any unary predicate A,A(B ? x :
y) ⇔ (B ? A(x) : A(y)).
All para-operators other
than connectives but with at least a Boolean argument, are naturally
expressed as composed of the conditional operator with operators, or
the conditional connective with predicates, which is why they did
not need an explicit presence in the language of a theory.
Relations as operations
For a given n, once given the notion of n-ary operation with its
definer and its evaluator denoted as in 2.3, the notion of n-ary
relation can be defined as a role of the class of n-ary operations
with values in V2 = {0,1} (serving as Boolean type),
as follows. Booleans B are converted to and from objects
b∈V2, by b = (B ? 1 : 0)
and B ⇔ b = 1.
So with n=2 (other cases are similar), such a binary operation r with
domains E and F gets this role of binary relation, being defined from
any binary predicate R, and evaluated back as R, by the formulas
r = (E×F ∋ (x,y) ↦
(R(x,y) ? 1 : 0))
∀x∈E, ∀y∈F,
R(x,y) ⇔ r(x, y) = 1
Defining the tuples definers
The ordered pair definer can be defined as a curried view on the conditional operator:
(x,y) = (V2 ∋ a ↦
(a = 1 ? y : x))
Confusing Booleans with objects,
(B ? x : y) = πB(y,x).
Other tuples can be similarly defined in diverse ways, such as
(x,y,z) = (V3 ∋
a ↦ (a = 2 ? z : (a = 1 ? y : x)))
Given the notion of ordered pairs, the roles of notions of tuples of any
higher arity n > 2 can also be played by other classes, with other definitions
of their definers and projections, satisfying the same axioms.
For example, triples t = (x,y,z)
can be defined as t = ((x,y),z)) and
evaluated by
x = π0(π0(t))
y = π1(π0(t))
z = π1(t)
2.5. Families and Boolean operators on sets
Families
The notion of family is formally synonymous with that of function, but meant to
play roles intuitively similar to those of tuples : while the domains of tuples are finite and
made of symbols or numbers, those of families can be more general but still seen
as intuitively "simple" sets, fixed in each context of use, and outside the "main studied system"
containing the target set.
A «family of...» is a family whose image is a «set of... » (included in the said notion).
Families use the formalism of functions disguised in the style of the symbols for
tuples (themselves inapplicable due to their finiteness for infinite domains). The
evaluator for a family u at i, instead of u(i), is written
πi(u) or ui (looking like a meta-variable
symbol of variable).
The family definer on a term t is written (t(i))i∈I
or sometimes with ellipsis as (t(0),…,t(n−1)) abbreviating the n-tuple
definer, instead of I ∋ i ↦ t(i). The argument i is called
index, and the family is said indexed by its domain I. We may write
u = (ui)i∈I as a shorthand for
Fnc u ∧ Dom u = I.
When formalizing model theory in set theory, the "lists" of components forming the
contents of theories are basically
families. More precisely:
- The list of abstract types is a set of urelements, serving as indexing
set (i.e. domain) for the family of interpreted types
- Both lists of predicate symbols and
operator symbols, are families of tuples of abstract types
- Of the list of axioms, only its image matters, which is a subset of the set of all
statements.
Structures and binders
Seeing tuples as particular functions, explains that structures (unary structures on classes of tuples)
are particular binders over terms (unary structures on classes of functions defined by
these terms:
1.8).
Similarly, logical connectives
are particular quantifiers.
In particular, ∀ and ∃ respectively generalize the chains of conjunctions and of disjunctions:
(B0∧…∧Bn−1) ⇔
(∀i∈Vn, Bi)
(B0∨…∨Bn−1) ⇔
(∃i∈Vn, Bi)
The distributivity of connectives (1.6) also works for quantifiers as follows. For any
unary predicate R definite on E and any Boolean C,
(C ∧ ∃x∈E, R(x))
⇔ (∃x∈E, C ∧ R(x))
(C ∨ ∀x∈E, R(x))
⇔ (∀x∈E, C ∨ R(x))
(C ⇒ ∀x∈E, R(x)) ⇔ (∀x∈E,
C ⇒ R(x))
((∃x∈E, R(x)) ⇒ C) ⇔
(∀x∈E, R(x) ⇒ C)
(∃x∈E, C) ⇔ (C ∧ E≠∅) ⇒
C ⇒ (C ∨ E=∅) ⇔ (∀x∈E, C)
Extensional definitions of sets
The operators of empty set ∅, singleton {a} and pairing {a,b}
(2.2) are the first cases
(those of arities 0,1,2) of operator symbols of exensional definition of sets
(listing their elements) which amount to applying the Im functor to tuples :
{a,b,…} = Im(a,b,…).
The definition of Vn from 2.3 can be written
{0,…,n−1}. Images of tuples are finite sets.
Similar notations are used for the binder also given by Im:
{T(x)| x∈E} =
{T(x)}x∈E = Im(E ∋ x
↦ T(x))
As this notation looks similar to the set builder, both may be combined into a third notation :
{T(x)| x∈E ∧ R(x)}
= {T(x)| x∈{y∈E|R(y)}}
A function f is said constant when !: Im f. The constancy of a tuple is the
chain of equalities:
x=y=z ⇔ !:{x,y,z}
⇔ ((x=y)∧(y=z)) ⇒ x=z
Union of a family of sets
The binder of union of any family of sets
(F = (Fi)i∈I ∧ ∀i∈I,
Set Fi), is defined from the previous functor of union of a set of sets (2.2) :
⋃ F =
⋃
i∈I
Fi = ⋃{Fi}i∈I
= ⋃ Im F
∀x, x ∈ ⋃
i∈I
Fi ⇔ (∃X∈ Im F, x∈X) ⇔
∃i∈I, x∈Fi
X ∪
Y = ⋃{
X,
Y}
∀
x,
x ∈
X∪
Y ⇔ (
x∈
X ∨
x∈
Y)
X ∪
Y ∪
Z = ⋃{
X,
Y,
Z}
= (
X ∪
Y) ∪
Z
∀fnc f, Im f =
⋃x∈Dom f {f(x)}
(∀x∈⋃
i∈I
Fi, A(x))
⇔ ∀i∈I, ∀x∈Fi, A(x)
(∃x∈⋃
i∈I
Fi, A(x)) ⇔ ∃i∈I,
∃x∈Fi, A(x)
∀Set E, ⋃
i∈I
Fi ⊂ E ⇔ ∀i∈I, Fi ⊂ E
In return, the union of a set of sets is expressible from that of a family : ⋃ E =
⋃ IdE because Im IdE = E.
Other Boolean operators on sets
A binary predicate with domains a set I and a class C, can be seen in either curried
way, as a meta-family (Ai)i∈I
of unary predicates definite in C, or as a family of Boolean variables
(Ai(x))i∈I depending on a common parameter
x with range C. This way, any quantifier Q with range I (thus, for finite
I, any connective) defines a meta-binder (resp. a meta-operation) on this meta-family (resp. this
meta-tuple) of unary predicates (sub-classes of C), with result a unary predicate R,
defined as ∀C x, R(x) ⇔ Qi, Ai(x).
When C is a set E, this defines a genuine binder (resp. operation) acting in
the class of subsets of E. However, the result depends a priori not only on the given family of sets but
also on the chosen set E. This parameter may be kept implicit as fixed in some contexts, but
in others it needs to be explicitly added as argument.
The functor so defined by the negation connective (¬) is the complement ∁E
(with set parameter E) : ∁E F is called the
complement of F in E. It is synonymous with the following binary operator
\ of difference between any two sets, except that ∁E F is considered only
definite if F⊂E :
∀Set E,F, E\F = {x∈E |
x ∉ F}
∀Set E,F, F⊂E ⇒ ∁E F = E\F ⊂ E
∴ (∀x∈E, x ∈ ∁E F ⇔ ¬x∈F)
Let us analyze the issue of the dependence on E in the general case.
Any family of sets F = (Fi)i∈I
is a family of subsets of its union U = ⋃i∈I
Fi and of any larger set E ⊃ U. In any of these, the existential
quantifier (Q = ∃) defines the union binder:
(∀i∈I, Fi ⊂ E) ⇒ ⋃
i∈I
Fi = {x∈E| ∃i∈I, x∈Fi}
The binder defined by any Q gives a set X = {x∈E| R(x)}
where R(x) abbreviates (Qi, x ∈ Fi).
For this to define a binder (resp. operation) on the class of all sets, this X must be
independent of the chosen E ⊃ U, and then it can be written X =
{x∈U| R(x)}.
Noting that
∀x, x ∈ U ∨ (R(x) ⇔ Qi, 0)
the needed independence condition appears to be ¬Qi, 0 which also implies that
X coincides with the class R.
The difference operator can be so defined by the connective (A∧¬B)
which satisfies this condition ¬(0∧¬0):
∀Set E,F, ∀x, x∈ E\F ⇔ (x∈E
∧ x∉F).
Intersection
The quantifier ∀ defines the intersection of any family of subsets Fi
of a fixed set E:
⋂
i∈I
Fi = {x∈E
| ∀i∈I, x ∈ Fi} =
∁E ⋃
i∈I
∁E Fi
The condition of independence from E is that I ≠ ∅. As above defined, the intersection
of the empty family gives E. The intersection of any non-empty family of sets (I ≠ ∅)
can be written
∀j∈I, ⋂
i∈I Fi =
{x∈Fj | ∀i∈I,
x∈Fi}
∀x, x ∈ ⋂
i∈I Fi ⇔ ∀i∈I, x
∈ Fi
∀Set G, G ⊂
⋂
i∈I
Fi ⇔ ∀i∈I, G ⊂ Fi
∀Set E,F, E∩F = {x∈E | x∈F}
= F∩E
∀x, x ∈ E∩F ⇔ (x∈E ∧ x∈F)
E∩F ⊂ E ⊂ E∪F
E = E∪F ⇔ F ⊂ E ⇔ F = E∩F
Union and intersection have the same associativity and distributivity properties as ∧ and ∨ :
E∩F∩G = (E∩F)∩G
= E∩(F∩G) = ⋂(E,F,G)
E ∩
|
⋃
i∈I |
Fi =
|
⋃
i∈I |
E ∩ Fi |
E ∪
|
⋂
i∈I |
Fi
= |
⋂
i∈I |
E ∪ Fi |
E∩(F∪G)
= |
(E∩F)∪(E∩G)
|
E∪(F∩G) = |
(E∪F)∩(E∪G)
|
Finally the connective ⇎ gives the symmetric difference: E ∆ F
= (E∪F) \ (E∩F).
2.6. Graphs
A graph is a set of ordered pairs. Let us denote the class of graphs as gr:
gr R ⇔ (Set R ∧ ∀x∈R, Fnc x ∧
Dom x = V2).
For any binder Q and any graph G, the formula Qz∈G,
A(z0,z1)
that binds the variable z = (z0, z1)
on a binary structure A definite on G, can be seen
as binding 2 variables z0, z1
on A(z0, z1), and thus
be denoted with a pair of variables:
Q(x,y)∈G, A(x,y).
The transpose of an ordered pair is t(x,y)
= (y,x)
The transpose of a graph R is the image of transposition over it: tR =
{(y,x)|(x,y) ∈ R}
We define the domain and the image of a graph as the respective images
of π0 and π1 over it:
Dom R = {x|(x,y) ∈ R}
Im R = {y|(x,y) ∈ R} = Dom tR
Currying notation
Graphs R can be expressed in curried forms as functors R⃗
and R⃖:
∀x, R⃗(x) = {y
| (z,y)∈R ∧ z=x}.
∀y, R⃖(y) = {x | (x,z)∈R ∧
z=y} = tR⃗ (y).
∀x,y,
y ∈ R⃗ (x) ⇔ (x,y)
∈ R ⇔ x ∈ R⃖
(y)
∀x, x ∈ Dom R ⇔ R⃗
(x) ≠ ∅
Dom R ⊂ E ⇒ Im R = ⋃x∈E R⃗(x) ∧
∀y, R⃖(y) = {x∈E | (x,y) ∈ R}
They can also appear as functions R⃗E =
(E∋x ↦ R⃗(x))
R⃖F = (F∋y ↦
R⃖(y))
(R⃗) = R⃗Dom R
(R⃖) = R⃖Im R.
Functional graphs
The graph of a function f is defined byGr f = {(x,f(x)) | x ∈ Dom f}
∀x,y, (x,y) ∈ Gr f ⇔ (x ∈ Dom f ∧ y =
f(x))
Dom f = Dom Gr f
Im f = Im Gr f
For any function f and any graph R,
Gr f
⊂ R |
⇔ ∀x∈Dom f, f(x) ∈ R⃗(x) |
R ⊂ Gr f
|
⇔ (Dom R ⊂ Dom f ∧ ∀(x,y)∈R,
y = f(x)) |
R = Gr f
|
⇔ (Dom R ⊂ Dom f ∧ ∀x∈Dom f,
R⃗(x) = {f(x)} ) |
A graph R is functional if it is the graph of a function.
This condition is equivalently
written in either way ∀x∈ Dom R, !:
R⃗(x)
∀x,y∈R, x0 = y0 ⇒
x1 = y1.
It is then the graph of the unique function ℩R⃗ = ((Dom R)
∋ x ↦ ℩(R⃗(x))).
For any set E we shall denote 𝛿E = Gr IdE.
Indexed partitions
Two sets E and F are called disjoint when E∩F = ∅,
or equivalently ∀x∈E, x∉F.
A family of sets (Ai)i∈I
is called pairwise disjoint when ∀i≠j∈I,
Ai∩Aj = ∅
For any graph R with Im R ⊂ F, the family
(R⃖(y))y∈F
is pairwise disjoint if and only if R is functional:
(∀y≠z∈F, ¬∃x∈R⃖(y),
x ∈ R⃖(z)) ⇔
(∀(x,y)∈R, ∀z∈F, xRz ⇒ y = z)
For any function f and any y we define the fiber of y under f
as
f•(y) = {x∈Dom f | f(x)
= y} = (Gr⃖ f)(y)
When f : E → F this defines a
family f•F = (f•(y))y∈F of pairwise disjoint subsets of E:
∀y,z∈F, f•(y) ∩ f•(z)
≠ ∅ ⇒ ∃x∈ f•(y) ∩ f•(z),
y = f(x) = z
⋃y∈F f•(y) = E
Im f = {y∈F | f•(y) ≠ ∅}.
An indexed partition of a set E is a family of
nonempty, pairwise disjoint subsets of E, whose union is E.
In other words it is any family of the form f• =
f•Im f for any function f with domain E.
Sum or disjoint union
The binder ∐ of sum of any family of sets (Ei)i∈I,
gives a graph
∐i∈I Ei defined as
∐
i∈I
Ei = ⋃
i∈I {(i,x) | x∈Ei}
∀i,x, (i,x) ∈ ∐
i∈I
Ei ⇔ (i∈I ∧ x∈Ei)
(∀(i,x)∈∐
i∈I Ei,
A(i,x)) ⇔ (∀i∈I,
∀x∈Ei, A(i,x))
(∀i∈I, Ei ⊂ E′i)
⇔ ∐i∈I Ei ⊂
∐i∈I E′i
E0⊔…⊔En−1 =
∐i∈Vn Ei
This binder serves as inverse of currying : R =
∐i∈I Ei
⇔ (Dom R ⊂ I ∧ ∀i∈I,
R⃗(i) = Ei).
It is also called disjoint union as the family of copies {(i,x) |
x∈Ei} of each Ei is
pairwise disjoint, with the function from R to I given by π0.
Direct and inverse images by a graph
The restriction of a graph R to a set A is defined as
R|A =
{(x,y)∈R | x∈A} = ∐x∈A
R⃗(x)
The direct image of a set A by a graph R is
R⋆(
A) = Im
R|A =
⋃
x∈A R⃗(
x) = {
y |(
x,
y)∈
R∧
x∈
A} ⊂ Im
R.
Dom
R ⊂
A ⇔
R|A =
R
⇒
R⋆(
A) = Im
R
R⋆(⋃
i ∈ I
Ai) = ⋃
i ∈ I
R⋆(Ai)
R⋆(⋂
i ∈ I
Ai) ⊂ ⋂
i ∈ I
R⋆(Ai)
A ⊂
B ⇒
R⋆(
A) ⊂
R⋆(
B)
Similarly, the inverse image or preimage of a set B by a graph R is
R⋆(B) = tR⋆(B) =
⋃y∈B R⃖(y)
= {x | (x,y)∈R∧ y ∈ B} ⊂ Dom R.
Direct image and preimage by a function
The direct image of a subset A ⊂ Dom f by a function f, denoted
f[A] or f⋆(A), is
f[A] = (Gr f)⋆(A)
= {f(x) | x∈A} ⊂ Im f
For any f : E → F and B ⊂ F,
the preimage of B by f, written f⋆(B),
is defined by
f⋆(B) =
(Gr f)⋆(B) = {x∈E | f(x) ∈ B} = ⋃
y∈B
f•(y)
f•(y) = f⋆({y})
f⋆(∁
F B) = ∁
Ef⋆(
B)
For any family (Bi)i ∈ I
of subsets of F, f⋆(⋂i∈I
Bi) = ⋂i∈I
f⋆(Bi) where
intersections are respectively interpreted as subsets of F
and E.
2.7. Products and powerset
Cartesian product of two sets
For any two sets E and F, the (cartesian) product E×F
is the set of (x,y) where x∈E and y∈F.
E×
F = ∐
x∈E F
∐
i∈IEi
= ⋃
i∈I {i}×Ei ⊂ I×⋃
i∈I
Ei
(
E⊂
E′ ∧
F⊂
F′) ⇒
E×
F ⊂
E′×
F′
E×∅ = ∅ = ∅×
E
For any graph R,
R ⊂ E×F ⇔ (Dom R ⊂ E ∧ Im R
⊂ F)
R|E = R∩(E× Im R)
Quantifiers over a product have equivalent curried views:
(∃(x,y)∈E×F, A(x,y))
⇔ (∃x∈E, ∃y∈F, A(x,y))
⇔ (∃y∈F, ∃x∈E, A(x,y))
(∀(x,y)∈E×F, A(x,y))
⇔ (∀x∈E, ∀y∈F, A(x,y))
⇔ (∀y∈F, ∀x∈E, A(x,y))
(∃x∈E, A(x)∨B(x))
⇔ ((∃x∈E, A(x)) ∨ (∃x∈E, B(x)))
(∀x∈E, A(x)∧B(x))
⇔ ((∀x∈E, A(x)) ∧ (∀x∈E, B(x)))
(∃x∈E, C∨A(x)) ⇔ ((C ∧ E≠∅) ∨
∃x∈E, A(x))
(∀x∈E, C∧A(x)) ⇔
((C ∨ E=∅) ∧ ∀x∈E, A(x))
Quantifiers (Qx,y∈E, ) can be equivalently written
(Q(x,y)∈E×E, ) while the notation Qx≠y∈E
can be read as one quantifier with domain {(x,y)∈E×E |
x≠y} = (E×E) \ 𝛿E.
Finite products, operations and relations
Beyond the binary operator of cartesian product, there is an n-ary product operator between
any number n>1 of sets : E0×…×En−1 is the
set of all n-tuples (x0,…,xn−1) such that
∀i∈Vn, xi∈Ei.
For example, the ternary product isE×F×G =
⋃x∈E⋃y∈F {(x,y,z) |
z∈G} = {(x,y,z) | ((x,y),z))∈(E×F)×G}
where the latter notation abbreviates the use of the evaluators for this representation of triples. It
can also be seen as directly justified by the set generation principle:
(∀(x,y,z)∈E×F×G,
A(x,y,z)) ⇔ (∀x∈E, ∀y∈F,
∀z∈G, A(x,y,z))
This finally justifies the most natural formalization of n-ary operations, as functions
with domain a product of n sets, with evaluator given in
2.3. It justifies our first notation of 2.3 for operation
definers, as a use of the abbreviated notation of 2.6 for binders on graphs, extensible to any arity.
The notion of n-ary relation R will usually be formalized as that of a set
of n-tuples, subset of the product of its domains.
This way, a binary relation between sets E and F will be a graph
R ⊂ E×F. The equality relation on a set E is so formalized as
𝛿E, called the diagonal of E×E.
For a formalization which specifies the domains E and F of arguments, we can take
the triple (E,F,R).
Translating operators into predicates
The use of any functor symbol K in a generic theory is equivalent to that of
a binary predicate symbol R with the functionality axiom
∀x, ∃!y, xRy
It is meant as the "graph of K", i.e. defined from K by
xRy ⇔ y = K(x)
Terms of first-order logic using K cannot be re-expressed using R, but
formulas can: each formula with each occurrence of K in it, can be written
A(K(x)) where x abbreviates a term and A abbreviates a
formula as a unary predicate; then equivalently replaced by either of
∃y, xRy ∧ A(y)
∀y, xRy ⇒ A(y)
The same goes for other arities : the role of any n-ary
operator can be played by an (n+1)-ary predicate with the functionality axiom.
This generalization can be explained as replacing the argument by a tuple.
This construction does not fit for the operators of our set theory, due to its use of open quantifiers.
Yet it works when domains are sets: n-ary operations can be represented as functional
(n+1)-ary graphs.
More primitive symbols
Let us strengthen
set theory by 3 extensions that declare more kinds of classes to be sets : powersets, exponentiations and
products. These extensions are equivalent, in the sense that accepting anyone of them as primitive
suffices to get both others as developments from it. They are similar to those provided by the set generation principle, but no more
fit its condition ; this lack of justification has deep connections with incompleteness aspects of mathematics.
To fit our way of formalizing set theory (unlike the ZF way introduced below),
each such extension is made of - A symbol K (namely, two operators and one binder)
naming as a set the given class C, with arguments the parameters of C
here abbreviated as a family y ranging over the intended definiteness class of K;
- The statement ∀dK y, Set K(y) ∧ ∀x,
x ∈ K(y) ⇔ Cy(x).
Powerset. ∀Set E, Set ℘(E)
∧ ∀F, F∈℘(E) ⇔ (Set F ∧ F
⊂ E)
We shall abbreviate ∈℘ as ⊂ also in binders: for example
(∀F⊂E,…) ⇔ (∀F∈℘(E),…).
Exponentiation. ∀Set E,F,
Set FE ∧ ∀f, f ∈ FE ⇔
f : E → F
Product of a family of sets. The binder ∏ generalizes the finite product operators
to any family of sets (∀i∈I, Set Ei):
∀x, x ∈ ∏
i∈I Ei
⇔ (Fnc(x) ∧ Dom x = I ∧ ∀i∈I,
xi ∈ Ei).
Like with tuples, the function evaluator curried
in the reverse way defines a family of projections πi
(keeping implicit their dependence on the given product which is usually fixed in context):
∀i∈I, πi :
∏j∈I Ej → Ei.
These symbols preserve inclusion as follows
F ⊂ E ⇒ ℘(F) ⊂ ℘(E)
F ⊂ E ⇒ FG ⊂ EG
(∀i∈I, Fi ⊂ Ei) ⇒
∏i∈I Fi
⊂ ∏i∈I Ei
Their equivalence
From any of these 3 symbols, both others can be defined as follows.
∏
i∈I
Ei = {x∈ (⋃
i∈I Ei)I
| ∀i∈I, xi ∈ Ei} =
{℩R⃗
| R ⊂ ∐
i∈I
Ei ∧ ∀i∈I, ∃!: R⃗(i)}
FE = ∏x∈E F = {℩R⃗
| R ⊂ E×F ∧ ∀x∈E, ∃!: R⃗(x)}
℘(E) = {f•(1) | f ∈ V2E}
Even some cases are expressible from previous tools:
F{a}= {{a}∋x↦y
| y ∈ F}
F∅ = ∏∅↦ = {∅↦}
℘({a}) = {∅,{a}}
∏
i∈I∪J
Ei = {(i∈I ? xi
: yi)i∈I∪J
| (x,y) ∈ ∏
i∈IEi × ∏
i∈JEi}
(∃i∈I, Ei = ∅) ⇒ ∏
i∈IEi = ∅
(∀i∈I, ∃!:Ei) ⇒ ∏
i∈IEi
= {(℩Ei)i∈I}
Cantor's Theorem
This famous theorem is simply expressible as
∀Fnc f, ℘(Dom f) ⊄ Im f.
Proof for the case when all f(x) are sets (to which the general case is reducible):
(E = Dom f ∧ F = {x∈E| x ∉ f(x)})
⇒ (∀x∈E, x ∈ F ⇎ x ∈ f(x))
⇒ (∀x∈E, F ≠ f(x)) ⇒ F
∉ Im f.∎
Russell's paradox
may be seen as the particular case of Cantor's theorem for f = IdE.
The ZF approach
In the traditional ZF formalism for set theory, each such extension is formalized as first made of a mere
statement without any special symbol, keeping ∈ as the only primitive structure:
∀...y, ∃X, ∀x,
x ∈ X ⇔ Cy(x).
Namely,
the powerset axiom (existence of the powerset) is written ∀E,
∃X, ∀F, F∈X ⇔ F ⊂ E, from which the
statements of existence of exponentiations and products come as theorems.
Then, the uses of each symbol K with argument y as above, naming these sets,
can be justified as developed from there as abbreviations by the following
rules.
- ∀x∈K means ∀x, C(x)⇒ ;
- The equality (X=K) means (∀x,
x ∈ X ⇔ C(x)) or equivalently (∀x∈X,
C(x)) ∧ (∀C x, x ∈ X);
- Any other formulas using K with y free, written as A(K), mean
the formula so abbreviated as ∃X, (X=K)∧A(X).
- To justify binding y on terms using K, involves the axiom schema of
replacement.
But our framework does not even allow the first of these cases, because it involves
an open quantifier, which cannot be replaced by a bounded
formula because these extensions do not fit the condition of the set generation principle.
This impossibility to interpret the equality formula (X=K) means that even a
given set identical to a class C, would not be recognizable as such.
Hence the need to accept the symbols K as primitive, for
giving effective sense to the statement that C coincides with a set.
2.8. Injections, bijections
Composition, restriction
The composite of any 2 functions f,g such that Im f
⊂ Dom g is defined by
g⚬f = ((Dom f) ∋ x ↦
g(f(x)))
(f : E →
F ∧ g : F → G) ⇒ g⚬f
: E → G
(This notation will be abusively kept if g is a functor, in which case
this is a schema of definitions).
Similarly with more than 2 functions: adding a third one h :
G →H to the above conditions, we define
h⚬g⚬f
= (h⚬g)⚬f = h⚬(g⚬f) = (E
∋ x ↦ h(g(f(x))))
The restriction of a function f to a set A ⊂ Dom f is
f|A =
(A ∋ x ↦ f(x)) = f⚬IdA.
Gr(f|A) = (Gr f)|A
f[A] = Im(f|A)
∀Fnc f,g, Gr g ⊂ Gr f
⇔ (Dom g ⊂ Dom f ∧ g = f|Dom g)
When g is a restriction of f, we also say that f is an extension of
g.
Injections, bijections, inverse
A function f is called injective
(or : an injection ), if
tGr f is a functional graph:
Inj f ⇔ |
(∀x≠x′∈Dom f, f(x) ≠
f(x′)) |
⇔ |
(∀x,x′∈Dom f, f(x) = f(x′)
⇒ x=x′) |
(f : E ↪ F) ⇔ |
(f : E → F ∧ Inj f) |
Im f ⊂ F ⇒ |
(Inj f ⇔ ∀y∈F, !:f•(y)) |
The inverse of any injection f, is the function
f -1 with graph tGr f:
f -1 = ℩f•
= (Im f ∋ y ↦ ℩f•(y))
Gr(f -1) = tGr f
(f -1)-1 = f.
A function f : E → F is said bijective
(or a bijection) from E onto F, if it is both injective and surjective :
f : E ↔ F ⇔ |
(f : E → F ∧ ∀y∈F,
∃!: f•(y)) |
⇔ |
(f : E ↪ F ∧ Im f = F) |
⇔ |
(f : E ↠ F ∧ Inj f) |
⇒ |
f -1: F ↔ E. |
In particular IdE : E ↔ E.
Diverse properties
Proposition. Let f : E → F
and g : F → G. Then-
(Inj f ∧ Inj g) ⇒ Inj(g⚬f)
- Inj(g⚬f) ⇒ Inj f
- Im(g⚬f) = g[Im f] ⊂ Im g.
Proofs:
- ∀x,y∈E,
g(f(x)) = g(f(y)) ⇒
f(x) = f(y) ⇒ x = y.
- ∀x,y ∈ E, f(x) = f(y)
⇒ g(f(x)) = (g(f(y)) ⇒
x = y.
- ∀z∈G, z ∈ Im(g⚬f) ⇔
(∃x∈E, g(f(x))=z) ⇔
(∃y∈ Imf, g(y)=z) ⇔ z
∈ g[Imf]. ∎
From 3. we can obviously deduce
Im(g⚬f) = G ⇒ Im g = G
Im f = F ⇒ Im(g⚬f) = Im g
(f:E ↠ F) ⇒ (g:F
↠ G ⇔ g⚬f:E ↠ G)
Let f : E → F, and a function g with Dom g = F.
Then, easily,
g⚬f
= IdE |
⇔ (∀x∈E, ∀y∈F,
f(x) = y ⇒ g(y) = x)
|
|
⇔ Gr f ⊂ tGr g
⇔ (∀y∈F, f•(y)⊂{g(y)}) |
|
⇔ (∀x∈E,
f(x)∈g•(x))
⇔ (Inj f ∧ g|Im f = f -1) |
|
⇒ (Im f = F ⇔
g = f -1 ⇔ (Inj g ∧ Im g ⊂ E)) |
Proof of the non-obvious step : (g⚬f
= IdE ∧ Inj g ∧ Im g ⊂ E) ⇒
(∀y∈F, g(f(g(y))) =
g(y) ∴ y = f(g(y)) ∈ Im f)
Proposition. (f : E → F ∧ g : F
→ E ∧ g⚬f = IdE) ⇒ (Im f = F ⇔
Inj g ⇔ f⚬g = IdF ⇔ g = f -1).
Proof : (Inj g
∧ g⚬f⚬g = g⚬IdF)
⇒ f⚬g = IdF.
∎
Proposition. 1) If f,g are injective and Im f = Dom g,
then (g⚬f)-1 = f -1⚬g-1.
2) If f,h : E→F and g : F→E then
(g⚬f = IdE ∧ h⚬g = IdF)
⇔ ((g : F ↔ E) ∧ f = h = g−1).
Proofs:
1) ∀x∈Dom f, ∀y∈ Im g, (g⚬f)(x) = y
⇔ f(x) = g-1(y) ⇔ x =
(f -1⚬g-1)(y).
Other method: (g⚬f)-1
= (g⚬f)-1⚬g⚬g-1 =
(g⚬f)-1⚬g⚬f⚬f -1⚬g-1
= f -1⚬g-1.
2) We deduce f = h from f = h⚬g⚬f
= h, or from Gr f ⊂ tGr g
⊂ Gr h. The rest is obvious.∎
Canonical functions
For any objects x,y we shall say that x
determines y if there is an invariant functor T (known but
kept implicit) such that T(x) = y. Then the role of y as
free variable can be played by the term T(x), thus by
the use of x.
Similarly, a function f : E → F
will be said canonical if it is defined as E ∋ x ↦ T(x)
where T is some invariant functor.
Examples of important canonical injections
:
- E ⊂ F ⇒
IdE : E ↪ F
- (x ↦ {x}) : E ↪ ℘(E)
- Gr : FE ↪ ℘(E×F).
A bijection f will be
said bicanonical if both f and f−1
are canonical.
When a bijection f : E ↔ F
is canonical (resp. bicanonical), we shall write f : E ⥬ F
(resp. f : E ⇌ F); or, with its defining
functor φ as φ : E ⥬ F which means that (E
∋ x ↦ φ(x)) is injective with image F.
We shall write E ⥬ F (resp. E ⇌ F)
to mean the existence of a canonical (resp. bicanonical) bijection,
that is kept implicit. These will often look
like equality properties on numbers, as the existence of a bijection between
finite sets implies the equality of their numbers of elements.
Canonical bijections can fail to be bicanonical especially when their defining functor
is not injective:
V2E ⥬ ℘(E)
E ≠ ∅ ⇒ {x}E ⥬ {x}
E×{x} ⥬ E
E×F ⇌ F×E
E×{x} ⇌ {x}×E
⇌ E{x}
E×{0} ⇌ E.
This meta-relation between sets is preserved by
constructions: if E ⥬ E′ and F ⥬
F′ then ℘(E) ⥬ ℘(E'), E×F ⥬
E′×F′ and FE
⥬ F′E′ (using the direct image of the graph), and the same for ⇌.
This is how transposition (E×F ⇌ F×E) extends to both
graphs and operations:
℘(E×F) ⇌ ℘(F×E)
GE×F ⇌ GF×E
∀f∈GE×F, tf
= (F×E∋(x,y) ↦ f(y,x)) ∈
GF×E.
Sums of functions
The binder ∐ of sum of sets
defines canonical bijections (whose inverses, defined by R ↦
R⃗I, depend on I):
(℘(E))I ⥬ ℘(I×E)
∏i∈I ℘(Ai)
⥬ ℘(∐i∈I Ai)
Any sum of sets S = ∐i∈I Ai
has its family of natural injections ji given by the curried
ordered pair definer :
∀i∈I, ji : Ai
↪ S.
The sum over I of functions fi
where ∀i∈I, Ai = Dom fi is defined by
∐i∈I fi
= (S ∋ (i,x) ↦ fi(x))
g = ∐i∈I fi ⇔
(Dom g = S ∧ ∀i∈I, fi =
g⚬ji = g⃗(i))
It is the inverse of the currying of binary operations, which we shall denote like for graphs:
for any sets E, F and any operation (function) B with domain E×F ≠ ∅
(from which E and F can be restored as E = Dom Dom B and
F = Im Dom B, while if E×F = ∅ the notation can be abusively
kept while taking E and F from context),
B⃗ = (E ∋ x ↦
(F ∋ y ↦ B(x,y)))
B⃖ = tB⃗ = (F ∋ y ↦
(E ∋ x ↦ B(x,y))
Hence the canonical bijections (bicanonical if I = Dom S, thus if E ≠ ∅ in the
case I×E)
∀SetF, ∏
i∈I
FAi ⥬ FS
∀SetE,F,
(FE)I ⥬ FI×E
(∀F : S → Set) ∏
i∈I
∏
x∈Ai
F(i,x) ⥬ ∏
c∈SFc
(E×F)×G ⇌ E×F×G
f ∈ (FE)I ⇒ ∐
i∈I
Gr fi ⊂ ℘(I×(E×F))
⇌ ℘((I×E)×F) ⊃ Gr ∐
i∈I
fi
Product of functions or recurrying
Both curried forms R⃗ and R⃖ of graphs R ⊂
E×F are exchanged by a bijection ℘(F)E
↔ ℘(E)F defined with parameter F.
The similar bijection between both curried forms of operations with a given domain
I×E, is defined by the binder ⊓ of product of functions all with domain
E, and indexed by I :
(∀i∈I, Dom fi =
E) ⇒ ⊓
i∈I
fi = (E ∋ x ↦
(fi(x))i∈I)
∀g, g = ⊓
i∈I
fi ⇔ (g : E →
∏
i∈I
Im fi ∧ ∀i∈I, fi
= πi⚬g)
(F : I → Set) ⇒ ⊓ : ∏
i∈I
FiE ⥬ (∏
i∈I
Fi)E
Set F ⇒ ⊓ : (FE)I ⥬
(FI)E
These bijections are canonical for nonempty families (I
≠ ∅) as only these determine E, and are usually bicanonical as their inverse uses
I which is definable when E ≠ ∅; anyway we shall abusively keep the
notation ⥬ for the general case while E remains fixed by the context.
A particular case is the binary product (I = V2):
∀Fncf,g, Dom f = Dom g = E ⇒ f⊓g =
(E ∋ x ↦ (f(x),g(x)))
IE×FE ⇌
(I×F)E
(∐
i∈I
Fi)E ⇌
∐
ϕ∈IE ∏
x∈E
Fϕ(x)
∀Fncf, Gr f = Im(IdDom f ⊓ f)
2.9. Binary relations on a set
A binary relation on a set E is a relation with both domains equal to E,
thus formalized by a graph R ⊂ E×E. Let us abbreviate
E×E as E2 and (x,y)
∈ R as x R y
Preimages and products
For any function f : E → F and any binary relation R on
F, the preimage of R by f
is the binary relation f⋆(R) on E defined as
{(x,y)∈E2 | f(x)
R f(y)}
For any family of a binary relations (Ri)i∈I
on respective sets Fi,
their product is the binary relation on P = ∏i∈I
Fi defined as
{(x,y)∈P2 | ∀i∈I,
xi Ri
yi} = ⋂i∈I
πi⋆(Ri).
(It is actually in canonical bijection with the product of sets
∏i∈I Ri)
In particular with a constant family : any binary relation on a set F defines
one on any FI.
Both operations can be combined in one step: for any family of functions
fi : E → Fi, the relation
{(x,y)∈E2 | ∀i∈I,
fi(x) Ri fi(y)}
= ⋂i∈I fi⋆(Ri)
is the preimage of the product relation of the Ri on P, by the function
⊓i∈I fi : E → P.
These concepts will be later generalized to other first-order structures.
Basic properties
A binary relation R on a set E will be said
reflexive ⇔ |
∀x∈E, xRx ⇔ 𝛿E
⊂ R ⇒ Dom R = Im R = E |
irreflexive ⇔ |
∀x∈E, ¬(xRx) |
symmetric ⇔ |
(∀x,y∈E, xRy ⇒ yRx)
⇔ R ⊂ tR
⇔ R = tR ⇔ (R⃗)
= (R⃖) |
antisymmetric ⇔ |
∀x,y∈E, (xRy ∧
yRx) ⇒ x = y |
transitive ⇔ |
∀x,y,z∈E, (xRy
∧ yRz) ⇒ xRz |
These properties are preserved by transposition, preimages and products,
except that the preimage of an antisymmetric relation by a non-injective function
may not be antisymmetric, and the empty product (I=∅) is not irreflexive.
In particular for F ⊂ E, the restriction of a relation R ⊂
E×E to F, namely R∩(F×F), is its preimage
by IdF : F ↪ E, thus preserves all properties.
Example. Let A ⊂ EE and R =
⋃f∈A
Gr f. Then
(IdE
∈ A) ⇒ |
R is reflexive |
(∀f,g∈A, g⚬f
∈ A) ⇒ |
R is transitive |
(∀f∈A, f : E↔E ∧
f -1∈ A) ⇒ |
R is symmetric |
For any transitive binary relation R we denote x R
y R z
⇔ (x R y ∧ y R z)
⇒ x R z.
Preorders and orders
A preorder is a reflexive and transitive binary relation. An order is
an antisymmetric preorder.
A preordered set is (an ordered pair of) a set with a chosen preorder on it.
An ordered set is a set with a chosen
order, usually written as ≤ or ≤E. The formula x ≤ y
can be read «x is less than y»,
or «y is greater than x».
Two elements x, y of an ordered set are said comparable
when (x≤y ∨ y≤x).
On the class of sets, the meta-relation ⥬ (existence of a canonical bijection) is a meta-preorder.
The Boolean pair V2 is naturally ordered by ⇒ which coincides
there with the usual order between numbers.
The resulting product order on V2E ⥬ ℘(E)
(and thus any set of sets) is that of inclusion (⊂).
Any relation R ⊂ E×F defines a preorder on E
as the preimage by R⃗ of the inclusion order:
{(x,y)∈E2 |
R⃗(x) ⊂ R⃗(y)}.
Conversely, any preorder R on a set E is so definable from itself in both
ways (exchanged by transposition):
∀x,y∈E, R⃗(y) ⊂
R⃗(x) ⇔
x R y ⇔ R⃖(x) ⊂ R⃖(y)
Proof. Transitivity is equivalent to (∀x,y∈E,
x R y ⇒ R⃗(y) ⊂
R⃗(x)) and to (∀x,y∈E,
x R y ⇒ R⃖(x) ⊂
R⃖(y)).
Reflexivity is equivalent to the converses:
∀x∈E, (∀y∈E, R⃗(x) ⊂
R⃗(y) ⇒ y R x) ⇔
x R x ⇔ (∀y∈E, R⃖(x)
⊂ R⃖(y) ⇒ x R y).
∎
Thus R is the preimage of ⊂ by (R⃖). Moreover,
∀x,y∈E, R⃖(x) =
R⃖(y) ⇔ (R⃖(x)
⊂ R⃖(y) ∧ R⃖(y)
⊂ R⃖(x)) ⇔ (xRy
∧ yRx)
so that (R⃖) is injective if
and only if R is an order.
The idea of considering only one order written ≤ on each set E, may be formalized by
replacing the ordered set (E,R) by (Im(R⃖), ⊂), though this
does not exactly fit the following convention : any subset of an ordered set will be implicitly also
ordered by restriction.
Strict and total orders
A strict order is a binary relation both
transitive and irreflexive; and thus also antisymmetric.
Strict orders < bijectively correspond to orders ≤ by x < y
⇔ (x ≤ y ∧ x ≠ y).
x ≤ y ⇔
(x < y ∨ x = y).
A total order on a set E
is an order where all elements are comparable
(∀x,y∈E,
x ≤ y ∨ y ≤ x), i.e. (≤ ∪ t≤)
= E×E.
Equivalent definitions:
- an order ≤ related with its strict order < by ∀x,y∈E,
x < y ⇎ y ≤ x.
- a transitive relation ≤ such that ∀x,y∈E,
x ≤ y ⇔ (y ≤ x ⇒ x=y).
A strict total order on a set E is a strict order associated with a total order, i.e.
any transitive relation < such
that, equivalently ∀x,y∈E, x < y
⇎ (y < x ∨ x = y)
∀x,y∈E, x = y ⇎
(y < x ∨ x < y)
Equivalence relations
An equivalence relation is a symmetric preorder.
On any set E, the equality
relation 𝛿E is an equivalence relation (as already expressed by the
axioms of equality), and the
only relation altogether symmetric, antisymmetric and reflexive.
On any product of sets, the product of equality relations is also equality.
The preimage of equality by any f : E → F is an equivalence
relation on E we shall write ∼f :
∼f = {(x,y)∈E |
f(x) = f(y)} = ∐(f•⚬f)
Inj f ⇔ ∼f = 𝛿E
Conversely, any equivalence relation R is so definable from itself : R =
∼R⃗. Indeed
∀x,y∈E, x R y ⇔
R⃗(y) ⊂
R⃗(x) ⇔ R⃗(x) ⊂
R⃗(y) ⇔ R⃗(x) = R⃗(y)
Quotient functions
Proposition. ∀SetE,F,G, ∀f, (f :
E ↠ F) ⇒ (GF ∋ g ↦ g⚬f) :
GF ↔ {h∈GE |
∼f ⊂ ∼h}.
Proof : ∀...∀f∈FE,
∀h∈GE, H = Im(f×h) ⇒
∼f ⊂ ∼h ⇔ (∀x,x'∈E,
f(x) = f(x′) ⇒
h(x) = h(x′)) ⇔ (∀(y,z),
(y′,z′) ∈ H, y=y′ ⇒ z=z′)
Dom H = Im f ∧
∀g∈GF, h = g⚬f ⇔ H
⊂ Gr g. ∎
For any functions f,h such that Dom f = Dom h ∧ ∼f
⊂ ∼h, this unique g : Im f ↠ Im h such that
h = g⚬f
will be called the quotient of h by f:
h/f = (Im f ∋y↦ ℩h[f•(y)])
Gr (h/f) = Im (f×h)
Inj f ⇒ (f -1 =
IdDom f /f ∧ h/f = h⚬f -1)
∼f
= ∼h ⇔ Inj(h/f) ⇒ (h/f)-1 = f/h
In the more general case Im f ⊂ Dom g (without
surjectivity of f),
h = g⚬f ⇒ (∼f ⊂ ∼h
∧ h/f = g|Im f)
⇒ (∼f = ∼h
⇒ f = (g|Im f)−1⚬h)
The formula that ∼f is an equivalence relation, ∀x,y ∈ Dom f,
x ∼f y ⇔
∼⃗f(x) = ∼⃗f(y)
where (∼⃗f)
= f•⚬f, reflects the injectivity of (f•)
which can be directly seen as
∀y∈Im f, ∀z, f•(y)
= f•(z) ⇒ f•(y) ∩ f•(z)
= f•(y) ≠ ∅
⇒ y = z.
Partitions
A partition of a set E is a set of nonempty, pairwise disjoint sets, whose
union is E.
In other words, it is a set P of subsets of E such that IdP is an
indexed partition of E.
The image of any indexed partition is a partition of the same set: for
any function f with image E, Im(f•) =
f•[E] = Im(f•⚬f) is a partition
of E.
∀R ⊂ E×E, if P
= R⃗[E] then
(∀x,y∈E,
x∈R⃗(y) ⇔
R⃗(x) = R⃗(y))
⇔ (∀x∈E, ∀A∈P, x∈A ⇔
R⃗(x) = A) ⇔
IdP = (R⃗E•)
Thus if R is an equivalence relation then P is a
partition.
Inversely, any partition P of E defines a function
g and then a graph R by
Dom g = E ∧ (g•) = IdP
∴ P = Dom (g•)
= Im g
R = ∐g ⊂ E×E ∴ g
= R⃗E ∴ (P
= R⃗[E] ∧ IdP =
(R⃗E•))
thus R is an equivalence relation on E and
∀x,y∈E,
xRy ⇔ y∈ ℩{A∈P | x∈A} ⇔
(∃A∈P, x∈A ∧ y∈A) ⇔ (∀A∈P,
x∈A ⇒ y∈A).
For any equivalence relation R on E, the partition
R⃗[E] = Im (R⃗) is
called the quotient of E by R, and written E/R, so that
(R⃗) : E ↠ E/R.
For any x ∈ E, the set R⃗(x) = ℩{A∈E/R
| x∈A} is called the class of x by R.
For any function f such that Dom f = E ∧ R
⊂ ∼f, let us abbreviate f/(R⃗) as f/R:
f/R : E/R ↠ Im f
Inj f/R ⇔
R = ∼f
So when R = ∼f the role of E/R may be played by
Im f.
In particular, any preorder R on a set E gives an equivalence relation
R∩tR, and the role of
E/(R∩tR) may be played by Im (R⃖).
2.10. Axiom of choice
Properties of curried composition
Proposition. ∀Set E,F,G,
∀f ∈ FE,
- Im f = F ⇒ Inj(GF ∋ g ↦ g⚬f)
- (Inj(GF ∋ g ↦ g⚬f) ∧ ∃≥2:G)
⇒ Im f = F
- (Inj f ∧ G ≠ ∅)
⇒ {g⚬f | g∈GF} = GE
- ({g⚬f | g∈GF} = GE
∧ ∃≥2:G) ⇒ Inj f
- (E ⊂ F ∧
G ≠ ∅) ⇒ {g|E |
g∈GF} = GE
- (Inj f ∧ E ≠ ∅) ⇒ ∃g∈EF,
g⚬f = IdE
Proofs :
1. done in 2.9 ; simpler proof:
∀g,h∈GF,
(∀x∈E, g(f(x)) = h(f(x)))
⇔ (∀y∈F, g(y) = h(y)) ⇔ g=h
2. ( ) ⇒ ∃z≠z′∈G, (F∋y ↦ z)⚬f =
(F∋y ↦ (y ∈ Im f ? z : z′))⚬f
∴ (∀y∈F, y ∈ Im f ∨ z = z′) ∴
Im f = F
3. Inj f ⇒ ∀z∈G, ∀h∈GE, h =
(F ∋ y ↦ (y ∈ Im f ?
h(f -1(y)) : z))⚬f
4. ∀z≠z′∈G, ∀x∈E, ∀g∈GF,
g⚬f = (E∋y↦
(y = x ? z : z′)) ⇒ ∀y∈E, f(y) = f(x) ⇒ g(f(y))
= g(f(x)) = z ⇒ y = x.
5. and 6. are particular cases of 3.∎
The case G = V2 gives the following properties of h =
(℘(F)∋B↦ f⋆(B)) :
Im f = F ⇔ Inj h
Inj f ⇔ Im h
= ℘(E)
Moreover {f[A]|A ⊂ E} = ℘(Im
f) because ∀B
⊂ Im f, f[f⋆(B)] = B.
Proposition. For any sets E, F and any function g with
domain F,
- Inj g ⇒
Inj(FE
∋ f ↦ g⚬f)
- (Inj(FE
∋ f ↦ g⚬f) ∧ E ≠ ∅) ⇒ Inj g
- ∀SetG, ({g⚬f | f∈FE} =
GE ∧ E ≠ ∅) ⇒ Im g = G.
Proofs:
1. ∀f,f'∈FE, (Inj g ∧
∀x∈E, g(f(x)) = g(f'(x)))
⇒ (∀x∈E, f(x) = f'(x)).
2. ∀y,z∈F, g(y) =
g(z) ⇒ g⚬(E ∋ x ↦ y) =
g⚬(E ∋ x ↦ z) ⇒ (∀x∈E,
y = z) ⇒ (E ≠ ∅ ⇒ y = z).
3. (left as an exercise to the reader).∎
Axiom of choice over a set (ACX)
The axiom of choice (AC) is a statement expressible in any set theory with
powerset, which «feels true» and is convenient to accept as an axiom; but we will
actually not need it in the rest of this work.
It has several equivalent formulations (in clear, it is the
common name of these equivalent statements). The main ones fit the common format
(∀SetX, ACX) where
ACX means any of the following equivalent formulas with
the free variable X in the class of sets:
- (AC1X) ∀SetE, ∀R⊂X×E,
(∀x∈X, ∃y∈E, xRy) ⇒ (∃f∈
EX, ∀x∈X, xRf(x)).
- (AC2X) ∀A : X → Set, (∀x∈X,
Ax ≠ ∅) ⇒ ∏A ≠ ∅
(The product of any family of nonempty sets indexed by X is nonempty)
- (AC3X) ∀Fnc g, X ⊂ Im g
⇒ ∃f ∈ (Dom g)X, g⚬f = IdX
-
(AC4X) ∀grR, X ⊂ Dom R
⇒ ∃f ∈ (Im R)X, Gr f
⊂ R.
(AC3X) and (AC4X)
are more often used in cases of equality (respectively X = Im g and X = Dom R)
which are equivalent to the general cases, as the proofs of equivalence work the same :
- (AC1X)⇒(AC2X) by R =
∐ A ∧ E = Im R ;
- (AC2X)⇒(AC3X) by
A = (g•(x))x∈X ;
- (AC3X)⇒(AC4X) :
X ⊂ π0[R] ⇒ ∃(h×f) ∈ RX,
h = IdX ∴ Gr f = Im (h×f) ⊂ R ;
- (AC4X)⇒(AC1X) obvious.∎
Other easy deductions are - (AC2X)⇒(AC1X)
by Ax = R⃗(x) ;
- (AC4X)⇒(AC3X) by
R = tGr g.
Dependencies between diverse ACX
A big achievement of mathematical logic was to prove the independence of AC
from the rest of set theory: even the very strong ZF system,
if consistent, can neither prove nor refute it, as each universe where AC is true includes
another one where it is false, and vice versa. This will be further commented in 5.5
(without the details of the proofs which are much too difficult).
The diverse ways in which AC may fail in different universes, can be partly sorted out by
specifying for which X does ACX hold.
The different ACX depend on each other as follows:
- ∀SetX,Y, (ACX ∧
∃f, f : Y ↪ X) ⇒ ACY
- ∀SetX,Y, (ACX ∧ ACY) ⇒
(ACX∪Y ∧ ACX×Y)
- (ACI ∧ ∀i∈I, ACAi)
⇒ AC∐i∈I Ai
The proofs of these implications are easy and left as an exercise to the reader.
Finite choice theorem. If X is finite then ACX.
This will be easily deducible from the case of singletons AC{x} (which is trivial)
and the preservation by binary union (ACX ∧ ACY) ⇒
ACX∪Y once finiteness will be formalized (4.6).
Now let us just suggest a schema of proofs, namely one for each number of
elements of X. For example the case of X with 3 elements comes down to
(AC2V3) whose proof can be written
∀SetX,Y,Z,
(X ≠ ∅ ∧ Y ≠ ∅ ∧ Z ≠ ∅) ⇒ (∃x∈X, ∃y∈Y,
∃z∈Z, (x,y,z) ∈ X×Y×Z)
⇒ X×Y×Z ≠ ∅.∎
More statements simply equivalent to AC
Theorem. The following statements are equivalent
to the axiom of choice:
(AC5) Any set E of nonempty sets has a choice function: ∅ ∉ E ⊂ Set
⇒ ∏ IdE ≠ ∅.
(AC6) For any partition P of a set E,
∃K⊂E, ∀A∈P, ∃!: K∩A.
(AC7) For any sets E,F,G and any
g: F ↠ G, {g⚬f | f
∈ FE} = GE.
Proofs:
(AC2)⇒(AC5) is obvious ;
(AC5)⇒(AC6) : (x∈ ∏ IdP ∧ K
= Im x) ⇒ (K⊂E ∧ ∀A∈P,
xA ∈ K∩A ∧ ∀B∈P,
xB ∈ A ⇒ A∩B ≠ ∅ ⇒ A = B)
(AC6)⇒(AC3) : ∀Fnc g, X ⊂ Im g ⇒ ∃K⊂Dom g,
(∀A∈ Im(g•), ∃! : K∩A)
∴ ∃f ∈ (Dom g)X, (∀x∈X, {f(x)}
= K∩g•(x)) ∴ g⚬f = IdX.
(AC1E)⇒(AC7) : ∀h∈GE,
(∀x∈E, ∃y∈F, g(y) = h(x)) ⇒
(∃f∈FE, ∀x∈E, g(f(x)) = h(x))
(AC3G)⇒(AC7) : ∃j∈FG,
g⚬j = IdG ∴ ∀h∈GE,
j⚬h ∈ FE ∧ g⚬j⚬h = h.
(AC7)⇒(AC3) : ∀SetX, ∀Fncg,
X = Im g ⇒ IdX
∈ XX = {g⚬f | f ∈ (Dom g)X}. ∎
Note: (AC5)⇒(AC2) is also easy : ∅ ∉ Im A ⊂ Set ⇒ ∃f ∈ ∏ IdIm A,
f⚬A ∈ ∏ A.
There are many more statements equivalent to AC, which will not be of any use in this work,
and for which the proofs of equivalence are much harder. Interested readers can find about
them elsewhere.
2.A. Time in set theory
Standard universes
Given two universes
U1, U2, with U1 ⊂
U2, we shall define 3 distinct versions for the concept of
U1 being standard in U2,
also called a sub-universe of U2. We first informally introduced
such a concept in 1.7
as the preservation of the roles of sets and functions; then in 1.B
with the other meaning (STF) of standardness of ℕ, i.e. the interpretation of finiteness (applicable
to any universe as mentioned in 1.D).
The general idea is that the structures of U1 (the considered interpretation
of symbols of set theory in U1 to make it a universe) must be the restrictions
of those used for U2. But it depends which structures we mean, and how we
define their restrictions. From the property of preservation by restriction from U2
to U1 we shall specify for each version, more preservation properties on
other symbols implicitly follow, thanks
to the axioms of set theory in U2. Any structure defined using only symbols
with preserved meaning, is also preserved. The structures not preserved are anyway
determined in U1 as defined from those of U2
with parameter U1 (as a set or a unary predicate).
This is why U1 only needs to be specified as a meta-subset or class in
U2, keeping its structures implicit.
Let us focus on specifying what each version of standardness requires on sets (to which
the diversity of versions is reducible, as the conditions
on functions follow in the way essentially given by the fate of their graphs as sets, after
naturally defining standardness for ordered pairs). The first structure for them is
their class Set. The condition for this is that Set1 = Set2 ∩
U1. Yet, arguing that Set is not really a structure but only a
source of definiteness classes for effective structures, this condition may be weakened as
Set1 ⊂ Set2 ∩ U1 without
essentially affecting the main 3 degrees of standardness described below.
The
weak version (coming when seeing the role of sets as given by ∈) is
(ST) : ∈1 = ∈2 ∩ (U1× Set1).
Actually, this condition will suffice to imply (STF) once adopted the axiom of foundation
for U2 and further reasonable conditions which will be detailed later.
Otherwise we need to add it as another requirement in the concept of standardness, in
the form ℕ1 = ℕ2 (if the axiom of infinity is
adopted in both) or anyway in terms of the class Fin of finite sets as Fin1 ⊂
Fin2 which is equivalent to Fin1 = Fin2 ∩ Set1
(for reasons a bit subtle).
The intermediate version (mentioned in 1.7: every set coincides with the meta-set of the
same elements) is
(ST') : ∈1 = ∈2 ∩
(U2× Set1) ⊂ U1× Set1
i.e. every set in U1 is interpreted by U2 as
included in U1. This implies the preservation of bounded quantifiers
(if the sub-formula keeps its values, namely does not involve ℘) and of any operator
giving a set determined by the data of its elements, namely the operators given by the set
generation principle. Special constructions exist (such as those of internal set
theory) of sub-universes not fitting this condition and still preserving bounded quantifiers
on formulas with parameters in U1; but the equivalence holds with
the faithfulness of bounded quantifiers on formulas with parameters
in U2 (as we shall see when justifying the set
generation principle in 2.B).
The strongest version of standardness also requires the preservation of the powerset symbol ℘ (2.7):
(STP) : ℘1 = ℘2|Set1
This additional condition can also be written without explicit use of ℘ as
(ST") : (ST) ∧ ∀E∈Set1, ∀F∈Set2,
F ⊂2 E ⇒ F ∈ Set1
These conditions are related by (ST") ⇔ ((ST') ∧ (STP)) ⇒ (STF).
Indeed, (STF) comes by the fact finiteness is
definable from ℘ (independently of any additional assumption which was needed with the mere
(ST)) ; this can also be seen considering that arithmetic is specified
as a second-order theory,
which can be faithfully interpreted by set theory with ℘.
The independence of the axiom of choice (further commented in 5.5) and other statements
involving ℘, can be roughly explained by the possibility to construct universes
which do not preserve the essential meaning of ℘. Following convenience, one such
construction happens to fit (ST'); others may fit (ST) ∧ (STP) but not (ST') while
(ST')-standard copies of their results also exist.
Ideally standard universes
For any 3 universes U1 ⊂ U2 ⊂ U3
where U2 is standard in U3,
(U1 is standard in U2) ⇔ (U1
is standard in U3).
Thus the idea to consider the standardness of U1 as an absolute
property, independent of the external universe in which it is described... provided that
this external universe is itself standard. This does not define standardness as an
absolute concept (which cannot be defined anyway), but suggests that such
a concept may make ideal sense.
A sub-universe U1 of U2 will be qualified as small in
U2 if it is known as a set there. This can be written U1
∈ Set2 if U2 is also itself (ST')-standard, otherwise
∃X∈Set2, U1 = ∈⃖2(X).
To complete the concept of ideally standard universe, let us postulate that every two
standard universes are small sub-universes of a third one.
This ideal is quite reasonable for (ST) and (ST'); but for (ST") it is somewhat more daring.
As a first reason, it can be philosophically argued that any universe (standard or not) can be taken
as a small sub-universe of another in the sense (ST'), but not always in the sense (ST") because seeing
it as a set may provide means to define more subsets of given sets. In particular, any
(ST")-standard universe being a small (ST") sub-universe of another, implies its fulfillment of the
specification schema.
The realistic view of set theory
From now on we shall assume a fixed version of "standardness" meant as either (ST')
with a set theory without powerset, or (ST") with a set theory with powerset. In
either case, all bounded expressions keep the same values from a
standard universe to another, thus called their standard values.
The above concepts provide a realistic meaning for set theory :
- Generic theories, seen axiomatically, are meant to be interpreted in a fixed model
containing all values of variables and expressions;
- But set theory aims to "locally" interpret in the standard way any expression with
fixed values of its free variables, independently of the choice of a standard
universe, containing these values, where this interpretation may be processed.
In the next section (2.B) the meaningfulness of the (ST") ideal will appear philosophically
equivalent to the realism of admitting ℘ in set theory.
Let us formalize some aspects of the plurality of standard universes which set theory aims to describe
in its realistic view.
Standard multiverses (expansion of the set theoretical universe)
Let us call quasi-standard multiverse (resp. standard multiverse)
any collection (range) of universes (resp. standard universes), where any 2 of them are
small sub-universes of a third one. Let us require it to also contain all sub-universes
of its members, a quality easy to fulfill, taking a multiverse where it did not hold and
adding all these sub-universes to it.
We shall say that the set theoretical universe expands when it
ranges over a standard multiverse. For any quasi-standard multiverse
M, its union
U = ⋃M forms another universe containing all members of M
as small sub-universes. Then, for any expression with any values of its free variables in
U there exists some U∈M containing these values and thus able
to interpret the expression; this interpretation being independent of the chosen U,
defines that of U. Therefore,
(M is a standard multiverse) ⇔ (the universe U is standard).
But U cannot belong to M, because M has no greatest element. So,
no single standard multiverse can ever contain all possible standard
universes.
For a (ST")-quasi-standard multiverse, keeping in U the same
powerset of a given set as in member universes, may lead to the failure of the specification
schema in U.
Here comes another side of the postulate attached to the ideal (ST")-standardness :
that with (ST")-standard multiverses, this failure never happens.
Can a set contain itself ?
Let us call a set reflexive if it contains itself.
The proof
of Russell's paradox [1.8], shows that the class of non-reflexive sets
(Set(x) ∧ x ∉ x) cannot be a set (as it is not included in any set).
So, many sets are non-reflexive ; but can reflexive sets exist ? this is undecidable; here is why.
From any universe U with a nonempty class X of reflexive sets, another universe
containing none can be made by
2 different methods
- Take U\X as class of objects, and turn into urelements all sets E which
contained a reflexive set (∃x∈E, x∈x), and all functions whose
domain or image was such an E. (These sets cannot be rebuilt from the
data of their elements, since each one has an element removed from the universe,
namely itself ; any class of urelements can also be eliminated in the same way, turning
other objects into urelements).
- Progressively rebuild a universe from scratch: this avoids all reflexive sets. Namely, from a given
universe, first accept its urelements, then also accept, progressively along an endless
"time", every set only made of previously accepted objects, and every function only
involving such objects. Each set so accepted at some time, is formed as a collection of previously
accepted objects. At the first time a set is so accepted as a collection
of previously available objects, it could not be available yet as a possible element, thus it
cannot be reflexive.
As the reflexivity of a set is independent of context, a union of
universes each devoid of reflexive sets, cannot contain one either.
On the other hand, universes with reflexive sets can be produced as
follows:
Riddle. What is the difference between
- a universe with an urelement x and a non-reflexive set y
such that x ∈ y,
- and a universe with a reflexive set x and an urelement y
such that y ∉ x ?
Answer (click to show) :
The role of the set containing x but not y,
played by y in the former universe, is played by x
in the latter.
The axiom of foundation, introduced in 5.3
(well-foundedness), will imply the absence of reflexive sets as a particular case ; its
undecidability can be proven in ways essentially similar to the
above arguments. But for most purposes, this axiom is as useless as the sets it excludes.
2.B. Interpretation of classes
The relative sense of open quantifiers
As the set theoretical universe expands,
the values of decidable statements
(under the given axioms) must of course stay
constant. However, any undecidable statement, as well as any other formula containing
open quantifiers and given with fixed values of its free
variables (let us call these "open formulas"), may change value from one universe to another.
But if any value of an open formula can only be "the current one" relative to the choice of
this universe, the question whether it is constant or variable, that is how
things go «elsewhere», is relative to a choice of multiverse. But the realistic view of set theory means to
neither assume a fixed universe nor a fixed multiverse. Thus, the natural way for it is to
abstain from giving any value to open formulas (beyond decidable statements with known
proof or refutation). Indeed, tries to do better would be rather hopeless due to how messy
the situation is. Let us review its diverse aspects.
In a multiverse of «all universes» (standard or not) fitting a given
consistent axiomatic set theory, the situation was explained in Part 1 :
- The variability of
value of a statement there, is equivalent to its undecidability (completeness theorem);
- However, this undecidability, when true, cannot anyway be proven by the
same axiomatic set theory (incompleteness theorem).
The realistic view refering to standard universes, leads to another messy
situation, first by lack of formal definition for the ideal concept of standardness.
Trying to materialize the concept of "real indefiniteness" of statements as their variability
in a standard multiverse M, the behavior of this M is determined
by its union U, as M is the range of all small sub-universes of U;
this allows to express relevant questions as statements in U.
For example, the variability in M of an existential
statement (∃x, A(x) where A is a bounded formula)
means there exists in M a universe U where it is false (∀x∈U,
¬A(x)), and another universe U' where it is true (∃x∈U',
A(x)). But then it is also true in any standard universe containing both U and
U', among which some other members of M, and U itself. Such a
statement may be called «ultimately true».
But since some statements may vary between acceptable standard universes
U, some of these
may still vary between the U of acceptable standard multiverses.
Indeed, U is just another universe which can be axiomatically described,
but only incompletely, for both formal reasons (the incompleteness theorem) and
realistic ones (there is no ultimate standard multiverse).
Just more statements can be made decidable for U than for U
by describing U with a stronger theory T than the theory T
given to describe U. Such a stronger T can already be obtained by
formalizing this description of U as the union of a quasi-standard multiverse
whose members satisfy T (while the standardness of U escapes formalization).
Behind the undecidability of a statement (existential or other), multiple situations
may occur. It might (I guess) be always true in one quasi-standard multiverse, and
always false in another, while, depending on the statement, either of them
(but not both) may be standard. It may vary or not vary between standard universes,
but this variability, itself expressible as a statement in the union of a standard
multiverse, may be itself undecidable, and eventually vary between standard multiverses.
The undecidability of some statements such as AC, usually reflects the variability,
which may also happen among (ST')-standard universes, of the values of the open
quantifiers used in these statements when
re-written
without ℘. In particular, a quantifier ∀x∈℘(E) appears
as an open quantifier on the class of subsets of
E, while other uses of ℘ either also involve this once translated, or cannot even
be translated when the powerset axiom does not hold.
The indefiniteness of classes
Unlike objects which can be compared in formulas by the = symbol,
the meta-relation of equality between classes is as
indefinite as the open ∀ since both concepts are definable from each other :
- The equality of classes A = B would be
defined by ∀x, A(x) ⇔ B(x);
- The statement (∀x, A(x))
means A = 1 (the universe, class of all objects).
This indefiniteness can be understood by remembering that, depite being
usable as ranges, classes are only meta-objects, basically given as predicates,
i.e. synctactically (as a formula with parameters).
Like with open quantifiers, this leaves us with both concepts
of provable equality (or proven equality), and provable
inequality, according to the status (provable or refutable) of this
statement of "equality" of classes
(∀x, A(x) ⇔ B(x)).
Each universe U interprets each class C as a meta-set of
objects P = {x∈U | C(x)}, and sees it
as a set when P ∈ U. This condition is expressible by set theory in
U as a statement S(C), written equivalently as
∃P, ∀x, C(x) ⇔ x ∈ P
∃E, ∀x, C(x) ⇒ x ∈ E
The equivalence is because
P can be defined from such an E by P = {x∈E | C(x)}.
Having two open quantifiers, it is "more indefinite" than the equality between classes.
Classes and sets in expanding universes
Aside this formulation S(C) in a fixed universe, let us analyze this
distinction of sets among classes (and thus what makes other classes indefinite),
from the perspective of a quasi-standard multiverse.
There, this concept has 3 possible versions, ordered only approximately by implications which
"often work" but have exceptions.
- The "weak" version is to claim S(C) true in all members of this multiverse.
That means C always coincides with some set P, but this may be a different
P from one universe to another.
- The "strong" version requires P to stay constant. This can be seen as the
meaning behind the formal choice of naming P by a symbol of set theory, as
was discussed with the powerset (2.7). But just the constancy of P =
{x∈U | C(x)} when U varies, needs not imply the above
weak version, as there can exist some universe U (among the smallest of this
multiverse) such that P ∉ U. This means P is arising to existence
with U, and will anyway exist as a set in all the "future" (larger) universes which see
U as a set. So, a class C is not considered a set in this "strong"
sense, if it remains able to contain «unknown» or «not yet existing» objects (in some future
universe), that would only belong to some larger future value of P, which can thus vary
with the growth of U.
- The middle version (implied by the strong one) is to interpret S(C) in the
union U of the multiverse. Deciphered there, it says that, while U expands,
«there exists a time from which P stays constant» (and thus after which it will exist
as a set). So, this just differs
from the strong version by ignoring any "past" variations (which could occur during some
"beginning" of the expansion), to focus on the ultimate behavior (among the latest, i.e. largest
universes, somehow "closest" to the U where S(C) is interpreted).
But the ultimate realistic meaning of set theory is to refer to the range of "all standard
universes", which differs from the concept of standard multiverse in that this range is not a set
but a class. So, the concept of union cannot apply to this range, and yet its impossible result
would form the ideal interpretation of set theory. This ideal can be well approximated by
interpretations of set theory by either fixed universes or quasi-standard multiverses
(whose existence is ensured by the completeness theorem). Both perspectives, in terms of a
fixed vs. a variable universe, alternatively transcend each other endlessly along its expansion.
This explains how any intended set theory can be formalized by mere first-order axioms:
if any intended property of the universe was only expressible by a second-order statement,
or if anyhow its expression involved
external objects (regarding this universe as a set), then it could be re-expressed
by moving the framework, as the stronger first-order axiom of existence of
a sub-universe of this kind, and why not also endlessly many of them, forming a
standard multiverse (stating that every object is
contained in such a sub-universe).
Let B be a quantifier defined by a bounded formula whose parameters take values
in a sub-universe U of any other universe we might work in.
Let E be the range of all values taken by the argument y of A when
interpreting (By, A(y)). It is independent of A and
included in U insofar as A is absent from any expression of y in
the defining formula. Typical exceptions would be when B involves a
formula of the form A(t({x∈F|A(x)}))
for some term t, in which case defining E as the set of possible
values of y would require to assume ℘ and the (ST")-standardness of U.
Without these assumptions, such exceptional formulas need to be excluded from the definitions
of B for the justifications to work. Anyway they are excluded by understanding,
as explained in 2.A-2.B, the set-builder
as the notation for the operator symbols from
an infinite list, one for each formula using the language of set theory to which A
does not belong. (The set-builder is needed to put a formula inside a term, unless the
conditional operator is accepted as primitive, which is acceptable, but then, function
definers using A by the conditional operator are excluded for the same reason)
Let C(x)
defined as (By, y = x).
By the hypothesis of the set generation principle for the B which was written Q*,
we have a proof of (B ⇔ ∃C). This implies ¬(By, 0),
and, coming by second-order universal introduction, remains valid in any
universe where it is interpreted.
For any x, the value C(x)
of B on the predicate (y ↦ (y = x)),
can only differ (be true) from its (false) value on (y ↦ 0),
if both predicates differ inside E, i.e. if x
belongs to E : C(x) ⇔
((By, y = x) ⇎ (By, 0))
⇒ (∃y∈E, y = x ⇎ 0) ⇔ x
∈ E
This inclusion of C in E shows it is essentially a set:
- C is a set in the above "strong" sense (C ⊂ U) ;
- Moreover E is a set in the
sense of first-order logic (1.D): the formula of B
only has definite, fixed means (variables bound to given sets,
fixed parameters) to provide its elements. ∎
For any class satisfying the condition of the set generation principle
(being indirectly as usable as a set in the role of domain of
quantifiers), is it also indirectly as usable as a set in the role
of domains of functions (before using this principle) ? Namely, is
there for each such class a fixed formalization (bounded formulas
with limited complexity) playing the roles of the definer and
the evaluator for functions having this class as domains ? The
answer would be yes, but we shall not develop the justifications here.
Concrete examples
A set: Is there any dodo left on Mauritius ? As this
island is well known and regularly visited since their supposed
disappearance, no surviving dodos could still have gone unnoticed,
wherever they may hide. Having not found any, we can conclude
there are none. This question, expressed by a bounded quantifier,
has an effective sense and an observable answer.
A set resembling a class: Bertrand Russell raised this
argument about theology: «If I were to suggest that between
the Earth and Mars there is a china teapot revolving about the
sun..., nobody would be able to disprove my assertion [as] the
teapot is too small to be revealed even by our most powerful
telescopes. But if I were to go on to say that, since my assertion
cannot be disproved, it is intolerable presumption on the part of
human reason to doubt it, I should rightly be thought to be
talking nonsense.» The question is clear, but on a too large space,
making the answer practically inaccessible. (An 8m telescope has
a resolution power of 0.1 arcsec, that is 200m on the moon surface)
A class: the extended statement, «there is a teapot orbiting
some star in the universe» loses all meaning: not only the size of
the universe is unknown, but Relativity theory
sees the remote events from which we did not receive light yet, as
not having really happened for us yet either.
A meta-object: how could God «exist», if He is a
meta-object, while «existence» can only qualify
objects? Did apologists properly conceive their own thesis
on God's «existence» ? But what are the objects
of their faith and their worship ? Each monotheism rightly
accuses each other of only worshiping objects (sin of idolatry):
books, stories, beliefs, teachings, ideas, attitudes, feelings,
places, events, miracles, healings, errors, sufferings, diseases,
accidents, natural disasters (declared God's Will), hardly more
subtle than old statues, not seriously checking (by fear of God)
any hints of their supposed divinity.
A universal event: the redemptive sacrifice of the Son of
God. Whether it would have been theologically equivalent for it to
have taken place not on Earth but in another galaxy or in God's
plans for the Earth in year 3,456, remains unclear.
Another set reduced to a class... The class F of
girls remains incompletely represented by sets: the set of those
present at that place and day, those using this dating site and
whose parameters meet such and such criteria, etc. Consider the
predicates B of beauty in my taste, and C of
suitability of a relationship with me. When I try to explain that
«I can hardly find any pretty girl in my taste (and they are often
unavailable anyway)», i.e.
(∀F x, C(x)
⇒ B(x)) ∧ {x∈ F | B(x)}≈Ø,
the common reaction is: «Do you think that beauty is the only thing
that matters ?», i.e.
What,(∀x ∈ F, C(x)
⇔ B(x)) ????
then «If you find a pretty girl but stupid or with bad character,
what will you do ?». Formally : (∃x ∈ F, B(x)
⇏ C(x) !!!). And to conclude with a claim of pure
goodness: «I am sure you will find !», that is (∃ plenty of x ∈ F,
C(x)). Not forgetting the necessary
condition to achieve this: «You must change your way of thinking».
... by the absence of God...: F would have immediately
turned into a set by the existence of anybody on Earth able to
receive a message from God, as He would obviously have used this
chance to make him email me the address my future wife (or the other
way round).
... and of any substitute: a free, open and efficient online
dating system, as would be included in my project trust-forum.net, could give the
same result. But this requires finding programmers willing to
implement it. But the class of programmers is not a set either,
especially as the purpose of the project would conflict with the
religious moral priority of saving God's job from competitors
so as to preserve His salary of praise.
2.C. Concepts of truth in mathematics
Let us recall and expand on the different concepts of «truth» we saw for mathematical statements,
from the simplest to the most subtle.
The most basic was the relative truth (further analyzed in 1.B), that is
the Boolean value of any statement expressed in
any given theory, depending on the system (model of the theory) where it is interpreted;
this system is supposedly fixed like an implicit free variable.
More generally, any formula takes a Boolean value for fixed
values of its free variables in that system.
Then comes provability
in any given first-order axiomatic theory ; this coincides with the quality of being
relatively true in all its models.
Finally, come the truths in two realistic theories : arithmetic and set theory. Let us review them
in more details.
Arithmetic truths
As explained in 1.C, the
concept of provability in any explicit first-order theory, and more generally any effective
"provability" concept (predicate) over the class of statements of any conceivable theory, must
be a FOT-existential
predicate, equivalent (through some encoding) with some
existential predicate of arithmetic. But we need to refer to the realistic truth of
arithmetic to interpret these. This is an ideal but clear concept (independent of
any ontological hypothesis about infinities), meaning the class of properties of "the
standard model ℕ of arithmetic" (to say in short, as all standard models of arithmetic
are identical copies of each other).
Indeed all standard objects of a FOT being finite can be accepted as real (no matter how big),
while their distinction with non-standard objects (pseudo-finite ones) is also real despite being
not formalizable.
The meaningfulness of statements can be understood first for bounded formulas, then those
with 1 open quantifier, such as provability or consistency statements, whose objects are mere
finite systems (proofs). And so on for each additional open quantifier over a previously accepted
formula, even if their meaningfulness becomes more and more subtle.
Yet the incompleteness theorem showed that this realistic truth predicate of arithmetic
is not itself an existential predicate.
More precisely, even the negation of the provability predicate of any explicit theory
able to express arithmetic, cannot be itself an existential predicate.
Still we need partial solutions: existential classes of statements of arithmetic with
both qualities
- FOT-sound (included in the class of realistic truths);
- Large (compared by inclusion with other such existential classes).
The natural way to progress in the endless and non-algorithmic quest for larger and larger
such classes without breaking FOT-soundness, consists in searching for stronger
and stronger axiomatic set theories. Let us comment this further.
More concepts of strength
Any axiomatic set theory (or other foundational theory) T is reflected by
2 classes of arithmetical statements :
- The class of its arithmetical theorems ;
- The statement of its consistency.
Both should not be confused :
- 1. cannot contain 2. (unless 2. is false), though 2. must be true if 1. is valid. This is because
1. is only made of the theorems of T with a standard proof, while 2. means to forbid the current
universe from containing even a non-standard proof (with pseudo-finite length) from T that would
lead to contradiction.
- So, 2. means the existence, in the current universe, of some model of T but with no
requirement of standardness ; while 1. means the current universe agrees with T on arithmetic,
and thus can only be philosophically justified by the idea that T is valid in the sense of having a
FOT-standard model.
They lead to two other conceptions (definitions) of a "strength" preorder between foundational
theories, most often equivalent to our first one of 1.A (possible cases of
non-equivalence will not be considered here) :
- The inclusion order between their classes of arithmetical theorems ;
- The deducibility (provable implication) between consistency statements:
T' is stronger than T if the consistency of T is deducible
from the consistency of T'; then T' is "strictly stronger" than
T if the consistency of T is a theorem of T'.
(1. would be related to the implication order between statements of FOT-soundness, except that
the expression of such statements requires a framework strong enough to express
arithmetical truth, such as MT).
If T' can prove the existence of a standard model of T
then T' is strictly stronger than T in both ways.
Set theory from realism to axiomatization
For an axiomatic set theory to give a class of arithmetical theorems both
FOT-sound and large, it needs to be "not wrong and yet very good" :
- Sound = keeps accepting some standard universes (for FOT-soundness);
- Strong = only accepts "large" universes (truly large in the standard
case, otherwise anyway looking large by the description of their hierarchy of
sub-universes), to contribute to a large class of theorems.
Among sound, well-described set theories (or other foundational theories) there can be no
strongest one : from any of them T we can get stronger ones, at least arithmetically, in
the following ways roughly ordered by increasing power (where T0 is some possibly
weaker theory but which fits the below "open" quality):
- T + the consistency of T;
- T0 + the existence of a standard universe of T ;
- T0 + the existence of infinitely many standard universes of T
forming a standard multiverse (actually only ensured to be quasi-standard).
And these are only the first of an open-ended range of possible methods with growing
efficiency in their strengthening effect with respect to the added complexity of description.
It actually turns out that the Replacement schema, used in ZF set theory,
amounts to the use of a much more powerful
strengthening method than these.
Arguments to justify any strong axiomatic set theory like this as a valid foundation of
mathematics, must remain somewhat philosophical, and thus assessed in intuitive,
not completely formalizable ways, since their point is to do better than any formal proof
or other predefined algorithm: no fixed formal method can always witness for any strong
theory when it is consistent, even less when it is FOT-sound (by truth undefinability), or
sound (the idea that some given axiom which excludes one size of standard universe still accepts
some larger ones).
The usefulness of strong axiomatic set theories for proving large classes
of arithmetical truths (insofar as philosophical disputes on the ontological status
of the ideals motivating these theories do not undermine the role of these
ideals as reasons for the confidence in the FOT-soundness), can then be
read as an indispensability argument for the reality of the universes so described,
beyond the infinity of ℕ.
Indeed, it makes these theories "valuable", while their consistency together
with the mere existence of ℕ, ensure the existence of models. Only non-standard
models are so ensured to exist, but these work similarly to the standard ones
which were ideally meant anyway.
Hence our last concept of truth, which is the truth of set theoretical statements.
Axioms compatibility condition
Given several axiomatic set theories with both qualities (sound and strong),
consider the theory obtained as their union (the union of their
classes of axioms if they have the same symbols, for example if all only use ∈, like
ZF). It will be stronger than each, but is it still sound ?
Let us qualify a collection of set theories (or their axioms) as compatible
if their union (conjunction) remains sound.
So, beyond the soundness and strength of axiomatic set theories, we need an
additional quality requirement for their axioms, so that, without limiting the
strength of sound set theories which can be so written, all such theories will be compatible.
Actually this problem has a natural solution. This needed additional quality is another
ideal concept, not completely formalizable, so let us first express it intuitively :
- An axiom is open if satisfied by any open universe, i.e. where eternity
is a very long time especially towards the end.
This is meant as more specific than the mere "eternity is a very long time" which rephrases
the quality "large" for universes, and as deepening our previous concept of being open. Let us explain it in several steps.
First, no axiom should put any limit on the size
of the universe, for compatibility with set theories stronger than this size (for example, a
sound theory T accepting only one standard universe would be incompatible with
the statement of existence of a standard universe of T).
Then, the remaining risk for several statements to be incompatible, is if the values
of at least 2 of them endlessly vary (alternating truth and falsity) as the universe expands,
in such a way that no standard universe "larger than some size" can fit all (their
conjunction would limit the size of the universe). This is resolved by only accepting "open"
axioms, which will all agree on "open" standard universes. Let us explain how.
Consider such a variable statement : written in prenex form it must be using both
kinds of open quantifiers. Let us analyze the case of statements with only 2 open
quantifiers: ∃x, ∀y, A(x,y), among which the statements
S(C) that some
class is a set. (From this, the case ∀x, ∃y is deduced by negation, while
cases with more open quantifiers would be trickier and will not be discussed here). For any
standard multiverse where it indeed endlessly varies, such a statement turns out to be false
in its union. So, taking such a statement as an axiom would be inappropriate for this multiverse.
Yet, what really matters is the behavior of the class of all standard universes (there
is no systematic way to determine this, but it is our intended ideal). Now the point of a
statement being "open", is that it reflects the truth in the union of a multiverse which
behaves like the intended class of all standard universes.
Such "open" universes, with the same properties as the union of a standard
multiverse behaving like the class of "all standard universes", can be intuitively
described as "much larger than any smaller one" : there must be no limit to how
bigger the universe is from any sub-universe (described by a weaker theory).
This also avoids the inefficiency
that could be involved in the opposite case, of an axiomatic system likely
to be significantly more complex but only slightly stronger than another one.
The quest for axiomatic set theories approaching all 3 ideally conceived qualities
(strong and open but still sound) aiming to select universes with the
corresponding 3 qualities (large and open but still existing among standard ones),
is endless. Fortunately, rather simple set theories such as ZF, already satisfy these
qualities to a high degree, describing much larger realities than
usually needed. This is how a Platonic view of set theory (seeing the
universe of all mathematical objects as a fixed, exhaustive reality) can
work as a good approximation, though it cannot be an exact, absolute fact.
Alternative logical frameworks
The description we made of the foundations of mathematics
(first-order logic and set theory), is essentially just an equivalent clarified
expression of (or other way to introduce) the same version of mathematics as
widely accepted. Other logical
frameworks already mentioned, to be developed later, are still in the "same
family" of "classical mathematics". But other, more radically different frameworks
(concepts of logic and/or sets), called non-classical logic, might be considered.
Examples:
- Some logicians developed the «intuitionistic logic», which
lets formulas keep a possible indefiniteness as we mentioned for
open quantifiers, but treated as a modification of the pure
Boolean logic (the rejection of the excluded middle, where ¬(¬A)
does not imply A), without any special mention of
quantifiers as the source of this uncertainty. Or it might be
seen as a formal confusion between truth and provability. In
this framework, {0}∪ ]0,1] ⊂ [0,1], without equality. I could
not personally find any interest in this formalism but only
heard that theoretical computer scientists found it useful.
- When studying measure
theory (which mathematically defines probabilities in
infinite sets), I was inspired to interpret its results as
simpler statements on another concept of set, with the following
intuitive property. Let x be a variable randomly
taken in [0,1], by successively flipping a coin for each of its
(infinity of) binary digits. Let E be the domain of x,
set of all random numbers in [0,1]. It is nonempty
because such random numbers can be produced. Now another
similarly random number, with the same range (y ∈ E)
but produced independently of x, does not have any
chance to be equal to x. Thus, ∀x∈E, ∀y∈E,
x ≠ y. This way, x ∈ E
is no more always equivalent to ∃y∈E, x = y.
We will ignore such alternatives in the rest of this work.