1. First foundations of mathematics
1.1. Introduction to the foundations of mathematics
Mathematics and theories
Mathematics is the study of systems of elementary objects, whose only considered
nature is to be exact, unambiguous (two objects are equal or different, related or not;
an operation gives an exact result...). Such systems are conceived independently
of our usual world, even if many of them can resemble (thus be used to describe)
diverse aspects of it. Mathematics as a whole can be seen as «the science
of all possible worlds» of this kind (of exact objects).
Mathematics is split into diverse branches, implicit or explicit frameworks of any
mathematical work, which may be formalized as (axiomatic) theories. Each
theory is the study of a supposedly fixed system that is its world of
objects, called its model. But each model of a theory may be just one of its
possible interpretations, among other equally legitimate models. For example,
roughly speaking, all sheets of paper are systems of material points, models
of the same theory of Euclidean plane geometry, but independent of each other.
Foundations and developments
The content of a theory, describing its model(s), is made of
components which are pieces of description (concepts and information,
described in 1.3). A theory starts with a choice of foundation made of a
logical framework and an initial version of its content (hopefully rather small, or at
least simply describable). The components of this initial version are qualified as
primitive.
The study of the theory progresses by choosing some of its possible
developments: new components resulting from its current content, which
can be added to it to form its next content. These different contents, having
the same meaning (describing essentially the same models), play the role of
different presentations of the same theory.
Any other possible development (not yet chosen) can still be added later, as the
part of the foundation that could generate it remains. Thus, the totality of
possible developments of a theory, independent of the order chosen to process
them, already forms a kind of «reality» that these developments explore.
To express the properties of its models, the content of a theory includes a list of
statements, which are formulas meant as true when interpreted in any model.
Primitive statements are called axioms. Further statements, called theorems,
are added to the content by development, under the condition that they are proven
(deduced) from previous ones: this ensures that they are true in all models,
provided that the previous ones were. Theorems can then be used in further
developments in the same way as axioms. Other kinds of developments
(definitions and constructions), which add other components beyond statements,
will be described in 4.8 and 4.9.
A theory is consistent if its theorems can never contradict each other.
Inconsistent theories cannot have any model, as the same statement cannot be
true and false of the same system.
The Completeness theorem (4.6) will show that the range of all possible
theorems precisely reflects the more interesting reality of the diversity
of models, which indeed exist for any consistent theory.
There are possible hierarchies between theories, where some can
play a foundational role for others. For instance, the foundations of
several theories may have a common part forming a simpler theory,
whose developments are applicable to all.
A fundamental work
is to develop, from a simple initial basis, a convenient body of knowledge
to serve as a more complete "foundation", endowed with efficient tools
opening more direct ways to further interesting developments.
The cycle of foundations
Despite the simplicity of the nature of mathematical objects, the general
foundation of all mathematics turns out to be quite complex
(though not as bad as a physics theory of everything). Indeed,
it is itself a mathematical study, thus a branch of mathematics,
called mathematical logic.
Like any other branch, it is made of definitions and theorems
about systems of objects. But as its object is the general form
of theories and systems they may describe, it provides the general
framework of all branches of mathematics... including itself.
And as it has to provide the framework or foundation of each considered
foundation (unlike ordinary mathematical works, which go forward from
an assumed foundation), it does not look like a precise starting point,
but like a sort of wide cycle composed of easier and harder steps. Still this
cycle of foundations truly plays a foundational role for mathematics, providing
rigorous frameworks and many useful concepts to diverse branches of
mathematics (tools, inspirations and answers to diverse philosophical questions).
(This is similar to dictionaries defining each word by other words, or to
another science of finite systems: computer programming. Indeed computers
can be simply used, knowing what you do but not why it works; their working
is based on software that was written in some language, then compiled by
other software, and on the hardware and processor whose design and production
were computer assisted. And this is much better than at the birth of this field.)
It is dominated by two theories:

Set theory describes the universe of «all mathematical
objects», from the simplest to the most complex such as infinite
systems (in a finite language). It can roughly be seen as one theory,
but in detail it will have an endless diversity of possible variants
(indeed differing from each other).

Model theory is the study of theories (their formalisms as systems
of symbols) and of systems (possible models of theories). Proof theory
completes this by describing formal systems of rules of proofs. While these
are usually meant as general topics (admitting variants of concepts), the
combination of both can be specified into precise versions (mathematical
theories) called logical frameworks, each giving a precise format of
expression for a wide range of possible theories, and a format in which all
proofs in any of these theories can in principle be expressed. There is an
essentially unique preferable logical framework, called first-order logic,
by which the concepts of theory, theorem (as provable statement) and
consistency of each theory find their natural mathematical definitions.
Each one is the natural framework to formalize the other: each set theory is formalized
as a theory described by model theory; the latter comes better as a development from
set theory (defining theories and systems as complex objects) than directly as a theory.
Both connections must be considered separately: the two roles of set theory, as a
basis and as an object of study for model theory, must be distinguished. But these
formalizations will take a long work to complete.
Starting mathematics is a matter of introducing some simple concepts from
the founding cycle, which may seem as self-sufficient as possible (while they
cannot be absolutely so). A usual and natural solution is to start with a set theory
not fully formalized as an axiomatic theory. 1.2 will do this very shortly,
intuitively explaining the concepts of set and function. Then
1.3 will start introducing the main picture of foundations (model theory), by
which set theory can be formalized, with its main subtleties (paradoxes).
1.2. Variables, sets, functions and operations
Constants
A constant symbol is a symbol seen as denoting a unique object,
called its value. Examples: 3, ⌀, ℕ. Those of the English
language usually take the form of proper names and names with «the»
(singular, without complement).
Free and bound variables
A variable symbol (or a variable) is a symbol which, instead of
having an a priori definite value, comes with the concept of its possible
values, or possible interpretations as taking a particular value. Each possibility
gives it the role of a constant. These possible values may as well
be infinitely many, or only one, or even none.
It can be understood as limited by a box, whose inside has multiple
versions in parallel, articulating different viewpoints over it:
- The variable is called fixed when seen «from inside», which means it has a given value, and is thus usable as a constant.
- It is called bound when seen from the «outside», where the diversity of its possible values is considered fully known, gathered and processed as a whole.
- It is called free to describe a coexistence of both statuses (views over it): a local view seeing it as fixed, and an external view giving the context of its variations.
More precisely, with respect to given theories, fixing a variable
means taking a free variable of a theory and durably ignoring its
variability, thereby simulating the use of the other theory obtained by holding
this symbol as a constant.
The diverse «internal viewpoints», corresponding to each possible
value seen as fixed, may be thought of as abstract «locations» in the
mathematical universe, while the succession of views over a symbol
(qualifying it as a constant, a free variable or a bound variable), can be
seen as a first expression of the flow of time in mathematics: a
variable is bound when all the diverse "parallel locations inside
the box" (possible values) are past. All these places and
times are themselves purely abstract, mathematical entities.
Ranges and sets
The range of a variable is the meaning it takes when seen
as bound: it is the «knowledge» of its considered totality of possible or
authorized values (seen in bulk: unordered, ignoring their context),
which are called the elements of this range. This «knowledge» is
an abstract entity that can actually process (encompass) infinities of
objects, unlike human thought. Any range of a variable is called a set.
A variable has a range when it can be bound, i.e. when an encompassing view over all its
possible values is given. Not all variables of set theory will have a range. A variable without
a range can still be free, which is then no longer an intermediate status between fixed and bound,
but means it can take some values or some other values with no claim of exhaustivity.
Cantor defined a set as a «gathering M of definite and separate
objects of our intuition or our thought (which are called the "elements" of M) into a whole».
He explained to Dedekind: «If the totality of elements of a multiplicity can be
thought of... as "existing together", so that they can be gathered
into "one thing", I call it a consistent multiplicity or a "set".» (We expressed this
"multiplicity" as that of the values of a variable.)
He described the opposite case as an «inconsistent multiplicity»
where «admitting a coexistence of all its elements leads to a
contradiction». But non-contradiction cannot suffice to
generally define sets: the consistency of a statement does not
imply its truth (i.e. its negation may be true but unprovable);
facts of non-contradiction are often themselves unprovable (incompleteness theorem);
and two separately consistent coexistences might contradict each
other (Irresistible force paradox / Omnipotence paradox).
A variable is said to range over a set when it is bound
with this set as its range. Any number of variables can be
introduced ranging over a given set, independently of each other and of
other variables.
Systematically renaming a bound variable throughout its box, into
another symbol not used in the same context (same box), with the
same range, does not change the meaning of the whole. In practice,
the same letter can represent several separate bound variables (with
separate boxes), which can take different values without conflict, as
no two of them are anywhere free together, where their values could be compared.
Ordinary language does this continuously, using very few variable
symbols («he», «she», «it»...).
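These points can be illustrated in a programming language. In the Python sketch below (the variable names are ours), each list comprehension is a «box» binding its own variable, so the same letter can be reused without conflict, and systematically renaming a bound variable leaves the value unchanged.

```python
# Two separate «boxes»: each comprehension binds its own variable x.
evens = [2 * x for x in range(5)]       # one bound x
odds = [2 * x + 1 for x in range(5)]    # an unrelated bound x, no conflict

# Systematically renaming a bound variable throughout its box:
evens_renamed = [2 * y for y in range(5)]

print(evens)                   # [0, 2, 4, 6, 8]
print(evens == evens_renamed)  # True: same range, same expression, only the name differs
```

The two occurrences of x above are never free together, so no comparison of their values ever arises, exactly as described for separate boxes.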
Functions
A function is an object f made of the following data:
- A set called the domain of f, denoted Dom f;
- For each element x of Dom f, an object written f(x), called the image of x by f, or the value of f at x.
In other words, it is an entity behaving as a variable whose value
is determined by that of another variable, called its argument, with
range Dom f: whenever its argument is fixed (gets a name, here
"x", and a value in Dom f), f also becomes fixed, written
f(x). This actually amounts to conceiving a variable f where
the "possible views" on it as fixed are treated as objects x conceptually
distinct from the resulting values of f.
As we shall see later, such an entity (dependent variable) f would not be
(viewable as) a definite object of set theory if its argument had no range, i.e.
could not be bound (it would only be a metaobject, or object of model theory,
that we shall call a functor in 1.4).
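This pair of data (a domain, and an image for each of its elements) can be sketched in Python; the class name and error message below are our own choices, not part of the theory:

```python
# A minimal sketch of a function as the data of a domain and an image
# for each element of that domain.
class Function:
    def __init__(self, domain, rule):
        self.domain = frozenset(domain)   # Dom f
        self._rule = rule                 # gives the image f(x) of each x in Dom f

    def __call__(self, x):                # evaluation, written f(x)
        if x not in self.domain:
            raise ValueError("argument outside Dom f")
        return self._rule(x)

f = Function({0, 1, 2}, lambda x: x * x)
print(f.domain)  # Dom f
print(f(2))      # the image of 2 by f: 4
```

Evaluation is only legitimate at elements of the domain, matching the definition: f(x) is defined for each element x of Dom f and for nothing else.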
Operations
The notion of operation generalizes that of function, by
admitting a finite list of arguments (variables with given
respective ranges) instead of one. So, an operation gives a result (a value)
when all its arguments are fixed. The number n of arguments
of an operation is called its arity; the operation is
called n-ary. It is called nullary if n=0
(it is a constant), unary if n=1 (it is a function),
binary if n=2, ternary if n=3...
Nullary operations are useless as their role is already played by their
unique value; 2.1 will show how to construct those with arity
> 1 by means of functions.
The value of a binary operation f on its fixed arguments
named x and y (i.e. its value when its arguments are
assigned the fixed values of x and y) is denoted
f(x,y).
Generally, instead of names, the arguments are pictured by places around
the symbol, namely the left and right spaces in parentheses, to be filled by
any expression giving them the desired values.
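The announced reduction of operations with arity > 1 to functions (to be formalized in 2.1) can be sketched by currying. This Python illustration, with names of our choosing, is only an analogy, not the formalization itself:

```python
def add(x, y):        # a binary operation: a value once both arguments are fixed
    return x + y

def curried_add(x):   # a function whose values are themselves functions
    return lambda y: x + y

print(add(3, 4))          # 7
print(curried_add(3)(4))  # 7: the same value, reached through functions only
```

Fixing the first argument of the binary operation yields a function of the remaining argument, so a chain of unary functions can play the role of an operation of any arity.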
An urelement (pure element) is an object not playing any other role
than that of element: it is neither a set nor a function nor an operation.
1.3. Form of theories: notions, objects and metaobjects
The variability of the model
Each theory describes its model as a fixed system.
But from the larger viewpoint of model theory, this is a mere «choice» of one possible
model (interpretation) in a wide (usually infinite) range of other existing, equally legitimate
models of the same theory. Now this fixation of the model, like the fixation of any variable,
is but the elementary act of picking any possibility, ignoring any issue of how to specify
an example in this range. Actually, this «choice» and this «existence» of models can be quite
abstract. In detail, the proof
of the Completeness theorem will effectively «specify» a model of any consistent
theory in the general case, but its construction will involve an infinity of steps, where
each step depends on an infinite knowledge. Regardless of this difficulty, the attitude of
implicitly fixing a model when formally studying any mathematical theory remains the
standard way of interpreting it (except somehow for set theory, as explained in 1.D).
Notions and objects
Each theory has its own list of notions, usually designated
by common names, formally serving as the kinds of variables it can use;
each model interprets each notion as a set that is the common
range of all variables of this kind. For example, Euclidean geometry
has the notions of «point», «straight line», «circle» and more, and is usually
expressed using a different style of variable symbol for each. The
objects of a theory in a model are all the possible values
of its variables of all kinds (the elements of all its notions) in this model.
One-model theory
Any discussion on several theories T and systems M that may be
models of those T takes place in model theory, with its
notions of «theory» and «system», which are the respective kinds of the variables
T and M. But when focusing on one theory with a fixed model, the
variables T and M, now fixed, disappear from the list of variables.
Their kinds, the notions of theory and model, disappear from the notions list too.
This reduces the framework from model theory to that of one-model theory.
A model of one-model theory is a system
[T,M] which combines a theory T with a model
M of T.
The diversity of logical frameworks
The role of a logical framework, as a precise version of (one-)model theory with its associated
proof theory, is to describe:
- The admissible forms of contents for theories;
- In particular, the syntactic structures of possible statements and other expressions, which can be called their "grammar";
- The meaning of these contents and expressions on the models;
- The rules of development of theories.
Here are those we shall see, roughly ordered from the poorest to the most expressive
(though the order depends on the ways to relate them):
- Boolean algebra, also called propositional calculus (1.6);
- Algebra;
- First-order logic;
- Duality (for geometry) and the tensor formalism for linear algebra;
- Second-order logic (5.1, 5.2);
- Higher-order logic (5.2);
- Set theory.
We shall first describe the two main ones in parallel. First-order logic is the
standard version of model theory, describing first-order theories, which we shall
also call here generic theories. Set theory, which can encompass all
other theories, can also encompass logical frameworks and thus serve itself as
the ultimate logical framework, as will be explained in 1.7.
Most frameworks manage notions as types
(usually in finite number for each theory) classifying both variables and
objects. Notions are called types if each object belongs to only one of them, which is then also
called the type of the variables that can name it. For example, an object of Euclidean
geometry may be either a point or a straight line, but the same
object cannot be both a point and a straight line. But set theory will need
more notions beyond types: classes, which will be introduced in 1.7.
Examples of notions from various theories
Theory            | Kinds of objects (notions)
Generic theories  | Urelements classified by types to play different roles
Set theory        | Elements, sets, functions, operations, relations, tuples...
Model theory      | Theories, systems and their components (listed below)
One-model theory  | Objects, symbols, types or other notions, Booleans, structures (operators, predicates), expressions (terms, formulas)...
Arithmetic        | Natural numbers
Linear Algebra    | Vectors, scalars...
Geometry          | Points, straight lines, circles...
Metaobjects
The notions of a one-model theory T_{1}, normally
interpreted in [T,M], classify the components of T
(«type», «symbol», «formula»...), and those of M («object»,
and the means to interpret components and expressions of T there).
But the same notions (even if from a different logical framework) can be interpreted in
[T_{1}, [T,M]], by putting the prefix meta on them.
By its notion of «object», one-model theory distinguishes the
objects of T in M among its own objects in [T,M],
which are the metaobjects. The above rule of use of the meta prefix
would let every object be a metaobject; but we will make a vocabulary
exception by only calling metaobjects those which are not objects:
symbols, types or other notions, Booleans, structures, expressions...
Set theory only knows the ranges of some of its own variables, seen
as objects (sets). But, as seen by one-model theory, every variable of
a theory has a range among notions, which are metaobjects only.
Components of theories
In a given logical framework, the content of a theory consists of 3
lists of components of the following kinds, where those of each of the latter
two kinds are finite systems using those of the previous kind:
- A list of abstract types, names in the theory that will designate the types in each system;
- A language (vocabulary): a list of structure symbols, names of the structures that will form the described system by relating objects with given types (1.4);
- A list of axioms chosen among the statements expressible with this language (1.5).
1.4. Structures of mathematical systems
The structures, interpreting the language of a theory, relate the objects
of diverse types, giving the objects of each type their roles with respect to
those of other types, to form the studied system. According to these roles,
objects may be thought of as complex objects, in spite of otherwise having
no more nature than urelements.
The kinds of structures (and thus the kinds of structure symbols)
allowed in first-order theories, thus called first-order structures,
will be classified into operators and predicates. We shall
describe them as operations designated by structure symbols in a
set-theoretical interpretation. More powerful structures, called second-order
structures, will be introduced in 5.1, coming from set-theoretical tools or as
packs of an additional type with first-order structures.
Set-theoretical interpretations
Any generic theory can be interpreted (inserted, translated) in set theory
by converting its components into components of set theory. This is the usual
view of ordinary mathematics, studying many systems as «sets with relations or
operations such that...», with possible connections between these systems.
Let us introduce both the generic interpretations applicable to any generic
theory, and other ones usually preferred for particular theories.
Any interpretation converts each abstract type into a symbol (name) designating
a set called the interpreted type (serving as the range of variables of that type,
whose use is otherwise left intact). This symbol is usually a fixed variable in the
generic case, but can be accepted as a constant symbol of set theory in special
cases such as number systems (ℕ, ℝ...).
In generic interpretations, all objects (elements of interpreted types) are
urelements, but other kinds of interpretations called standard
by convention for specific theories may do otherwise.
For example, standard interpretations of geometry represent points by
urelements, but represent straight lines by sets of points.
Generic interpretations will also convert structure symbols into fixed
variables (while standard ones may use the language of set theory to
define them). Any choice of fixed values of all types and structure symbols
defines a choice of model. Models become objects of set theory, owing their
multiplicity to the variability of types and structure symbols. This integrates
all needed theories into the same set theory, while gathering representatives
of all their considered models inside a common model of set theory. This is
why a model of set theory is called a universe. When adopting set
theory as our conceptual framework, this concept of "interpretation" becomes
synonymous with the choice (designation) of a model.
First-order structures
An operator is an operation between interpreted types.
On the side of the theory before interpretation, each operator symbol
comes with its symbol type, made of:
- its list of arguments (variable symbols figured as places around the operator symbol instead of names);
- for each argument, its abstract type, whose value as a set will be the range of this argument in any interpretation;
- its type of results, the type which will contain all results of the operation it will designate in any interpretation with given values of its arguments.
In a theory with only one type, this data is reduced to the arity.
The constant symbols (or constants) of a theory
are its nullary operator symbols (having no argument).
Unary operators (which are functions) will be called here functors.
The list of types is completed by the Boolean type, interpreted as the set
of two elements we shall denote 1 for «true» and 0 for «false». A variable of this
type (outside the theory) is called a Boolean variable.
A paraoperator is a generalized operator allowing the
Boolean type among its types of arguments and results.
A (logical) connective is a paraoperator with only Boolean arguments
and values.
A predicate is a paraoperator with Boolean values, and at
least one argument but no Boolean argument.
As will be formalized in 2.4,
any n-ary operator f may be reduced to the (n+1)-ary predicate
(y = f(x_{1},...,x_{n})), true for a
unique value of y for any
chosen values of x_{1},...,x_{n}.
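For instance, in this Python sketch (the names f and p are ours), a binary operator is replaced by the ternary predicate expressing y = f(x1,x2), which indeed holds for exactly one y per choice of arguments:

```python
def f(x1, x2):        # a binary operator
    return x1 + x2

def p(x1, x2, y):     # the associated ternary predicate (y = f(x1, x2))
    return y == f(x1, x2)

print(p(2, 3, 5))   # True
print(p(2, 3, 6))   # False

# For chosen x1, x2, the predicate is true for a unique y:
print([y for y in range(10) if p(2, 3, y)])  # [5]
```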
Structures of set theory
Formalizing set theory means describing it as a theory with its notions,
structures and axioms. We shall admit 3 primitive notions: elements (all
objects), sets and functions. Their main primitive structures are introduced
below. Most other primitive symbols and axioms will be presented in 1.8, 1.10
and 1.11, in a dedicated logical framework, convertible into first-order logic by a
procedure also described in 1.10. Still more primitive components will be needed
and added later (2.1, 2.4, 2.5, 4.3). Optional ones, such as the axiom of
choice (2.10), will open a diversity of possible set theories.
This view of set theory as described by (one-)model theory relates the terminologies
of both theories in a different way than when interpreting generic theories in set theory.
As the set-theoretical notions (sets, functions...) need to keep their natural names when
defined by this formalization, it would become incorrect to keep that terminology
for their use in the sense of the previous link (where notions were "sets" and
operators were "operations"). To avoid confusion, let us here only use the model-theoretical
notions as our conceptual framework, ignoring their set-theoretical interpretations.
We shall describe in 1.7 how both links can be put together, and
how both ways to conceive the same theories (describing them by model
theory or using a set theoretical interpretation) can be reconciled.
One aspect of the role of sets is given by the binary predicate ∈ of belonging:
for any element x and any set E, we say that x is in E
(or x belongs to E, or x is an element of E, or
E contains x), and write x ∈ E, to mean that
x is a possible value of the variables with range E.
Functions f play their role by two operators: the domain
functor Dom, and the function evaluator, binary operator
that is implicit in the notation f(x), with arguments
f and x, giving the value of any function f
at any element x of Dom f.
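As a toy illustration (a Python sketch; the names belongs, dom and ev are ours, not part of the theory), these primitive structures can be mimicked on functions represented by finite tables:

```python
f = {0: 1, 1: 2, 2: 4}   # a function given by its finite table

def belongs(x, E):       # the binary predicate x ∈ E
    return x in E

def dom(f):              # the domain functor Dom
    return set(f)

def ev(f, x):            # the function evaluator, written f(x) in the theory
    assert belongs(x, dom(f)), "x must belong to Dom f"
    return f[x]

print(dom(f))    # {0, 1, 2}
print(ev(f, 2))  # 4
```

The evaluator is indeed binary: its arguments are the function f and an element x of Dom f, exactly as in the text.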
About ZFC set theory
The Zermelo-Fraenkel set theory (ZF, or ZFC with the axiom of choice)
is a generic theory with only one type «set», one structure symbol ∈,
and axioms. It implicitly assumes that every object is a set, and thus a set
of sets and so on, built over the empty set.
As a rather simply expressible
but very powerful set theory for an enlarged founding cycle, it can be a good
choice indeed for specialists of mathematical logic to conveniently prove diverse
difficult foundational theorems, such as the unprovability of some statements,
while giving them a scope that is arguably among the best conceivable ones.
But despite the habit of authors of basic math courses to conceive their
presentation of set theory as a popularized or implicit version of ZF(C), it is
actually not an ideal reference for a start of mathematics for beginners:
- It cannot be self-contained, as it must assume the framework of model theory to make sense.
- Its axioms, usually just admitted (as either intuitive, obvious, necessary or just historically selected for their consistency and the convenience of their consequences), would actually deserve more subtle and complex justifications, which cannot find place at a starting point.
- Ordinary mathematics, using many objects usually not seen as sets, is only inelegantly developed from this basis. As the roles of all needed objects can anyway be indirectly played by sets, these objects did not require another formalization, but remained cases of discrepancy between the «theory» and the practice of mathematics.
The complexity and weirdness of these needed developments
do not disturb specialists, just because, once known to be possible, they
can simply be taken for granted.
Formalizing types and structures as objects of one-model theory
To formalize one-model theory through the use of the meta prefix,
both metanotions of "types" and "structures" are given their roles by
metastructures as follows.
Since one-model theory assumes a fixed model, it only needs one
metatype of "types" to play both roles of abstract types (in the theory)
and interpreted types (components of the model), respectively given by
two metafunctors: one from variables to types, and one from objects
to types. Indeed, the more general notion of «set of objects» is not used
and can be ignored.
But the metanotion of structure will have to remain distinct from the language,
because more structures beyond those named in the language will be involved (1.5).
Structures will get their roles as operations from metastructures similar to
the function evaluator (see 3.13.2 for clues), while the
language (the set of structure symbols) will be interpreted there by a metafunctor from
structure symbols to structures.
However, this mere formalization would leave the range of this notion of structure
undetermined. Trying to conceive this range as that of
«all operations between interpreted types» would leave unknown the source
of knowledge of such a totality. This idea of totality will be formalized in set
theory as the powerset (2.5), but its meaning will still depend on the
universe where it is interpreted, far from our present concern for one-model theory.
1.5. Expressions and definable structures
Terms and formulas
Given the first two layers of a theory (a list of types and a language), an
expression (that is, either a term or a formula) is a finite system of occurrences
of symbols, where an occurrence of a symbol in an expression is a place where
that symbol is written (for example, the term « x+x » has two occurrences
of x and one of +).
Each expression comes in the context of a given list of available free variables.
In expressions of first-order theories and set theory, symbols of the following kinds may occur:
- Variables of each type:
  - Free variables, from the list of available ones;
  - Bound variables, whose occurrences are contained by binders (see 1.8);
- Paraoperator symbols:
  - Structure symbols from the language (operators and predicates);
  - One equality symbol per type (a predicate with 2 arguments of the same type), abusively all written =;
  - Logical connectives (1.4, listed in 1.6);
  - The conditional operator, which may be introduced for abbreviation (2.4);
- Binders (1.8):
  - Quantifiers ∀ and ∃ (1.9), the only primitive binders of first-order logic;
  - More binders will be introduced in set theory.
Any expression will give (define) a value (either an object or a Boolean)
for each possible data of:
- A system interpreting the given types and structure symbols;
- Fixed values of the available free variables in this system.
In first-order logic, let us call logical symbols the quantifiers and the symbols of
paraoperators outside the language (equality, connectives and the conditional operator):
their list and their meaning in each system are determined by the logical framework and
the given types list, which is why they are not listed as components of individual theories.
Let us sketch a more precise description (the case of expressions with only free
variables and operator symbols, called algebraic terms, will be formalized in
set theory in 4.1 for only one type).
Each expression contains a special occurrence of a symbol called its root,
while each other occurrence is the root of a unique subexpression
(another expression which we may call the subexpression of that occurrence).
The type of an expression, which will be the type of its values, is given
by the type of result of
its root. Expressions with Boolean type are formulas; others, whose type belongs
to the given types list, are terms (their values will be objects).
Expressions are built successively, in parallel between different lists of available free variables.
The first and simplest ones are made of just one symbol (a root having a value by
itself): constants and variables are the first terms; the
Boolean constants 1 and 0 are the simplest formulas.
The next expressions are then successively built as made of the following data:
- a choice of root, an occurrence of either a para-operator symbol (beyond the constants we already mentioned) or a binder;
- if the root is a binder: a choice of variable symbol, to be bound by it;
- a list of previously built expressions, whose format (number and types of entries) is determined by the root: for a para-operator symbol, this format is given by its list of arguments.
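This successive construction of expressions can be sketched as a small syntax-tree model in Python. This is our own illustration, not part of the text's formal development; the names Expr and display are ours:

```python
# A minimal sketch (our own illustration) of expressions as syntax trees:
# each node is an occurrence of a symbol (its root) applied to a list of
# previously built subexpressions whose number is fixed by the root's arity.

class Expr:
    def __init__(self, symbol, arity, subexprs=()):
        assert len(subexprs) == arity, "format is determined by the root"
        self.symbol, self.subexprs = symbol, list(subexprs)

    def display(self):
        # function-like (Polish-style) display: root then arguments
        if not self.subexprs:
            return self.symbol                  # constants and variables
        args = ",".join(e.display() for e in self.subexprs)
        return f"{self.symbol}({args})"

x = Expr("x", 0)                     # a variable: the simplest kind of term
y = Expr("y", 0)
s = Expr("+", 2, [x, y])             # root "+" with two subexpressions
p = Expr("^", 2, [s, Expr("n", 0)])  # the root of (x+y)^n is "^"
print(p.display())                   # ^(+(x,y),n)
```

Each occurrence of a symbol other than the root is here literally the root of a unique subexpression, matching the description above.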
Display conventions
The display of this list of subexpressions directly attached to the root requires a
choice of convention. For a para-operator symbol other than constants:
- Most binary para-operator symbols are displayed as one character between (separating) both arguments, as in x+y.
- Symbols with higher arities can be similarly displayed by several characters separating the entries, as in the addition x+y+z of 3 numbers.
- Function-like displays, such as +(x,y) instead of x+y, are more usual for arities other than 2; parentheses may be omitted when arities are known (Polish notation).

- A few symbols «appear» only implicitly, by their special way of putting their arguments together: multiplication in xy, exponentiation in x^{n}.

- Parentheses can be part of the notation of a symbol (function evaluator, tuples...).
Parentheses can also be used to distinguish (separate) subexpressions, and thus to distinguish the root of each expression from other occurring symbols. For example, the root of (x+y)^{n} is the exponentiation operator.
The role of axioms
An expression is ground if its list of available free variables is empty,
so that its value only depends on the system where it is interpreted.
In first-order logic, a statement is a ground formula (which may be true or false depending on the system).
The list of axioms of a theory is a set of statements, meant to state the truth of these statements in the intended model(s). Depending on intentions (discussed in 1.A), this may either describe a presumably given model or range of models, or define the range of accepted models as the class of all systems where all axioms are true, selected from the range of all possible systems structured by the given language (rejecting the others, where some axiom is false).
Variable structures
Usually, only a few objects are named by the constants of a given language. Any other object can be named by a fixed variable, whose status depends on the choice of theory used to see it:
- an ordinary variable symbol, usable by expressions which, by a binder, can let it range over all objects of its type;
- a new constant symbol, added to the language, forming another theory with a richer language.
The difference vanishes in generic interpretations, which turn constant symbols into variables (whose values define different models).
By similarity to constants which are particular structures (nullary operators),
the concept of variable can be generalized to that of variable structure.
But those beyond nullary operators (ordinary variables) fall outside the above list of symbols allowed in expressions. Still, some specific kinds of uses of variable structure symbols can be justified as abbreviations (indirect descriptions) of uses of legitimate expressions. The main case of this is explained below, forming a development of the theory; more possible uses will be introduced in 1.9 (viewing a use as an abbreviation of other works amounts to using a developed version of the logical framework).
Structures defined by expressions
In any theory, one can legitimately introduce a symbol meant either as a variable structure (operator or predicate) or as a Boolean variable (a nullary predicate, not a "structure"), as an abbreviation of, thus defined by, the following data:
- an expression (terms define operators, while formulas define predicates and Booleans);
- among its available free variables, a selection of those which will be bound by this definition in the role of arguments of the intended structure; the rest of them, which remain free, are called parameters.
Each of its possible values as a structure or a Boolean comes by fixing the values of the parameters. The variability of this structure abbreviates the variability of all its parameters.
In set theory, any function f is synonymous with the functor defined by the term
f(x) with argument x and parameter f (but the domain of this
functor is Dom f instead of a type).
The terms without argument define simple objects (nullary operators); the one made of a variable of a given type, seen as a parameter, suffices to give all objects of its type.
Now let us declare (the range of) the metanotion of "structure" in one-model theory, and thus those of "operator" and "predicate", as having to include at least all those reachable in this way: defined by any expression with any possible values of parameters. The minimal version of such a metanotion can be formalized as a role given to the data of an expression with values of its parameters. As this involves the infinite set of all expressions, it is usually inaccessible to the described theory itself: no single expression can suffice.
Still, when interpreting this in set theory, more operations between the interpreted types (undefinable ones) usually exist in the universe. Among the few exceptions, the full set-theoretical range of a variable structure whose arguments all range over finite sets (as interpreted types) with given size limits can be reached by one expression, whose size depends on these limits.
Invariant structures
An invariant structure of a given system (interpreted language), is a
structure defined from its language without parameters (thus a constant one).
This distinction of invariant structures from other structures generalizes the distinction between constants and variables, both to cases of nonzero arity and to what can be defined by expressions instead of being directly named in the language. Indeed, any structure named by a symbol of the language is directly defined by it without parameters, and is thus invariant.
As will be further discussed in 4.8, theories can be developed by definitions, which consist in naming another invariant structure by a new symbol added to the language. Among the aspects of the meaning of the theory which are preserved are the metanotions of structure and invariant structure, and the range of theorems expressible with the previous language.
1.6. Logical connectives
We defined earlier the concept of
logical connective. Let us now list the main useful ones, beyond both nullary ones
(Boolean constants) 1 and 0. (To this will be added
the conditional connective in 2.1).
Their properties will be expressed by tautologies, which are formulas only
involving connectives and Boolean
variables (here written A, B, C), and true for all possible
combinations of values of these variables.
Thus they also give necessarily true formulas when these variables are replaced by any formulas, using any language and interpreted in any system. Such definitions of Boolean variables by formulas of a theory may restrict their ranges of possible values depending on each other.
Tautologies form the rules of Boolean algebra, an algebraic theory describing operations on the Boolean type, interpreted as the pair of elements 0 and 1, but also admitting more sophisticated interpretations beyond the scope of this chapter.
Statements or formulas which, like tautologies, have the property of being always true by virtue of the logical framework, are called logically valid; they are necessary theorems of any theory, regardless of its axioms, among which they need not appear.
More logically valid formulas will be given in 1.9 (axioms of equality).
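The defining property of tautologies, truth for all combinations of values of their Boolean variables, can be checked mechanically by enumerating truth assignments. This brute-force checker is our own illustrative sketch, not part of the text's formalism:

```python
from itertools import product

def is_tautology(f, n):
    """Test whether a Boolean function of n variables is true
    for all 2**n combinations of values of its variables."""
    return all(f(*values) for values in product((True, False), repeat=n))

# equivalence <=> encoded as equality of Booleans
iff = lambda a, b: a == b

print(is_tautology(lambda a: iff(a, not not a), 1))         # True: A <=> not not A
print(is_tautology(lambda a, b: iff(a and b, b and a), 2))  # True: commutativity
print(is_tautology(lambda a, b: a or b, 2))                 # False: not a tautology
```

Replacing the Boolean variables of a verified tautology by arbitrary formulas then yields necessarily true formulas, as stated above.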
The binary connective of equality between Booleans is written ⇔ and
called equivalence: A ⇔ B is read «A
is equivalent to B».
Negation
The only useful unary connective is the negation ¬, which exchanges Booleans (¬A is read «not A»).
It is often denoted by barring the root of its argument, forming
with it another symbol with the same format:
x ≠ y ⇔ ¬(x = y)   (x is not equal to y)
x ∉ E ⇔ ¬(x ∈ E)   (x is not an element of E)
(A ⇎ B) ⇔ (A ⇔ ¬B)   (inequivalence)
Conjunctions, disjunctions
The conjunction ∧ means «and», being true only when both
arguments are true;
The disjunction ∨ means «or», being true except when both
arguments are false.
Each of them is:
- idempotent: (A ∧ A) ⇔ A and (A ∨ A) ⇔ A
- commutative: (B∧A) ⇔ (A∧B) and (B∨A) ⇔ (A∨B)
- associative: ((A∧B)∧C) ⇔ (A∧(B∧C)) and ((A∨B)∨C) ⇔ (A∨(B∨C))
- distributive over the other: (A ∧ (B∨C)) ⇔ ((A∧B) ∨ (A∧C)) and (A ∨ (B∧C)) ⇔ ((A∨B) ∧ (A∨C))
This similarity (symmetry) of their properties comes from the fact that they are exchanged by negation:
(A ∨ B) ⇎ (¬A ∧ ¬B)
(A ∧ B) ⇎ (¬A ∨ ¬B)
The inequivalence is
also called exclusive or because (A
⇎ B) ⇔ ((A ∨ B) ∧ ¬(A
∧ B)).
Chains of conjunctions such as (A ∧ B ∧ C) abbreviate any formula with more parentheses, such as ((A ∧ B) ∧ C), all equivalent by associativity; similarly for chains of disjunctions such as (A ∨ B ∨ C).
Asserting (declaring as true) a conjunction of formulas amounts to successively
asserting all these formulas.
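The symmetry of ∧ and ∨ under negation (De Morgan's laws), their distributivity, and the exclusive-or characterization above can all be verified by the same exhaustive enumeration of truth values (again our own illustrative check):

```python
from itertools import product

def valid(f, n):
    """True iff f holds for all combinations of its n Boolean arguments."""
    return all(f(*v) for v in product((True, False), repeat=n))

# De Morgan: negation exchanges conjunction and disjunction
assert valid(lambda a, b: (a or b) == (not (not a and not b)), 2)
assert valid(lambda a, b: (a and b) == (not (not a or not b)), 2)
# each is distributive over the other
assert valid(lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c)), 3)
assert valid(lambda a, b, c: (a or (b and c)) == ((a or b) and (a or c)), 3)
# exclusive or: (A inequivalent to B) <=> ((A or B) and not (A and B))
assert valid(lambda a, b: (a != b) == ((a or b) and not (a and b)), 2)
print("all checked")
```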
Implication
The binary connective of implication ⇒ is defined as (A ⇒ B) ⇔ ((¬A)
∨ B). It can be read «A implies B», «A
is a sufficient condition for B», or «B is a
necessary condition for A». Being true except when A
is true and B is false, it expresses the truth of B
when A is true, but no more gives information on B
when A is false (as it is then true).
Moreover,
(A ⇒ B) ⇎ (A ∧ ¬B)
(A ⇒ B) ⇔ (¬B ⇒ ¬A)
The formula ¬B ⇒ ¬A is called the contrapositive
of A ⇒ B.
The equivalence can also be redefined as
(A ⇔ B) ⇔ ((A ⇒ B) ∧ (B ⇒ A)).
Thus in a given theory, a proof of A ⇔ B can be
formed of a proof of the first implication (A ⇒ B), then a
proof of the second one (B ⇒ A), called the converse of (A
⇒ B).
The formula A ∧ (A ⇒ B) is equivalent to A ∧ B
but will be written A ∴ B, which reads «A therefore B», to
indicate that it is deduced from the truths of A and A ⇒ B.
Negations turn the associativity and distributivity of ∧ and ∨ into various tautologies involving implications:
(A ⇒ (B ⇒ C)) ⇔ ((A ∧ B) ⇒ C)
(A ⇒ (B ∨ C)) ⇔ ((A ⇒ B) ∨ C)
(A ⇒ (B ∧ C)) ⇔ ((A ⇒ B) ∧ (A ⇒ C))
((A ∨ B) ⇒ C) ⇔ ((A ⇒ C) ∧ (B ⇒ C))
((A ⇒ B) ⇒ C) ⇔ ((A ∨ C) ∧ (B ⇒ C))
(A ∧ (B ⇒ C)) ⇔ ((A ⇒ B) ⇒ (A ∧ C))
Finally,
(A ⇒ B) ⇒ ((A∧C) ⇒ (B∧C))
(A ⇒ B) ⇒ ((A∨C) ⇒ (B∨C)).
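Each of these implication tautologies can be confirmed by checking all 8 combinations of values of A, B, C. In this sketch of ours, `imp` encodes ⇒ by its definition (¬A) ∨ B:

```python
from itertools import product

def imp(a, b):
    """A => B, defined as (not A) or B."""
    return (not a) or b

def valid(f):
    """True iff f holds for all combinations of its three Boolean arguments."""
    return all(f(a, b, c) for a, b, c in product((True, False), repeat=3))

assert valid(lambda a, b, c: imp(a, imp(b, c)) == imp(a and b, c))
assert valid(lambda a, b, c: imp(a, b or c) == (imp(a, b) or c))
assert valid(lambda a, b, c: imp(imp(a, b), c) == ((a or c) and imp(b, c)))
assert valid(lambda a, b, c: (a and imp(b, c)) == imp(imp(a, b), a and c))
# contrapositive: (A => B) <=> (not B => not A)
assert valid(lambda a, b, c: imp(a, b) == imp(not b, not a))
print("all implication tautologies hold")
```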
Chains of implications and equivalences
In a different kind of abbreviation, any chain of formulas
linked by ⇔ and/or ⇒ will mean the chain of conjunctions of these
implications or equivalences between adjacent formulas:
(A ⇒ B ⇒ C) ⇔ ((A ⇒ B) ∧ (B ⇒ C)) ⇒ (A ⇒ C)
(A ⇔ B ⇔ C) ⇔ ((A ⇔ B) ∧ (B ⇔ C)) ⇒ (A ⇔ C)
0 ⇒ A ⇒ A ⇒ 1
(¬A) ⇔ (A ⇒ 0) ⇔ (A ⇔ 0)
(A ∧ 1) ⇔ A ⇔ (A ∨ 0) ⇔ (1 ⇒ A) ⇔ (A ⇔ 1)
(A ∧ B) ⇒ A ⇒ (A ∨ B)
Provability
In this work, as almost anywhere else, proofs will be written naturally, without trying to formalize the concept of proof (writing out the full details of a proof theory, which specialists did). Sometimes using natural-language articulations, they will mainly be written as a succession of statements, each visibly true thanks to previous ones, premises, axioms, known theorems and rules of proof, especially those of quantifiers (1.9).
To qualify known theorems (statements with known proofs), synonyms for "theorem"
are traditionally used according to their importance: a theorem is more
important than a proposition; either of them may be deduced from an
otherwise less important lemma, and easily implies an also less important
corollary. A proof of a statement A in a theory T, is
a finite model of proof theory, involving A and some axioms of T; it
can also be seen as a proof of (conjunction of these axioms ⇒ A) without axiom,
so that the existence of such a system ensures the logical validity of this implication.
But, beyond tautologies, logically valid statements with some size may be only provable
by proofs with indescribably larger size.
We say that A is provable in T and write T ⊢
A if there exists a proof of A in T.
Again in T, a refutation of A is a proof of ¬A. If one exists
(T ⊢ ¬A), the statement A is called refutable (in T).
A statement is called decidable (in T) if it is either provable or refutable.
A theory T is called contradictory or inconsistent
if T ⊢ 0; otherwise it is called consistent. If a statement is both provable and refutable in T then all statements are, because this means that T is inconsistent, independently of the chosen statement:
(A ∧ ¬A) ⇔ 0
((T ⊢ A) ∧ (T ⊢ B)) ⇔ (T ⊢ A∧B)
((T ⊢ A) ∧ (T ⊢ ¬A)) ⇔ (T ⊢ 0).
In an inconsistent theory, every statement is provable. Such a theory has no
model.
1.7. Classes in set theory
For any theory, a class is a
unary predicate
A seen as the set of objects where A is true, that is «the
class of all x such that A(x)».
In particular
for set theory, each set E is synonymous with the class of the x
such that x∈E (defined by the formula
x∈E with argument x and parameter E).
However, this involves two different interpretations of the notion of set,
that need to be distinguished by the following means.
The unified framework of theories
Attempts to formalize one-model theory in first-order logic cannot completely specify the metanotions of «expressions» and «proofs». Indeed, as will be explained in 4.7 (Non-standard models of Arithmetic), any first-order theory aiming to describe finite systems without size limit (such as expressions and proofs) inside its model (as classes included in a type) will still admit, in some models, some pseudo-finite ones: infinite systems it mistakes as «finite», though it sees them as larger than any size it can describe (as the latter is an infinity of properties which it cannot express as a whole to detect the contradiction; these systems will also be called nonstandard, as «truly finite» will be the particular meaning of «standard» when qualifying kinds of systems which should normally be finite).
Filling this gap will require a second-order universal quantifier (1.9), whose meaning is best expressed (in appearance, though not really completely) after insertion in set theory (whose concept of finiteness will be defined in 4.5). As this insertion turns the components of the theory into free variables whose values define its model [T, M], their variability removes its main difference with model theory (the other difference being that model theory can also describe theories without models). This view of model theory as developed from set theory will be exposed in Parts 3 and 4, completing the grand tour of the foundations of mathematics after the formalization of set theory in a logical framework.
Given a theory T so described, let T_{0} be the external
theory, also inserted in set theory, which looks like a copy of T as any
component k
of T_{0} has a copy as an object serving as a component of T.
In some proper formalization, T_{0} can be defined from T as
made of the k such that («k» ∈ T) is true, where the notation
«k» abbreviates a term of set theory designating k
as an object, and the truth of this formula means that the value of this term
in the universe belongs to T.
This forms a convenient unified framework for describing theories interpreted in models, encompassing both previous ones (set-theoretical and model-theoretical): all works of the theory T_{0} (expressions, proofs and other developments) have copies as objects formally described by the model-theoretical development of set theory as works of the theory T. In the same universe, any system M described as a model of T is indirectly also a (set-theoretical) model of T_{0}.
This powerful framework comes with the following limits:
- Starting from an arbitrary theory T_{0} with infinitely many components, a corresponding T cannot be directly defined, as a definition must be expressible by finite means and thus cannot depend on an infinity of metaobjects. Any infinite list of components of T_{0} must be defined by some rules, for getting T as defined by the same rules.
- The risk for a universe to be nonstandard, in the sense of containing pseudo-finite systems (which formalizations of set theory in first-order logic cannot prevent), leads to the following possible discrepancies between T_{0} and T:
  - T_{0} only copies the standard components of T, ignoring nonstandard ones which usually exist for infinite theories in a nonstandard universe. Then a model of T_{0} in this universe may fail to be a model of T by not fitting some nonstandard components (axioms) of T.
  - By the incompleteness theorem, T may be inconsistent while T_{0} is consistent, either due to nonstandard axioms of T, or to a nonstandard contradiction between standard axioms, which may exist even if these theories are identical (having a finite initial content). As such a contradiction means that T has no model in this universe (nonstandard «proofs» are only valid in the universe containing them), models of T_{0} may either have the above failure (in the infinite case), or only exist outside this universe.
So understood, the conditions of use of this unified framework of theories are usually accepted as legitimate assumptions, by focusing on well-described theories (though no well-described set theory can be the "ultimate" one, as mentioned below), interpreted in standard universes whose existence is admitted on philosophical grounds; this will be further discussed in philosophical pages.
Standard universes and metasets
From
now on, in the above unified framework, the theory T_{0}
describing M and idealized as an object T, will be set theory
itself. Taking it as an identical copy of the set theory serving as framework,
amounts to taking the same set theory interpreted by two universes, that will
be distinguished by giving the meta prefix to the interpretation in the role of
framework.
Aside from generic interpretations, set theory has a standard kind of interpretation into itself, where each set is interpreted by the class (metaset) of its elements (the synonymous object and metaobject are now equal), and each function is interpreted by its synonymous metafunction (see more details with how this relates to finiteness).
This way, any set will be a class, while any class is a metaset of objects. But some
metasets of objects are not classes (no formula with parameters can define them);
and some classes are not sets, such as the class of all sets (see Russell's
paradox in 1.8), and the universe (class of all objects, defined by 1).
A kind of theoretical difference between both uses of set theory will turn out to be
irreducible (by the incompleteness theorem): for any given (invariant) formalization
of set theory, the existence of a model of it (universe), or equivalently its consistency,
formalized as a set-theoretical statement with the meta interpretation, cannot be
logically deduced (a theorem) from the same axioms. This statement, and thus
also the stronger statement of the existence of a standard universe,
thus forms an additional axiom of the set theory so used as framework.
Definiteness classes
Set theory accepts all objects as «elements» that can belong to sets and be
operated by functions (to avoid endless further divisions between sets of
elements, sets of sets, sets of functions, mixed sets...). This might be formalized
keeping 3 types (elements, sets and functions), where each set would have a
copy as element, identified by a functor from sets to elements, and the same for
functions. But beyond these types, our set theory will anyway need more
notions as domains of its structures, which can only be conveniently
formalized as classes. So, the notions
of set and function will also be classes named by predicate symbols:
Set = «is a set»
Fnc = «is a function»
In first-order logic, any expression is ensured to take a definite value for every choice of a model and of values of all free variables there (by virtue of its syntactic correctness, which is implicit in the concept of «expression»).
But in set theory, this may still depend on the values of free variables.
So, an expression A (and any structure defined from it) will be called
definite, if it actually takes a value for the given values of its free variables
(seen as arguments and parameters of any structure it defines). This condition
is itself an everywhere definite
predicate, expressed by a formula dA with the same free variables.
Choosing one of these as argument, the class it defines is the metadomain,
called class of definiteness, of the unary structure defined by A.
Expressions should only be used where they are definite, which will be done rather naturally. The definiteness condition of (x ∈ E) is Set(E).
That of the function evaluator f(x) is Fnc(f) ∧
x ∈ Dom f.
But the definiteness of the last formula needs a justification, given below.
Extended definiteness
A theory with partially definite structures, like set theory, can be formalized (translated) as
a theory with one type and everywhere definite structures, keeping intact all expressions
and their values wherever they are definite : models are translated one way by giving
arbitrary values to indefinite structures (e.g. a constant value), and in the way back by
ignoring those values. Thus, an expression with an indefinite subexpression may be
declared definite if its final value does not depend on these extra values.
In particular for any formulas A and B, we shall regard the formulas
A ∧ B and A ⇒ B as definite if A
is false, with respective values 0 and 1, even if B is not definite. So,
let us give them the same definiteness condition
dA ∧ (A ⇒ dB)
(breaking, for A ∧ B, the symmetry between A and B, which need not be restored). This formula is made definite by the same rule, provided that dA and dB were definite. This way, both formulas
A ∧ (B ∧ C) and (A ∧ B) ∧ C
have the same definiteness condition dA ∧ (A ⇒ (dB ∧ (B ⇒ dC))).
Classes will be defined by everywhere definite predicates, easily expressible by
the above rule as follows.
Any predicate A can be extended beyond its
domain of definiteness, in the form dA ∧ A (giving 0), or
dA ⇒ A (giving 1).
For any class A and any unary predicate B definite in all A, the class defined
by A∧B is called the subclass of A
defined by B.
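The extended definiteness rule for A ∧ B and A ⇒ B (definite with values 0 and 1 respectively when A is false, even if B is not definite) has a close analogue in the short-circuit evaluation of most programming languages: the second operand is never evaluated when the first already decides the result. This is our own analogy, not part of the formalism:

```python
def indefinite():
    # stands for a subformula B which is not definite here
    raise ValueError("B is not definite here")

A = False

# A and B: when A is false the value is False (0), and B is never
# evaluated -- mirroring the definiteness condition dA ∧ (A ⇒ dB).
print(A and indefinite())        # False, no error raised

# A => B rendered as (not A) or B: true when A is false,
# again without evaluating B.
print((not A) or indefinite())   # True, no error raised
```

The asymmetry between A and B in the rule corresponds to left-to-right evaluation order, which likewise need not be restored.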
1.8. Binders in set theory
The syntax of binders
This last kind of symbol can form an expression by taking a variable symbol, say here x, and an expression F which may use x as a free variable (in addition to the free variables available outside), to give a value depending on the unary structure defined by F with argument x. Thus, it separates the «inside» subexpression F, having x among its free variables, from the «outside», where x is bound. But in most cases (in most theories...), binders cannot keep the full information on this unary structure, which is too complex to be recorded as an object, as we shall see below.
We shall first review both main binders of set theory: the set-builder and the function definer. Then 1.9 will present both main quantifiers. Finally, 1.10 and 1.11 will give axioms completing this formalization of the notions of sets and functions in set theory.
The syntax differs between first-order logic and set theory, which manage the ranges of variables differently. In first-order logic, ranges are types, implicit data of quantifiers. But the ranges of binders of set theory are sets which, as objects, are designated by an additional argument of the binder (a place for a term not using the variable being bound).
Set-builder
For any unary predicate A definite on all elements of a set E, the subclass of E defined by A is a set: it is the range of a variable x introduced as ranging over E (so that it can be bound), from which we select the values satisfying A(x). It is thus a subset of E, written {x∈E | A(x)} (the set of all x in E such that A(x)): for all y,
y ∈ {x∈E | A(x)} ⇔ (y ∈ E ∧ A(y))
This combination of characters { ∈ | } forms the notation of a binder named the set-builder: {x∈E | A(x)} binds x with range E on the formula A.
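The set-builder has a direct analogue in Python's set comprehensions, which likewise bind a variable with a set as its range and select the elements satisfying a formula (our illustration, for finite sets only):

```python
# {x ∈ E | A(x)} as a set comprehension: the variable x is bound,
# its range is the set E, and A is the selecting predicate.
E = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
A = lambda x: x % 2 == 0        # a sample predicate, definite on all of E

F = {x for x in E if A(x)}      # the subset of E defined by A
print(F)                        # {0, 2, 4, 6, 8}

# defining property: y ∈ {x∈E | A(x)} ⇔ (y ∈ E ∧ A(y))
assert all((y in F) == (y in E and A(y)) for y in range(-5, 15))
```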
Russell's paradox
If the universe (class of all elements) were a set then, using it, the set-builder could turn every other class, such as the class of all sets, into a set. But this is impossible, as can be proven using the set-builder itself:
Theorem. For any set E there is a set F such that F ∉ E. So, no set E can contain all sets.
Proof. F = {x∈E | Set(x) ∧ x ∉ x}
⇒ (F ∈ F ⇔ (F ∈ E ∧ F ∉ F))
⇒ (F ∉ F ∴ F ∉ E). ∎
This will oblige us to keep the distinction between sets and classes.
The function definer
The function definer ( ∋ ↦ ) binds a variable on a term, following the syntax E ∋ x ↦ t(x), where
- x is the variable,
- E is its range,
- the notation "t(x)" stands for any term, here abbreviated (to describe the general case) using the functor symbol t defined by this term with argument x (and possible parameters, here kept implicit).
Being definite if t(x) is definite for all x in E, it then takes the functor t and restricts its domain (definiteness class) to the set E, to give a function with domain E. So it converts functors into functions, reversing the action of the function evaluator (with the Dom functor), which converted (interpreted) functions into their role (meaning) as functors whose definiteness classes are sets.
The shorter notation x ↦ t(x) may be used when
E is determined by the context, or in a meta description to designate a
functor by specifying the argument x of its defining term.
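For finite domains, the function definer E ∋ x ↦ t(x) can be sketched in Python as restricting a functor (a callable) to an explicit domain, recording the result as a finite table; evaluation and the Dom functor then recover the functor's values and the domain. This encoding, including the names define and t, is our own illustration:

```python
# E ∋ x ↦ t(x): restrict the functor t to the set E,
# recording the resulting function as a finite table (a dict).
def define(E, t):
    return {x: t(x) for x in E}   # definite if t(x) is definite on all of E

t = lambda x: x * x               # a functor (defined by the term x*x)
f = define({1, 2, 3}, t)          # a function with domain {1, 2, 3}

print(f[2])                       # function evaluator: f(2) = 4
print(set(f.keys()))              # Dom f, recovered from the table
# evaluation reverses the definition wherever it is definite:
assert all(f[x] == t(x) for x in f)
```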
Relations
A relation is a role-playing object of set theory similar to an operation but with Boolean values: it acts as a predicate whose definiteness classes (ranges of arguments) are sets (so, predicates could be described as relations between interpreted types). Now, unary relations (functions with Boolean values) will be represented as subsets S of their domain E, using the set-builder in the role of definer, and ∈ in the role of evaluator, interpreting S as the predicate x ↦ (x ∈ S). This role of S still differs from the intended unary relation, as it ignores its domain E but is definite in the whole universe, giving 0 outside E.
This lack of operator Dom does not matter, as E is usually known
from the context (as an available variable).
As the function definer (resp. the set-builder) records the whole structure defined by the given expression on the given set, it suffices to define any other binder of set theory on the same expression with the same domain, as made of a unary structure applied to its result (which is a function, resp. a set).
Structure definers in diverse theories
Let us call structure definer any binder B which, used on
diverse expressions A, faithfully records the unary structure defined
by A on some range E (type or class defined by an argument
here implicit), i.e. its result S = (Bx, A(x)) can
restore this structure by an evaluator V (symbol or expression) :
V(S, x) = A(x) for all x in E.
Admitting the use of negation and the possibility to interpret Booleans by objects (in a range with at least 2 objects, which is often the case), Russell's paradox shows that adding both of the following requirements on a structure definer in a theory would lead to a contradiction:
1. all such S belong to E;
2. V can occur in the expression A and use x anyhow in its arguments; namely V(x, x) is allowed, which makes sense as 1. ensures the definiteness of any V(S, S).
Let us list the remaining options. Set theory rejects 1. but keeps 2. But since 1. is rejected, keeping 2. may or may not be an issue, depending on further details.
As will be explained in 4.9, extending a generic theory (whose ranges of binders
were the types) by a new type K given as the set of all structures
defined by a fixed expression A for all combinations of values of its
parameters, forms a legitimate development of the theory (a construction).
Indeed a binder on a variable structure symbol S with such a range K
abbreviates a successive use of binders on all the parameters of A
which replaces S. Here A and the system
interpreting it come first, then the range K of the resulting S
and their interpretations by V are created outside
them : A has no subterm with type K, thus does
not use V (which has an argument of type K).
The notion of structure in first-order logic (as a one-model theory) has this similarity with the notion of set in set theory: for each given symbol type beyond constants, the class of all structures of that type is usually not a set, calling "sets" such ranges K (of structures defined by a fixed expression with variable parameters), or subclasses of these.
The fully developed
theory with the infinity of such new types constructed for all possible expressions A,
can become similar to set theory by gathering to a single type U all constructed types
K of variable structures of the same symbol type (structures over the same sets),
interpreted by the same symbol V (which could be already used by A).
This merely packs into V different structures without conflict
since they come from different types K of the first argument.
This remains innocent (rewriting what can be done without it) as long as in
the new theory, the binders of type U stay restricted to one of these
"sets" K (or covered by finitely many of them, which are actually
included in another one).
In set theory, the ranges of binders
are the sets. Thus, beyond the simplifying advantage of removing types,
set theory will get more power when accepting more classes as sets.
Other theories, which we shall ignore in the rest of this work, follow more
daring options:
- Keeping 1. and rejecting 2. will be shown consistent by Skolem's paradox (4.6), but would be quite unnatural.
- Even weirder is NF ("New Foundations", so called as it was new when first published in 1937), combining 1. with a lighter version of 2., restricting the possible syntax of A to forbid occurrences of (x∈x) or any way to define it.
- The most extreme is lambda calculus, which keeps both points but tolerates the resulting contradiction by ignoring Boolean logic with its concept of "contradiction". This "theory" does not describe any object but only its own terms, seen as computable functions. As a computation system, its contradictions are computations which keep running, never giving any result.
1.9. Quantifiers
A quantifier is a binder taking a unary predicate (formula) and giving a
Boolean value.
In set theory, the full syntax for a quantifier Q binding a
variable x with range E on a unary predicate A,
is
Qx∈E, A(x)
where A(x) abbreviates the formula defining A,
whose free variables are x and possible parameters.
A shorter notation puts the range as an index (Q_{E}x,
A(x)), or deletes it altogether (Qx, A(x))
when it may be kept implicit (unimportant, or fixed by the context, such
as a type in a generic theory).
The two main quantifiers (from which others will be defined later) are:
- the universal quantifier ∀, read as «For all x (in...),... »;
- the existential quantifier ∃, read as «There exists x (in...) such that... ».
The universal quantifier of set theory can be seen as defined from the set builder:
(∀x∈E, A(x)) ⇔ {x∈E
 A(x)} = E.
The one of first-order logic can be defined in set-theoretical interpretations, seeing A as a function and its Boolean values as objects:
(∀x, A(x)) ⇔ A = (x ↦ 1)
Anyway, (∀x, 1) is always true.
∃ can be defined from ∀ with the same range:
(∃x, A(x)) ⇎ (∀x, ¬A(x)).
Thus (∃x, A(x)) ⇔ A ≠ (x ↦ 0).
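Over a finite range, both quantifiers and their duality under negation can be sketched with Python's all and any. This illustrates the finite case only; the names forall and exists are ours:

```python
E = {1, 2, 3, 4}
A = lambda x: x > 0
B = lambda x: x > 3

forall = lambda E, A: all(A(x) for x in E)   # ∀x∈E, A(x)
exists = lambda E, A: any(A(x) for x in E)   # ∃x∈E, A(x)

print(forall(E, A))     # True: every element of E is positive
print(exists(E, B))     # True: 4 > 3
print(forall(E, B))     # False

# duality: (∃x, A(x)) ⇎ (∀x, ¬A(x))
assert exists(E, B) != forall(E, lambda x: not B(x))
# on an empty range, ∀ is vacuously true and ∃ is false
assert forall(set(), B) and not exists(set(), B)
```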
With classes,
(∃_{C} x, A(x)) ⇔ (∃x, C(x) ∧ A(x)) ⇔ (∃_{C∧A} x, 1)
(∀_{C} x, A(x)) ⇔ (∀x, C(x) ⇒ A(x))
∀x, C(x) ⇔ (∃_{C} y, x=y)
Inclusion between classes
A class A is said to be included in a class
B when ∀x, A(x) ⇒ B(x). Then
A is a
subclass of B, as ∀x, A(x) ⇔ (B(x)
∧ A(x)). Conversely, any subclass of B is
included in B.
The inclusion of A in B
implies for any predicate C (in cases of definiteness):
(∀_{B} x, C(x)) ⇒ (∀_{A} x, C(x))
(∃_{A} x, C(x)) ⇒ (∃_{B} x, C(x))
(∃_{C} x, A(x)) ⇒ (∃_{C} x, B(x))
(∀_{C} x, A(x)) ⇒ (∀_{C} x, B(x))
Rules of proof for quantifiers on a unary predicate
Existential Introduction. If we have terms t,
t′,… and a proof of (A(t) ∨ A(t′)
∨ …), then ∃x, A(x).
Existential Elimination. If ∃x, A(x),
then we can introduce a new free variable z with the
hypothesis A(z) (the consequences will be
true without restricting the generality).
These rules express the meaning of ∃ : going from some term to ∃ then from
∃ to z, amounts to letting z represent that term
(without knowing which, but they can be gathered into one by the conditional operator). They give
the same meaning to ∃_{C} as to its two equivalent formulas above, bypassing
(making implicit) the extended definiteness rule for (C ∧ A) by focusing on the
case when C(x) is true and thus A(x) is definite.
While ∃ appeared as the designation of an object, ∀ appears as a
deduction rule: ∀_{C }x, A(x)
means that its negation ∃_{C }x, ¬A(x)
leads to a contradiction.
Universal Introduction. If from the mere hypothesis C(x)
on a new free variable x we could deduce A(x),
then ∀_{C }x, A(x).
Universal Elimination. If ∀_{C }x,
A(x) and t is a term satisfying C(t),
then A(t).
Introducing then eliminating ∀ is like replacing x by t
in the initial proof.
These rules can be used to justify the following logically valid formulas
((∀_{C} x, A(x)) ∧ (∀_{C} x, A(x) ⇒ B(x))) ⇒ (∀_{C} x, B(x))
((∃_{C} x, A(x)) ∧ (∀_{C} x, A(x) ⇒ B(x))) ⇒ (∃_{C} x, B(x))
((∀_{C} x, A(x)) ∧ (∃_{C} x, B(x))) ⇒ (∃_{C} x, A(x) ∧ B(x))
(∀_{C} x, A(x) ∨ B(x)) ⇒ ((∀_{C} x, A(x)) ∨ (∃_{C} x, B(x)))
(∃_{A} x, ∀_{B} y, R(x,y)) ⇒ (∀_{B} y, ∃_{A} x, R(x,y))
∀x, (∀y, R(x,y)) ⇒ R(x,x) ⇒ (∃y, R(x,y))
The formula (∃_{C }x, A(x)) ∧
(∀_{C }x, A(x) ⇒ B(x))
will be abbreviated as (∃_{C }x, A(x) ∴
B(x)) while it implies but is not generally equivalent to
(∃_{C }x, A(x) ∧ B(x)).
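The logical validity of the listed formulas can be brute-force checked over a finite domain; a Python sketch (an illustrative encoding: predicates and relations as Boolean tuples):

```python
from itertools import product

D = range(2)
preds = list(product([False, True], repeat=len(D)))       # all unary predicates on D
rels = list(product([False, True], repeat=len(D) ** 2))   # all binary relations on D

def fa(C, A): return all(A[x] for x in D if C[x])   # ∀_C x, A(x)
def ex(C, A): return any(A[x] for x in D if C[x])   # ∃_C x, A(x)
def imp(p, q): return (not p) or q

valid = True
for C, A, B in product(preds, repeat=3):
    AiB = tuple(imp(A[x], B[x]) for x in D)         # A(x) ⇒ B(x)
    AaB = tuple(A[x] and B[x] for x in D)           # A(x) ∧ B(x)
    AoB = tuple(A[x] or B[x] for x in D)            # A(x) ∨ B(x)
    valid &= imp(fa(C, A) and fa(C, AiB), fa(C, B))
    valid &= imp(ex(C, A) and fa(C, AiB), ex(C, B))
    valid &= imp(fa(C, A) and ex(C, B), ex(C, AaB))
    valid &= imp(fa(C, AoB), fa(C, A) or ex(C, B))

for A, B in product(preds, repeat=2):
    for R in rels:
        # (∃_A x, ∀_B y, R(x,y)) ⇒ (∀_B y, ∃_A x, R(x,y))
        lhs = any(A[x] and all(R[x * len(D) + y] for y in D if B[y]) for x in D)
        rhs = all(any(A[x] and R[x * len(D) + y] for x in D) for y in D if B[y])
        valid &= imp(lhs, rhs)

print(valid)
```

Such a finite check of course only illustrates, and does not prove, validity over all interpretations.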
Second-order quantification
Let us call a second-order quantifier a quantifier binding a variable structure symbol
over the range of all structures of its symbol type, whether this is conceived as the
range of all definable ones (with all possible defining expressions, whose free variables
may take any list of parameters beyond the given arguments) or as the full set theoretical
range, that is the range of all such structures which exist in the universe, relating the
given interpreted types.
The use of such a quantifier (and thus of variable structure symbols) is not allowed in
first-order logic, but belongs to some other logical frameworks instead, such as
second-order logic (part 5).
Still, while keeping first-order logic as the main framework of a given theory,
some second-order quantifiers may be used to describe some of its
meta level aspects in the following ways (which will be
involved in the formalization of set theory in 1.10 and 1.11). Let T be
a first-order theory, T' its extension by a structure symbol s
(without further axiom) and F a ground formula of T' (in first-order logic),
also denoted F(s) when seen as a formula of T using the
variable structure symbol s in second-order logic.
Second-order Universal Introduction. If T' ⊢ F then T
entails the second-order statement (∀s, F(s)).
This holds for any model and the full set theoretical range of s, independently of
the universe in which models and structures s are searched for.
Second-order Universal Elimination. Once a second-order statement
(∀s, F(s)) is accepted in a theory T,
it is manifested in first-order logic as a schema of statements,
that is an infinite list of statements of the form (∀parameters, F(s)) for each
possible replacement of s by a defining expression with parameters.
Applying second-order universal elimination after second-order universal introduction
means deducing from T a schema of theorems, each one indeed deducible
in first-order logic by the proof obtained from the original proof by replacing s by its definition.
In second-order logic, a new binder B can be defined by an expression here abbreviated
as F(A), using a symbol A of variable unary structure whose argument will be
bound by B:
∀A, (Bx, A(x)) = F(A)
By second-order universal elimination, this comes down to a schema of definitions in
first-order logic: for each expression defining A, it defines (Bx, A(x)) like a
structure symbol, by the expression F(A) whose available free variables are the
parameters of F plus those of A.
Axioms of equality
In first-order logic with given types and a given language, some ground formulas involving
= are logically valid for the range of interpretations keeping = as the equality predicate
of set theory, but no longer for the larger range of interpretations letting
it become any other predicate. A possible list of axioms of equality is a list of
some such formulas which suffice to imply all others in this context. One such list
consists of the following 2 axioms per type, where the latter is meant as an axiom schema
by second-order universal elimination of the variable unary predicate A:
1. ∀x, x = x (reflexivity)
2. ∀A, ∀x, ∀y, (x = y ∧ A(y)) ⇒ A(x)
Variables x and y can also serve among parameters in definitions of A.
This can be understood by reordering quantifiers as (∀x,
∀y, ∀A), or as deduced from cases only using other free variables a, b, by
adapting an above logically valid formula as ∀a, ∀b,
(∀x, ∀y, R(a, b, x,y)) ⇒ R(a, b, a,b).
Diverse definitions of A give diverse formulas (assuming reflexivity):
3. ∀x,∀y, x = y ⇒ y = x (symmetry), using A(u) : y = u
4. ∀x,∀y,∀z, (x = y ∧ y = z) ⇒ x = z (transitivity), using A(u) : u = z
5. ∀f,∀x,∀y, x = y ⇒ f(x) = f(y), using A(u) : f(u) = f(y)
6. ∀A,∀x,∀y, x = y ⇒ (A(x) ⇔ A(y)), using A(u) : ¬A(u)
7. ∀x,∀y,∀z, (x = y ∧ z = y) ⇒ z = x, using A(u) : z = u
We shall abbreviate (x = y
∧ y = z) as x = y = z.
5. is an axiom schema with f ranging among functors between any two types.
6. can also be deduced from symmetry.
Remark. (1.∧7.) ⇒ 3., then 3. ⇒ (4. ⇔ 7.) so that (1.∧7.) ⇔ (1.∧3.∧4.).
Another possible list of axioms of equality consists of formulas 1. to 5. where f and
A range over the mere symbols of the language, each taken once per argument:
the full schema of 2. is implied by successive deductions for each occurrence of symbol
in A. This will be further justified in 2.9 (equivalence relations).
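The Remark can be checked by brute force over all binary relations on a small domain; a Python sketch (the 3-element domain is an arbitrary illustrative choice):

```python
from itertools import product

D = range(3)
pairs = [(x, y) for x in D for y in D]

checked = 0
for bits in product([False, True], repeat=len(pairs)):
    R = dict(zip(pairs, bits))  # one of the 2^9 binary relations on D
    refl = all(R[x, x] for x in D)                                  # 1.
    sym = all((not R[x, y]) or R[y, x] for x in D for y in D)       # 3.
    tran = all((not (R[x, y] and R[y, z])) or R[x, z]
               for x in D for y in D for z in D)                    # 4.
    f7 = all((not (R[x, y] and R[z, y])) or R[z, x]
             for x in D for y in D for z in D)                      # 7.
    assert (not (refl and f7)) or sym   # (1.∧7.) ⇒ 3.
    assert (not sym) or (tran == f7)    # 3. ⇒ (4. ⇔ 7.)
    checked += 1
print(checked)
```

All 512 relations on the domain pass both implications.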
Introducing a variable x defined by a term t by writing (x = t ⇒ ...),
in other words putting the axiom x = t, can be
seen as justified by the above rules in this way : t = t ∴
∃x, (x = t ∴ ...).
1.10. Formalization of set theory
The role of axioms
A list of axioms for a theory, selected among its basic accepted true statements,
normally aims to specify by their consequences the range of valid (provable)
statements of this theory, against other possible theories, expressible in the same
framework, for which these statements may be false since they are not logically valid.
But set theory being expressed in a special logical framework not used for other theories,
this leaves a priori unclear the distinction between its logically valid statements, and
its other basic accepted truths which need to be declared as axioms.
But such a distinction appears by converting set theory into a generic theory: let
us call axioms of set theory a list of those of its basic statements whose
translated versions in first-order logic are not logically valid, but form a proper list of
axioms for that generic theory to be equivalent to the intended set theory.
Converting binders
Binders are the only symbols whose format of
use differs between both frameworks. Let us describe the rules to convert them
from set theory into first-order logic.
The function definer becomes an infinity of operator symbols: for each term
t with one argument (and parameters), the whole term (E ∋
x ↦ t(x)) is seen as the big name of a distinct operator
symbol, whose arguments are E and the parameters of t. (Those where
every subexpression of t without any occurrence of x is the only
occurrence of a parameter, would suffice to define others.)
The same goes for the set-builder, which will come as a particular case in 1.11.
The conversion of quantifiers comes by expressing their domains as classes:
(∃x∈E, A(x)) → (∃x, x ∈ E ∧ A(x))
(∀x∈E, A(x)) → (∀x, x ∈ E ⇒ A(x))
Open quantifiers, formulas vs statements
Set theory only admits quantifiers ranging over sets (of the form ∃x∈E
and ∀x∈E), called bounded quantifiers, in its formulas
(also called bounded formulas for insistence) that define predicates
and can be used as subformulas of terms by the set-builder. But its translated
form as a generic theory allows for more formulas, with quantifiers ranging over
other classes (or the universe), which we shall call open quantifiers.
In set theory, the use of open quantifiers will be reserved for the expression
of statements, declared to be true as axioms or theorems.
Precisely, each statement will be made of a chain of open
quantifiers, usually all ∀ and often written in words ("for all"), before
a bounded formula. Proofs will naturally use the deduction rules for open
quantifiers (introduction and elimination) through common language articulations.
The inclusion predicate
The inclusion predicate ⊂ between two sets E and F, read as
"E is included in F", or "E is a subset of F", or "F
includes E", is defined by E ⊂ F ⇔ (∀x∈E, x ∈ F).
Properties of inclusion between classes apply. E ⊂ E is always true
(meaning ∀x, x ∈ E ⇒ x ∈ E).
Implication chains also appear as inclusion chains:
(E ⊂ F ⊂ G) ⇔ (E ⊂ F ∧ F ⊂ G) ⇒ E ⊂ G.
Axioms for notions
The formalization of set theoretical notions as classes inside a single type, already
described in 1.7, involves the following axioms. Here and in the below axioms for
functions, the quantifier (∀t,E) is meant as declaring a schema of axioms by
second-order universal elimination over the variable functor t, with E included in the
definiteness class of the term defining t for the given values of parameters
(this is the definiteness condition for (E ∋ x ↦ t(x))).
Thus for each term defining t we have an axiom where ∀t,E is replaced by the prefix
∀parameters, ∀_{Set}E, (∀x∈E, dt(x, parameters)) ⇒ …
The distinction of notions is stated by the following axioms:
∀x, ¬(Set(x) ∧ Fnc(x)) : sets are not functions
∀_{Fnc} f, Set(Dom f) : the domain of every function is a set
∀t,E, Fnc(E ∋ x ↦ t(x)) : any definite (E ∋ x ↦ t(x)) is a function
Axiom of Extensionality
It says: For all sets E and F, E ⊂ F ⊂ E ⇒ E = F.
Meaning that any two sets playing the same role are equal (somehow defining
the equality between sets by that of their roles), this axiom expresses that
sets are determined by their role. This role of a set may be understood either
as a range for quantifiers, or as a class defined by ∈.
Indeed, E ⊂ F ⊂ E means that E and F have the same elements
(∀x, x ∈ E ⇔ x ∈ F), and for any predicate R,
(∀x∈F, R(x)) ⇔ (∀x∈E, R(x))
and similarly for ∃.
Informally, the elements of a set are given in bulk.
Axioms for functions
Here is an analogue of the axiom of extensionality for functions :
(=_{Fnc}) ∀_{Fnc} f, ∀_{Fnc} g, (Dom f = Dom g ∧ ∀x∈Dom f, f(x) = g(x)) ⇒ f = g
The function definers are related with the function evaluator by an axiom
which can be written in either way:
1. ∀t,E, ∀_{Fnc} f, f = (E ∋ x ↦ t(x)) ⇔ (Dom f = E ∧ ∀x∈E, f(x) = t(x))
2. ∀t,E, Dom(E ∋ x ↦ t(x)) = E ∧ ∀x∈E, (E ∋ y ↦ t(y))(x) = t(x)
Indeed 1. ⇔ (2. ∧ (=_{Fnc})). We can even integrate in 1. the last axiom
for notions by rewriting it as
1'. ∀t,E, ∀f, f = (E ∋ x ↦ t(x)) ⇔ (Fnc f ∧ Dom f = E ∧ ∀x∈E, f(x) = t(x))
The proofs of these equivalences can be taken as exercises.
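As a concrete and purely illustrative reading of these axioms, finite functions can be modeled by Python dicts, with Dom f as the key set and evaluation as indexing; this is a sketch, not the formal theory itself:

```python
# The function definer (E ∋ x ↦ t(x)) as a dict comprehension.
def definer(E, t):
    return {x: t(x) for x in E}

E = {1, 2, 3}
f = definer(E, lambda x: x * x)

# Axiom 2.: Dom(E ∋ x ↦ t(x)) = E  and  (E ∋ y ↦ t(y))(x) = t(x)
dom_ok = (set(f) == E)
eval_ok = all(f[x] == x * x for x in E)

# (=_Fnc): same domain and same values imply equality (here, dict equality)
g = {x: x * x for x in E}
ext_ok = (set(f) == set(g) and all(f[x] == g[x] for x in f)) and f == g

print(dom_ok, eval_ok, ext_ok)
```

All three checks hold on this example.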
Formalizing diverse notions in set theory
Set theory will be expanded by additional components (symbols and axioms,
which may either be primitive or obtainable as developments, namely as
definitions), to directly represent diverse kinds of meta-objects as objects:
"definer" symbols will convert these meta-objects into objects. This can
only apply to meta-objects which are "object-like" enough, in a sense yet to
be clarified (as seen in 1.8, not all classes can be sets, while only the functors
whose domains are sets are functions).
More "set-like" classes will be converted into sets (1.11), and indirectly
specified elements will become directly specified (2.4).
Meta-objects behaving as other kinds of objects beyond elements, sets and
functions, will form other notions. But as seen for unary relations (1.8), they
need not be added as primitive notions, as their role can naturally be played by
classes of already present objects: sets or functions, that will offer their expressible
features to the new objects. Beyond definers, more symbols (evaluators) will be
needed to interpret these objects back to their roles as meta-objects.
So the notions of operation, relation and tuple will be formalized in 2.1.
If several ways to represent a new notion by a class of old objects
can be useful, which do not represent each new object by the same old
object, then functors will be needed for conversion between the
classes playing this role in the different ways (2.3, 2.8).
1.11. Set generation principle
Bounded quantifiers give sets their fundamental role as ranges of
bound variables, unknown by the predicate ∈ which only lets them
play a role of classes. Technically, the bounded existential quantifier (∃x∈E, …)
suffices to define the predicate ∈ as
x ∈ E ⇔ (∃y∈E, x = y)
but is not generally definable from it in return by any bounded formula, as
its definition involves an open quantifier.
Philosophically speaking, the perception of a set as a
class (ability to classify each object as belonging to it or not) does not
provide its full perception as a set (the perspective over all its
elements as coexisting).
Set generation principle. For any class C defined by a bounded formula,
if ∀_{C} is found equivalent to the quantifier Q defined by another bounded
formula (whose free variables are the parameters of C and a variable unary
predicate symbol A),
by formally proving ∀A, (∀x, C(x) ⇒ A(x)) ⇔
(Qx, A(x)) by second-order universal introduction (under a given
condition on parameters),
then C is a set that can be named by an operator symbol K
added to the language of set theory, with these parameters as arguments,
and the following axiom which expresses K = C
by double inclusion:
For all values of parameters satisfying the condition,
Set(K) ∧ (∀x∈K, C(x)) ∧ (Qx, x
∈ K).
(The involved second-order universal introduction can accept a proof using an already accepted
axiom schema applied to an expression using A, thus beyond the strict concept of
provability in the first-order version of set theory; I am not sure if this is ever used or not.)
This equivalence between Q and ∀_{C } is equivalently
expressible (see proof of this
latter equivalence) as the following list of 3 statements, where the quantifier Q*
defined by (Q*x, A(x)) ⇎ (Qx,¬A(x))
will be equivalent to ∃_{C}:
(1) ∀x, (C(x) ⇔ Q*y, x=y) (in fact we just need ∀x, C(x) ⇒ Q*y, x=y)
(2) Qx, C(x)
(3) ∀A, ∀B, (∀x, A(x) ⇒ B(x)) ⇒ ((Qx, A(x)) ⇒ (Qx, B(x))).
A list of cases of application of this principle is given by the following table. For
all of these, (1) comes by finding C to be so defined
from Q, while (3) is ensured by the absence of any "negative" occurrence
of A in Q (inside a negation, an equivalence, or left of a ⇒).
This leaves (2) as the remaining nontrivial (but easy) condition to check.
The first column recalls the above generic abbreviations while each other column
gives an effective example of expressions on which it works. The K line gives
the notations for the set theoretical symbols so introduced.
C(x)      │ x∈E ∧ B(x)      │ ∃y∈E, x∈y        │ ∃y∈Dom f, f(y)=x  │ 0 │ x=y  │ x=y ∨ x=z
Qx, A(x)  │ ∀x∈E, B(x)⇒A(x) │ ∀y∈E, ∀x∈y, A(x) │ ∀x∈Dom f, A(f(x)) │ 1 │ A(y) │ A(y) ∧ A(z)
Q*x, A(x) │ ∃x∈E, B(x)∧A(x) │ ∃y∈E, ∃x∈y, A(x) │ ∃x∈Dom f, A(f(x)) │ 0 │ A(y) │ A(y) ∨ A(z)
K         │ {y∈E | B(y)}    │ ⋃E               │ Im f              │ ∅ │ {y}  │ {y,z}
dK        │ ∀x∈E, dB(x)     │ Set(E) ∧ ∀x∈E, Set(x) │ Fnc(f)       │ 1 │ 1    │ 1
For the first of these symbols, which is the set-builder, the unary predicate symbol
B is meant as the abbreviation of any formula which can define it with
parameters, so that the axiom for it given by
the set generation principle (similar to the axiom for functions) can be read as a
use of second-order universal elimination over a ∀B ahead of it.
We first defined in 1.8 the set-builder K = {x∈E | B(x)}
as a class, thus with an open quantifier (∀x, x∈K ⇔
(x∈E ∧ B(x))), but the above shows how to write
this definition by axioms without open quantifier beyond parameters:
Set(K) ∧ (∀x∈K, x∈E ∧ B(x)) ∧ (∀x∈E, B(x) ⇒ x ∈ K)
or more shortly
Set(K) ∧ K ⊂ E ∧ (∀x∈E, x ∈ K ⇔ B(x))
with which the proof of Russell's paradox would be written
∃F, (F = {x∈E | Set(x) ∧ x∉x} ∴ Set(F) ∴ (F∈F ⇎ (Set(F) ∧ F∉F))) ∴ F ∉ E
The functor ⋃ is the union symbol, and its axioms form the axiom
of union.
The set Im f of values of f(x)
when x ranges over Dom f,
is called the image of f.
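The symbols ⋃, Im and the set-builder have direct finite analogues; a Python sketch with illustrative data (E is given as a list, since Python sets of sets would need frozenset):

```python
# ⋃E : union of a set of sets = {x | ∃y∈E, x∈y}
E = [{1, 2}, {2, 3}, {5}]
union = set().union(*E)

# Im f : the set of values of f(x) for x in Dom f
f = {1: 'a', 2: 'b', 3: 'a'}
image = set(f.values())

# set-builder {x ∈ ⋃E | B(x)} as a comprehension, with B(x) : x is odd
odd = {x for x in union if x % 2 == 1}

print(union, image, odd)
```

Here union is {1, 2, 3, 5}, image is {'a', 'b'} and odd is {1, 3, 5}.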
We define the predicate (f : E→F) as
(f : E→F) ⇔ (Fnc(f) ∧ Set(F)
∧ Dom f = E ∧ Im f ⊂ F)
that reads «f is a function from E to F ». A
set F such that Im f ⊂ F
(i.e. ∀x ∈ Dom f, f(x)
∈ F), is called a target set of f.
The more precise (Fnc(f) ∧
Dom f = E ∧ Im f
= F) will be denoted f:E ↠ F (f
is a surjection, or surjective function from E
to F, or function from E onto F).
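The predicates (f : E→F) and (f : E ↠ F) can be sketched for finite functions-as-dicts (the helper names are illustrative, not part of the text's formalism):

```python
def is_function_from(f, E, F):   # f : E → F  (Dom f = E ∧ Im f ⊂ F)
    return set(f) == set(E) and set(f.values()) <= set(F)

def is_surjection(f, E, F):      # f : E ↠ F  (Dom f = E ∧ Im f = F)
    return set(f) == set(E) and set(f.values()) == set(F)

f = {1: 'a', 2: 'b', 3: 'a'}
r1 = is_function_from(f, {1, 2, 3}, {'a', 'b', 'c'})   # True: Im f ⊂ F
r2 = is_surjection(f, {1, 2, 3}, {'a', 'b', 'c'})      # False: 'c' not reached
r3 = is_surjection(f, {1, 2, 3}, {'a', 'b'})           # True: Im f = F
print(r1, r2, r3)
```

Any target set including Im f works for the first predicate; only Im f itself makes f surjective onto it.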
The empty set ∅ is the only set without element, and is
included in any set E (∅ ⊂ E).
Thus, (E=∅ ⇔ E ⊂ ∅ ⇔ ∀x∈E, 0), and
thus (E ≠ ∅ ⇔ ∃x∈E,1).
This constant symbol ∅ ensures the existence of a set; for any set
E we also get ∅ = {x∈E  0}.
That a variable may have empty range, is no obstacle for fixing it. In particular,
developing an inconsistent theory means studying a fixed system whose range
of possibilities is empty. We may actually need to do so in order to discover a
contradiction there, which means to prove that no such system exists.
As (Dom f = ∅ ⇔ Im f = ∅) and
(Dom f = Dom g = ∅ ⇒ f = g), the only
function with domain ∅ is called the empty function.
We can redefine ∃ from the above in two ways: (∃x∈E,
A(x)) ⇔ {x∈E  A(x)} ≠ ∅
⇔ (1 ∈ Im(E∋x ↦ A(x))).
For all x, {x,x} = {x}. Such a set
with only one element is called a singleton.
For all x, y, {x, y} = {y,
x}. If x ≠ y, the set {x, y}
with 2 elements x and y is called a pair.
Proofs of some statements from Part 1.
Equivalence of axioms for functions (1.10)
Recalling the axioms:
Fc. ∀t,E, Fnc (E ∋ x ↦ t(x))
1. ∀t,E, ∀_{Fnc} f, f = (E
∋ x ↦ t(x)) ⇔ (Dom f = E ∧
(∀x∈E, f(x) = t(x)))
1'. ∀t,E, ∀f, f = (E ∋ x ↦ t(x))
⇔ (Fnc f ∧ Dom f = E ∧ ∀x∈E, f(x) = t(x))
2. ∀t,E, Dom(E ∋ x ↦ t(x)) = E
∧ ∀y∈E, (E ∋ x ↦ t(x))(y) = t(y)
(=_{Fnc}) ∀_{Fnc} f, ∀_{Fnc} g,
(Dom f = Dom g ∧ ∀x∈Dom f, f(x)
= g(x)) ⇒ f = g
Splitting the ⇔ in 1. as a conjunction of implications, (1. ⇔ (1a. ∧ 1b.)) where
1a. ∀t,E, ∀_{Fnc} f, f = (E
∋ x ↦ t(x)) ⇒ (Dom f = E ∧
(∀x∈E, f(x) = t(x)))
1b. ∀t,E, ∀_{Fnc} f, (Dom f = E ∧
(∀x∈E, f(x) = t(x))) ⇒
f = (E ∋ x ↦ t(x))
Similarly (1'. ⇔ (1a'. ∧ 1b.)) where
1a'. ∀t,E, ∀ f, f = (E
∋ x ↦ t(x)) ⇒ (Fnc f ∧ Dom f = E ∧
(∀x∈E, f(x) = t(x)))
We can see immediately that 1a'. ⇔ (Fc. ∧ 2.) ⇒ 1a.
(We may write 2. ⇒ 1a. if not careful about definiteness).
Conversely, we can see (1a. ∧ Fc.) ⇒ 2.
Proof of 1b. ⇒ (=_{Fnc})
∀_{Fnc} f, ∀_{Fnc} g,
(Dom f = Dom g ∧ ∀x∈Dom f, f(x)
= g(x)) ⇒ ∃_{Set}E,
Dom f = E = Dom g ∧ (∀x∈E, f(x) = f(x)
= g(x)) ∴ f = (E
∋ x ↦ f(x)) = g.
Proof of (2.∧(=_{Fnc})) ⇒ 1b. : ∀t,E, ∀_{Fnc} f,
(Dom f = E ∧
(∀x∈E, f(x) = t(x)))
⇒ (Dom f =
Dom(E ∋ x ↦ t(x)) ∧ (∀y∈E, f(y) =
(E ∋ x ↦ t(x))(y)))
⇒
f = (E ∋ x ↦ t(x))
Equivalence between expressions of the hypothesis of the set generation principle (1.11)
These 3 properties are already known consequences of «Q=∀_{C
}». Conversely,
((2) ∧ (3)) ⇒ ((∀_{C }x, A(x))
⇒ Qx,A(x))
((1) ∧ ∃_{C }x, A(x)) ⇒ ∃y,
(Q*x, x=y) ∧ (∀x, x=y
⇒ A(x)) ∴ ((3) ⇒ Q*x,A(x))
∎
Philosophical aspects of the foundations of
mathematics
To complete our initiation
to the foundations of mathematics, the following pages of
philosophical complements (from this one to Concepts of truth in
mathematics), will present an overview on some of the deepest
features of the foundations of mathematics: their philosophical and
intuitive aspects (much of which may be implicitly understood but
not well explained by specialists, as such philosophical issues are
not easily seen as proper objects of scientific works). This
includes:
- How, while independent of our time, the universe of mathematics is still subject to a flow of its own time;
- The deep meaning of the difference between sets and classes, in relation to that time; thus, the deep reason for the use of bounded quantifiers in set theory;
- The justification of the set generation principle.
These things are not necessary for Part 2 (Set Theory, continued)
except to explain the deep meaning and consequences of the fact that
the set exponentiation or power set (2.6) is not justifiable by the
set generation principle. But they will be developed and justified
in more details in Part 4 (Model Theory).
Intuitive representation and abstraction
Though mathematical systems «exist» independently of any particular
sensation, we need to represent them in some way (in words, formulas
or drawings). Diverse ways can be used, that may be equivalent
(giving the same results) but with diverse degrees of relevance
(efficiency) that may depend on purposes. Ideas usually first appear
as more or less visual intuitions, then are expressed as formulas
and literal sentences for careful checking, processing and
communication. To be freed from the limits of a specific form of
representation, the way is to develop other forms of representation,
and exercise to translate concepts between them. The mathematical
adventure itself is full of plays of conversions between forms of
representation.
Platonism vs Formalism
In this diversity of approaches to mathematics (or each theory), two
philosophical views are traditionally distinguished.
- The Platonic view (also called idealistic) focuses on the worlds or systems to study, seen as preexisting mathematical realities to be explored (or remembered, according to Plato). This is the approach of intuition that smells the global order of things before formalizing them.
- The formalistic view focuses on language, rigor (syntactic rules) and dynamical aspects of a theory, starting from its foundation (formal expression), and following the rules of development.
Philosophers usually present them as opposite, incompatible belief
systems, candidate truths on the real nature of mathematics.
However, both views instead turn out to be necessary and
complementary aspects of math foundations. Let us explain how.
Of course, human thought, having no infinite abilities, cannot
fully operate in any realistic way, but only in a way roughly
equivalent to formal reasonings developed from some foundations;
this work of formalization can prevent the possible errors of
intuition.
But a purely formalistic view cannot hold either because:
- The clarity and self-sufficiency of any possible foundation (any starting position with formal development rules) remain relative: any starting point had to be chosen more or less arbitrarily, taken from and motivated by a larger perspective over mathematical realities; it must be defined in some intuitive, presumably meaningful way, implicitly admitting its own foundation, since any try to specify the latter would lead to a path of endless regression, whose realistic preexistence would have to be admitted.
- Most of the time, works are only partially formalized: we use semi-formal proofs, with just enough rigor to give the feeling that a full formalization is possible, yet not fully written; an intuitive vision of a problem may seem clearer than a formal argument.
Another reason for their reconciliation is that they are not in any
global dispute to describe the whole of mathematics; rather, their
shares of relevance depend on the specific theories under
consideration.
Realistic vs. axiomatic theories in mathematics and other
sciences
Interpretations of the word «theory» may vary between mathematical
and non-mathematical uses (in ordinary language and other sciences),
in two ways.
Theories may differ by their object and nature:
- Pure mathematical theories are mathematical theories considered for the pure sake of mathematics, without any non-mathematical intentions. On the contrary, theories outside mathematics try to describe some real systems (fields of observation, parts of the outside world, that are not purely mathematical systems). They may be of 2 sorts:
- Applied mathematical theories are also mathematical theories (i.e. expressed in rigorous ways) but the mathematical systems they describe are conceived as idealizations of aspects of given real-world systems (neglecting other aspects); insofar as it is accurate, this idealization (reduction to mathematics) also allows for correct deductions within accepted margins of error.
- Non-mathematical theories describe qualitative (non-mathematical) aspects of the world. For example, usual descriptions of chemistry involve drastic approximations, recollecting from observations some seemingly arbitrary effects whose deduction from quantum physics is usually out of reach of direct calculations.
Theories may also differ by whether Platonism or formalism best
describes their intended meaning :
A realistic theory aims to
describe a fixed system given from an independent reality,
so that any of its ground formulas (statements) will be either
definitely true or definitely false as determined by this system
(but the truth of a nonmathematical statement may be ambiguous,
i.e. illdefined for the given real system). From this intention,
the theory will be built by providing an initial list of
formulas called axioms : that is a hopefully true
description of the intended system as currently known or guessed.
Thus, the theory will be true if all its axioms are indeed true on
the intended system. In this case, its logical consequences
(theorems, deduced from axioms) will also be true on the intended
model.
This is usually well ensured in pure maths, but may be speculative
in other fields. In realistic theories outside pure mathematics,
the intended reality is usually a contingent one among alternative
possibilities, that (in applied mathematics) are equally possible
from a purely mathematical viewpoint. If a theory (axioms list)
does not fit a specific reality that pure mathematics cannot
suffice to identify, this can be hopefully discovered by comparing
its predictions (logical consequences) with observations : the
theory is called falsifiable.
An axiomatic theory is a theory given with an axioms
list that means to define the selection of its models
(systems it describes), as the class of all systems where these
axioms are true. This may keep an unlimited diversity of models,
that remain equally real and legitimate interpretations. By this
definition of what «model» means, the truth of the axioms of the
theory is automatic on each model (it holds by definition and is
thus not questionable). All theorems (deduced from axioms) are
also true in each model.
In pure mathematics, the usual features of both possible roles of
theories (realistic and axiomatic) automatically protect them from
the risk to be «false» as long as the formal rules are respected.
Non-realistic theories outside pure mathematics (where the
requirement of truth of theorems is not always strict, so that the
concept of axiom loses precision) may either describe classes of
real systems, or be works of fiction describing imaginary or
possible future systems. But this distinction between real and
imaginary systems does not exist in pure mathematics, where all
possible systems are equally real. Thus, axiomatic theories of pure
mathematics aim to describe a mathematical reality that is existing
(if the theory is consistent) but generally not unique.
Different models may be non-equivalent, in the sense that
undecidable formulas may be true or false depending on the model.
Different consistent theories may «disagree» without conflict, by
being all true descriptions of different systems, that may equally
«exist» in a mathematical sense without any issue of «where they
are».
For example Euclidean geometry, in its role of first physical
theory, is but one in a landscape of diverse geometries that are
equally legitimate for mathematics, and the real physical space is
more accurately described by the non-Euclidean geometry of General
Relativity. Similarly, biology is relative to a huge number of
random choices silently accumulated by Nature on Earth during
billions of years.
Realistic and axiomatic theories both appear in pure mathematics, in
different parts of the foundations of mathematics, as will be
presented in the section on the truth concepts in
mathematics.
But let us first explain the presence of a purely mathematical flow
of time (independent of our time) in model theory and set theory.
Time in model theory
The time order between interpretations of expressions
Given a model, expressions do not receive their interpretations all
at once, but only the ones after the others, because these
interpretations depend on each other, thus must be calculated after
each other. This time order of interpretation between expressions,
follows the hierarchical order from subexpressions to expressions
containing
them.
Take for example, the formula xy+x=3. In order for it
to make sense, the variables x and y must take a
value first. Then, xy takes a value, obtained by multiplying
the values of x and y. Then, xy+x
takes a value based on the previous ones. Then, the whole formula
(xy+x=3) takes a Boolean value (true or false).
But this value depends on those of the free variables x and
y. Finally, taking for example the ground formula ∀x,
∃y, xy+x=3, its Boolean value (which is false
in the world of real numbers), «is calculated from» those taken by
the previous formula for all possible values of x and y,
and therefore comes after them.
A finite list of formulas in a theory may be interpreted by a single
big formula containing them all. This only requires to successively
integrate (or describe) all individual formulas from the list in the
big one, with no need to represent formulas as objects (values of a
variable). This big formula comes (is interpreted) after them all,
but still belongs to the same theory. But for only one formula to
describe the interpretation of an infinity of formulas (such as all
possible formulas, handled as values of a variable), would require
to switch to the framework of one-model theory.
The metaphor of the usual time
I can speak of «what I told about at that time»: it has a sense if
that past saying had one, as I got that meaning and I remember it.
But mentioning «what I mean», would not itself inform on what it is,
as it might be anything, and becomes absurd in a phrase that
modifies or contradicts this meaning («the opposite of what I'm
saying»). Mentioning «what I will mention tomorrow», even if I knew
what I will say, would not suffice to already provide its meaning
either: in case I will mention «what I told about yesterday» (thus
now) it would make a vicious circle; but even if the form of my
future saying
ensured that its meaning will exist tomorrow, this would still not
provide it today. I might try to speculate on it, but the actual
meaning of future statements will only arise once actually expressed
in context. By lack of interest to describe phrases without their
meaning, we should rather restrict our study to past expressions,
while just "living" the present ones and ignoring future ones.
So, my current universe of the past that I can describe today,
includes the one of yesterday, but also my yesterday's comments
about it and their meaning. I can thus describe today things outside
the universe I could describe yesterday. Meanwhile I neither learned
to speak Martian nor acquired a new transcendental intelligence, but
the same language applies to a broader universe with new objects. As
these new objects are of the same kinds as the old ones, my universe
of today may look similar to that of yesterday; but from one
universe to another, the same expressions can take different
meanings.
Like historians, mathematical theories can only «at every given
time» describe a universe of past mathematical objects, while this
interpretation itself «happens» in a mathematical present outside
this universe.
Even if describing «the universe of all mathematical objects»
(model of set theory), means describing everything, this
«everything» that is described, is only at any time the current
universe, the one of our past ; our act of interpreting
expressions there, forms our present beyond this past. And then,
describing our previous act of description, means adding to this
previous description (this «everything» described) something else
beyond it.
The infinite time between models
As a one-model
theory T' describes a theory T with a model M,
the components (notions and structures)
of the model [T,M] of T' actually fall into 3
categories:
- The components of T and its developments as a formal
system (abstract types, structure symbols, expressions, axioms,
proofs from axioms), which aim to describe the model but remain
outside it and independent of it.
- The components of M (interpretations of types and
structure symbols).
- The interpretation (attribution of values) of all expressions
of T in M, for any values of their free
variables.
This last part of [T,M] is a mathematical construction
determined by the combination of both systems T and M
but it is not directly contained in them : it is built after
them.
So, the model [T,M] of T', encompassing the
present theory T with the interpretation of all its formulas
in the present universe M of past objects, is the next universe
of the past, which will come once the infinity of all current
interpretations (in M) of formulas of T has become
past.
Or can it be otherwise ? Would it be possible for a theory T
to express or simulate the notion of its own formulas and
compute their values ?
Truth undefinability
As explained
in 1.7, some theories (such as model theory, and set theory
from which it can be developed) are actually able to describe
themselves: they can describe in each model a system looking like
a copy of the same theory, with a notion of "all its formulas"
(including objects that are copies of its own formulas). However
then, according to the Truth Undefinability
theorem, no single formula (invariant predicate) can ever
give the correct boolean values to all object copies of ground
formulas, in conformity with the values of these formulas in the
same model.
A
strong and rigorous proof will be given later. Here is an easy one.
The Berry paradox
This famous paradox is the idea of "defining" a natural number n
as "the smallest number not definable in less than twenty words".
This would define in 10 words a number... not definable in less than
20 words. But this does not bring a contradiction in mathematics
because it is not a mathematical definition. By making it more
precise, we can form a simple proof of the truth undefinability
theorem (but not a fully rigorous one):
Let us assume a fixed choice of a theory T describing a set
ℕ of natural numbers as part of its model M.
Let H be the set of formulas of T with one free
variable intended to range over this ℕ, and shorter than (for example)
1000 occurrences of symbols (taken from the finite list of symbols
of T, logical symbols and variables).
Consider the formula of T' with one free variable n
ranging over ℕ, expressed as
∀F∈H, F(n) ⇒
(∃k<n, F(k))
This formula cannot be false on more than one number per formula in
H, which are only finitely many (an explicit bound of their number
can be found). Thus it must be true on some numbers.
If it were equivalent to some formula B∈H, we would get
∀n∈ℕ, B(n) ⇔ (∀F∈H,
F(n) ⇒ (∃k<n, F(k)))
⇒ (∃k<n, B(k))
contradicting the existence of a smallest n on which B is true.
The number 1000 was picked in case translating this formula into
T was complicated, ending up in a big formula B, but still in
H. If it were so complicated that 1000 symbols did not suffice, we could
try this reasoning starting from a higher number. Since the existence of an
equivalent formula in H would anyway lead to a contradiction,
no number we might pick can ever suffice to find one. This shows the
impossibility of translating such formulas of T' into equivalent formulas of
T by any method much more efficient than the kind of mere
enumeration suggested above.
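The finiteness argument at the core of this reasoning can be made concrete in a toy setting. The following sketch uses a hypothetical mini-language (far simpler than the formulas of T: just the digits 1 to 9 combined by + and *, an illustrative assumption rather than the theorem's formal setting) to enumerate all values definable with at most a given number of symbols, and finds the smallest number not definable there:

```python
# Toy Berry diagonal: "definable" here means "value of an expression built
# from the digits 1..9 and the operators + and * using at most K symbols"
# (a hypothetical mini-language, not the formal setting of the theorem).
from itertools import product

DIGITS = "123456789"
OPS = "+*"

def definable_numbers(max_symbols):
    """Values of all well-formed expressions with at most max_symbols tokens
    (expressions alternate digits and operators, so lengths are odd)."""
    values = set()
    for length in range(1, max_symbols + 1, 2):
        n_digits = (length + 1) // 2
        for digits in product(DIGITS, repeat=n_digits):
            for ops in product(OPS, repeat=n_digits - 1):
                expr = digits[0] + "".join(o + d for o, d in zip(ops, digits[1:]))
                values.add(eval(expr))  # usual precedence of + and *
    return values

K = 5
defined = definable_numbers(K)
# The smallest number not definable with at most K symbols: this very
# description is far shorter than any expression of the language could be.
berry_n = next(n for n in range(1, 1000) if n not in defined)
```

Only finitely many expressions fit the bound, so such a number exists; the paradox dissolves because «definable in the mini-language» is not itself expressible in the mini-language, just as the truth of formulas of T is not definable in T.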
This infinite time between theories will develop as an endless
hierarchy of infinities.
- Achilles runs after a turtle; whenever he crosses the distance to
it, the turtle takes a new length ahead.
- Seen from a height, a vehicle gone on a horizontal road approaches
the horizon.
- Particles are sent in accelerators closer and closer to the speed of
light.
Can they reach their ends ?
Each example can be seen in two ways:
- the «closed» view sees a reachable end;
- the «open» view ignores this end, but only sees the movement
towards it, never reaching it.
And in each example, a physical measure of the «cost» to
approach and eventually reach the targeted end, decides its «true»
interpretation, according to whether this cost would be finite or
infinite, which may differ from the first guess of a naive
observer.
But the world of mathematics, free from all physical costs and
where objects only play conventional roles, can accept both
interpretations.
Each generic theory is «closed», as it can see its model (the ranges
of its variables) as a whole (that is a set in its set theoretical
formulation): by its use of binders over types (or classes), it
«reaches the end» of its model, and thus sees it as «closed». But
any possible framework for it (one-model theory and/or set theory)
escapes this whole.
As explained in 1.7, set theory has multiple possible models : from
the study of a given universe of sets, we can switch to that of a
larger one with more sets (that we called metasets), and new
functions between the new sets.
As this can be repeated endlessly, we need an «open» theory
integrating each universe described by a theory, as a part (past) of
a later universe, forming an endless sequence of growing realities,
with no view of any definite totality. This role of an open theory
will be played by set theory itself, with the way its expressions only bind
variables on sets (1.8).
Time in set theory
The expansion of the set theoretical universe
Given two universes U⊂U', the universe U will
be called standard in U', or a subuniverse of U',
if its interpretation of set theoretical structures (values they give for all values
of arguments inside U) coincides with their metainterpretation (that of
U'). Precisely, let us require the preservation of the following data:
- The content of every set (every set coincides with the metaset of the
same elements): not only ∈ is preserved, but every set in U is
interpreted by U' as included in U (this is not a direct
consequence: if not requiring the subuniverse to be a set nor a class,
a counterexample would be given by internal set theory).
Therefore the set
builder is also preserved (then translated in 1.11 into an
infinite list of operator symbols in first-order logic).
- The function evaluator and the domain functor (1.4). Therefore the function
definer is also preserved (translated in 1.10 into an infinite
list of operator symbols).
- Tuples (2.1).
- The powerset (2.5).
- Thus also, any other structure defined from the above by expressions and
bounded formulas, such as finiteness (definable
from the powerset): there actually exist nonstandard universes with a
different interpretation of finiteness, thus having as «set of integers» a nonstandard
model of arithmetic.
Thus, as structures in U' are fixed, U only needs to be
specified as a metasubset or class in U'. Let us call it a small
subuniverse if it is more precisely a set (U∈U').
If we have 3 universes U ⊂ U' ⊂ U"
where U' is standard in U", then we
have the equivalence:
(U is standard in U') ⇔ (U
is standard in U").
Hence the idea of considering the standardness of a universe as an
absolute property, independent of the external universe in which it
is described... provided that this external universe is itself standard.
This does not formally define standardness as an absolute concept
(which is impossible), but suggests that such a concept
may make ideal sense.
Let us call standard multiverse any collection (range) of
standard universes, where any 2 of them are small subuniverses
of a third one. Let us say that the set theoretical universe
expands when it ranges over a standard multiverse.
From any standard multiverse M, we can rebuild an
external universe containing all its universes, defined as their union
𝒰 = ⋃M, where they are all standard. Indeed, any
expression with free variables in 𝒰 takes its meaning from at
least one U∈M containing the values of all these
variables, and thus where the expression can be interpreted. This
𝒰 is still another specific standard universe, but it cannot
itself belong to M, as its presence there would contradict the
concept of multiverse, which does not admit any biggest element. So,
no single standard multiverse can ever contain all possible standard
universes.
We can understand the intended sense of set theory as of a different
kind than that of generic theories, in this way :
- Interpretations of a generic theory require fixing a model, in
which its variables and expressions may take values;
- Set theory aims to locally give to its expressions with fixed values
of the free variables their standard interpretation, independently of
a standard universe containing them, which may keep expanding.
Unlike standard universes, not all nonstandard universes can be small
subuniverses of some larger universe (itself nonstandard),
as extensions may be unable to preserve the power set operators.
We may also have a multiverse of nonstandard universes looking like
a standard multiverse (its members are small subuniverses of each
other), but whose union 𝒰 (with the same structures, letting
these universes appear standard) no more behaves as
a good universe, as it no more satisfies the axiom
schema of specification over formulas with open
quantifiers (which is a necessary quality for a universe to be a small
subuniverse of another one).
Namely, there may be a set E∈𝒰
and a formula A such that {x∈E | ∀_𝒰 y,
A(x,y)} ∉ 𝒰.
Two universes will be called compatible if they can both
be seen as subuniverses of a common larger universe. All
standard universes are compatible with each other. So when 2
universes are incompatible, at least one of them is nonstandard ;
they may be both parts of a common larger universe, only by
representing there at least one of them as nonstandard.
Can a set contain itself ?
Let us call a set reflexive if it contains itself.
By the proof
of Russell's paradox [1.8], the class (Set(x) ∧ x
∉ x) of nonreflexive sets, cannot coincide with a subset F
of any set E, since this F would then be a nonreflexive set
outside E, giving a contradiction. But can
reflexive sets exist ? This is undecidable; here is why.
From a universe containing reflexive sets, we can just remove them all:
these sets cannot be rebuilt from the data of their elements
(since each one has at least an element removed from the universe,
namely itself), but for the rest to still constitute a universe
(model of set theory), we need to manage the case of all other sets
that contained one (and similarly with functions):
- Either reduce them to pure elements;
- Or eliminate them too (and so on for other sets that contained
the latter).
Another way is to progressively rebuild the universe while avoiding them:
each set appears at some time, formed as a collection of previously
accepted or formed objects. Any set formed this way, must have had
a first coming time: it could not be available yet when it first
came as a collection of already available objects, thus it cannot be reflexive.
As the reflexivity of a set is independent of context, a union of
universes each devoid of reflexive sets, cannot contain one either.
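The stage-by-stage rebuilding described above can be mimicked in miniature. A minimal sketch (using Python frozensets as a stand-in for sets of sets, an illustrative device only): starting from the empty universe, each stage collects all sets of previously formed objects, so nothing formed this way can contain itself.

```python
# Build the first levels of the cumulative hierarchy (hereditarily finite
# sets) as frozensets: V_0 = empty, V_{k+1} = all subsets of V_k.
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite collection s, each as a frozenset."""
    elems = list(s)
    return {frozenset(c)
            for c in chain.from_iterable(combinations(elems, r)
                                         for r in range(len(elems) + 1))}

def stage(n):
    """Return V_n: the sets available after n construction steps."""
    v = set()
    for _ in range(n):
        v = powerset(v)
    return v

V4 = stage(4)  # 16 sets, each formed only from previously formed ones
reflexive = [x for x in V4 if x in x]  # sets containing themselves: none
```

Every member of V4 had a first coming time, as a collection of already available objects; accordingly `reflexive` is empty by construction.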
On the other hand, universes with reflexive sets can be created as
follows:
Riddle. What is the difference between
- a universe with a pure element x and a set y
such that x ∈ y ∧ y ∉ y,
- and a universe with a set x and a pure element y
such that x ∈ x ∧ y ∉ x ?
Answer: the role of the set containing x but not y,
played by y in the former universe, is played by x
in the latter.
The absence of reflexive sets is a special case of the axiom of
foundation (or regularity), to which the above arguments of
undecidability will naturally extend. Its formulation will be based
on the concept of well-founded
relation. But this axiom is just as
useless as the sets it excludes.
When the universe expands, the values of statements (firstorder
formulas, admitting open quantifiers) may change.
Of course, if a statement is formally provable from given axioms
then it remains true in all universes satisfying these axioms;
similarly if it is refutable (i.e. its negation is provable, so
that it is false in all universes). But set theory does not give sense
(a Boolean value) to undecidable ground
statements, and similar nonground statements (with open quantifiers
and given values of free variables), as any given value would be relative
to how things go «here and now» : if a universal statement (∀x,
A(x) for a bounded formula A) is true «here», it
might still become false (an x where A is false might be
found) «elsewhere».
But if the value of an indefinite statement is relative to how things go
«here», then the actual variability of this value between places (to
motivate its indefiniteness status) remains relative to how things turn
out to go «elsewhere». Namely, it is relative to a given range of
possible coexisting «places» (universes) where the statement may
be interpreted, that is a multiverse. But to coexist, these universes
need the framework of a common larger universe 𝒰 containing
them all. In fact, the mere data of 𝒰 suffices to essentially
define a multiverse as that of «all universes contained in 𝒰».
Or rather, 2 multiverses,
depending on whether we admit all universes it contains, or only the
standard ones.
A standard multiverse, as defined above. There, the
variability of an existential statement (∃x, A(x))
for a bounded formula A means the existence of universes
U, U' ⊂ 𝒰 such that ∀x∈U,
¬A(x) but ∃x∈U', A(x).
That is, A(x) only holds at some x outside
U. We can get a U' such that U∈U'
by taking any universe containing both U and the old U'.
In particular, (∃x, A(x)) is also true in 𝒰
(we may call this statement «ultimately true»). Intuitively, the
x where A is true are out of reach of the theory : they
cannot be formally expressed by terms, and their existence cannot be
deduced from the given existence axioms (satisfied by U).
But since (∃x, A(x)) was not definitely true
for the initially considered universe U, actually unknown and
expanding, its chances may be poor to become definitely true for 𝒰,
which is just another axiomatically described universe,
unknown as well. So, when a statement aimed for U is indefinite, it
may be varying when U expands, but it may also be that the
very question whether it indeed varies (which can be translated as a
question on 𝒰) remains itself an indefinite question as
well. Just more truths can be determined for 𝒰 than for U,
by giving more axioms to describe 𝒰 than we gave for U.
The incompleteness
theorem will imply that a formalization of this description of 𝒰
(as the union of a standard multiverse, whose universes satisfy given axioms)
is already such a stronger axiomatization, but also that neither this nor any
other axiomatic theory trying to describe 𝒰 (as some kind of
ultimate standard universe) can ever decide (prove or refute) all ground
statements in 𝒰; in particular, the question of the variability of a ground
statement in the expanding U cannot always be decided either.
A multiverse of «all universes», no matter whether they are
standard or not. The completeness
theorem will show that for any generic theory, the interpretation
of the indefiniteness of a ground formula as the variability of its boolean
value in this multiverse coincides with its formal
undecidability. Strange things may happen there for an undecidable
(∃x, A(x)), as the universes where it is true
and those where it is false may be incompatible :
- A universe U where it is true might not contain any of those
where it is false as a subuniverse : {x∈U |
¬A(x)} may not have any metasubset behaving as a
good universe;
- A universe where it is false might not be a small subuniverse
of any other one where it is true;
- It might be only true in nonstandard universes.
Intuitively, different possible universes with different properties
do not necessarily "follow each other" in time, but can belong to
separate and incompatible growth paths, some of which may
be considered more realistic than others.
So, while the formal undecidability of a ground statement makes
it automatically variable in any "multiverse of all universes", this still
does not say how it goes for standard multiverses. In conclusion,
the indefiniteness of statements should only be treated by avoidance,
as a mere expression of ignorance towards the range of acceptable
universes, partially selected by axioms, where they may be interpreted.
Interpretation of classes
Classes in an expanding universe
Unlike objects which can be compared by an equality symbol (that can be
used in formulas), the metarelation of equality between classes is as
indefinite as the open ∀ since both concepts are definable from each other:
- The equality of classes A and B would be
defined by ∀x, A(x) ⇔ B(x);
- The statement (∀x, A(x))
means A = 1 (the universe, class of all objects).
Like with open quantifiers, this indefiniteness leaves us with both concepts
of provable equality (or proven equality), and provable inequality,
according to whether the statement of this equality (∀x,
A(x) ⇔ B(x)) is provable or refutable.
Each universe U interprets each class C as a metaset of
objects P = {x∈U | C(x)}, and sees it
as a set when P ∈ U. This condition, that C has
the same elements in U as an object (set) P also in
U, is expressed by set theory in U as
∃P, ∀x, C(x) ⇔ x ∈ P
or equivalently
∃E, ∀x, C(x) ⇒ x ∈ E
since from such an E we can restore
P as P = {x∈E | C(x)}.
Otherwise (if P ∉ U), this P comes to
existence with U and will exist as a set in
future universes (those which see U as a set).
From the perspective of an expanding universe, a class
C (given as a formula with parameters) «is a set» (equal to P) if the
part P = {x∈U | C(x)} that this formula
defines in each U (formally depending on U)
turns out to remain constant (the same set) during the expansion of U.
More precisely, it is known as a set
(proven equal to P) if we could prove this independence,
i.e. refute the possibility for any object x outside the current universe
(but existing in a larger universe),
to ever satisfy C(x). On the other hand, a class C is
not considered a set, if it remains eventually able to contain «unknown»
or «not yet existing» objects (in another universe), that would belong to some
future value of P, making P vary during the growth of U.
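This criterion (constancy of P under expansion) can be pictured with a deliberately naive model, where «universes» are growing finite ranges of numbers and a «class» is just a predicate; the names and the choice of predicates below are illustrative assumptions, not part of the theory:

```python
# Toy picture of classes in an expanding universe: P = {x in U | C(x)}.
def part(universe, C):
    """The part of the class C in the universe U."""
    return {x for x in universe if C(x)}

# An "expanding universe": each stage strictly contains the previous one.
universes = [set(range(n)) for n in (10, 100, 1000)]

is_small = lambda x: x < 5       # its part stays constant: behaves as a set
is_even = lambda x: x % 2 == 0   # its part keeps growing: behaves as a proper
                                 # class, still able to contain "not yet
                                 # existing" objects at every stage

small_parts = [part(U, is_small) for U in universes]
even_parts = [part(U, is_even) for U in universes]
```

Here `small_parts` is the same set at every stage, while `even_parts` changes at each expansion; in this picture `is_even` would never be «known as a set».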
In a given expansion of U, the interpretation of the formal condition
for C to be a set (∃E, ∀x, C(x) ⇒ x
∈ E) in the union 𝒰 of these U,
means that in this growth, «there is a time after which P will
stay constant». Compared to our last criterion to distinguish sets among
classes in an expanding universe (the constancy of P), this one ignores any past
variations, to focus on the latest ones (those between the largest universes,
with size approaching that of the 𝒰 where it is interpreted).
But the ideally intended standard multiverse, that is the range of U,
would have to be itself a class instead of a set. Thus, both perspectives
(a constant vs. a variable universe) alternatively encompass each other,
endlessly along the expansion. Meanwhile, a
class defined by some special formula may alternatively gain and
lose the status of set ; but if in some growth range,
«P perpetually alternates between variability and constancy», then it
would ultimately not be constant there, thus C would not be
a set. So the alternation of its status would end... if we stopped checking
it at wrong places. But how ?
Concrete examples
A set: Is there any dodo left on Mauritius ? As this
island is well known and regularly visited since their supposed
disappearance, no surviving dodos could still have gone unnoticed,
wherever they may hide. Having not found any, we can conclude
there are none. This question, expressed by a bounded quantifier,
has an effective sense and an observable answer.
A set resembling a class: Bertrand Russell raised this
argument about theology: «If I were to suggest that between
the Earth and Mars there is a china teapot revolving about the
sun..., nobody would be able to disprove my assertion [as] the
teapot is too small to be revealed even by our most powerful
telescopes. But if I were to go on to say that, since my assertion
cannot be disproved, it is intolerable presumption on the part of
human reason to doubt it, I should rightly be thought to be
talking nonsense.» The question is clear, but on a too large space,
making the answer practically inaccessible. (An 8m telescope has
a resolution power of 0.1 arcsec, that is 200m on the moon surface)
A class: the extended statement, «there is a teapot orbiting
some star in the universe» loses all meaning: not only the size of
the universe is unknown, but Relativity theory
sees the remote events from which we did not receive light yet, as
not having really happened for us yet either.
A metaobject: how could God «exist», if He is a
metaobject, while «existence» can only qualify
objects? Did apologists properly conceive their own thesis
on God's «existence» ? But what are the objects
of their faith and their worship ? Each monotheism rightly
accuses each other of only worshiping objects (sin of idolatry):
books, stories, beliefs, teachings, ideas, attitudes, feelings,
places, events, miracles, healings, errors, sufferings, diseases,
accidents, natural disasters (declared God's Will), hardly more
subtle than old statues, not seriously checking (by fear of God)
any hints of their supposed divinity.
A universal event: the redemptive sacrifice of the Son of
God. Whether it would have been theologically equivalent for it to
have taken place not on Earth but in another galaxy or in God's
plans for the Earth in year 3,456, remains unclear.
Another set reduced to a class... The class F of
girls remains incompletely represented by sets: the set of those
present at that place and day, those using this dating site and
whose parameters meet such and such criteria, etc. Consider the
predicates B of beauty in my taste, and C of
suitability of a relationship with me. When I try to explain that
«I can hardly find any pretty girl in my taste (and they are often
unavailable anyway)», i.e.
(∀x∈F, C(x) ⇒ B(x)) ∧ {x∈F | B(x)} ≈ Ø,
the common reaction is: «Do you think that beauty is the only thing
that matters ?», i.e.
What, (∀x ∈ F, C(x) ⇔ B(x)) ????
then «If you find a pretty girl but stupid or with bad character,
what will you do ?». Formally : (∃x ∈ F, B(x)
⇏ C(x) !!!). And to conclude with a claim of pure
goodness: «I am sure you will find !», (∃x ∈ F,
C(x)). Not forgetting the necessary
condition to achieve this: «You must change your way of thinking».
... by the absence of God...: F would have immediately
turned into a set by the existence of anybody on Earth able to
receive a message from God, as He would obviously have used this
chance to make him email me the address of my future wife (or the other
way round).
... and of any substitute: a free, open and efficient online
dating system, as would be included in my project trustforum.net, could give the
same result. But this requires finding programmers willing to
implement it. But the class of programmers is not a set either,
especially as the purpose of the project would conflict with the
religious moral priority of saving God's job from competitors
so as to preserve His salary of praise.
Let Q* be the abbreviation, as a quantifier symbol,
of a bounded formula that uses an extra unary predicate symbol.
Assume that ¬(Q*y, 0), and let C(x)
be defined as (Q*y, y = x).
The hypothesis of the set generation principle means that
we have a proof of (Q* ⇔ ∃_{C}).
Let E be the range of all values ever taken by the
argument y of R(y) when interpreting
Q*y, R(y). This range may depend
on implicit parameters of Q*y but does not
depend on R. It must be a set because this formula
only has definite, fixed means (variables bound to given sets,
fixed parameters) to provide these values. We may also take
as E another set that includes this range, such as any fixed
universe (seen as a set in a larger universe) containing the values
of all parameters, so that the formula can be interpreted there.
For any x, the value C(x)
of Q* on the predicate (y ↦ (y = x))
can only differ (be true) from its (false) value on (y ↦ 0)
if both predicates differ inside E, i.e. if x
belongs to E :
C(x) ⇔ ((Q*y, y = x) ⇎ (Q*y, 0))
⇒ (∃y∈E, (y = x) ⇎ 0) ⇔ x ∈ E
thus C is a set. ∎
Classes satisfying the condition of the set generation principle
are indirectly as usable as sets in the role of domains of
quantifiers; but are they also indirectly as usable as sets in the role
of domains of functions (before using this principle) ? Namely, is
there for each such class a fixed formalization (bounded formulas
with limited complexity) playing the roles of definers and
evaluators for functions having these classes as domains ? The
answer would be yes, but we shall not develop the justifications here.
Concepts of truth in mathematics
Let us review 4 distinct concepts of «truth» for a mathematical
formula, from the simplest to the most subtle.
We first saw the relative truth, that is the value of a
formula interpreted in a supposedly given model (like an implicit
free variable, ignoring any difficulty to specify any example). In
this sense, a given formula may be as well true or false depending
on the model, and on the values of its free variables there.
Provability
Then comes the quality of being relatively true in all models of a given
axiomatic theory, which coincides with provability in this theory,
i.e. deduction from its axioms by the rules of first-order logic. Namely,
there are known formal systems of proof for first-order
logic, with known proof verification algorithms, universally applicable to
any first-order theory while keeping this quality (the ability to prove exactly
all universally true formulas).
This equivalence between universal truth (in all models) and provability,
is ensured by the completeness
theorem, which was originally Gödel's
thesis. It gives to the concepts of «proof», «theorem» and «consistency»
(vague a priori concepts from intuition) perfectly suitable mathematical
definitions, and practically eliminates any object of disagreement between realism
(Platonism) and formalism for first-order logic.
This remarkable property of first-order logic, together with the
fact that all mathematics is expressible there (what is not
directly there can be made so by insertion into set theory, itself
formalized as a first-order theory), gives this framework a
central importance in the foundations of mathematics.
The proof of the completeness theorem, first expressed as the
existence of a model of any consistent first-order theory, goes
by constructing such models out of the infinite set of all
ground expressions in a language constructed from the theory
(the language of the theory plus more symbols extracted from its
axioms). As the set of all ground expressions in a language can
itself be constructed from this language together with the set ℕ
of natural numbers, the validity of this theorem only depends on the
axiom of infinity, that is the existence of ℕ as an actual infinity,
sufficient for all theories (ignoring the diversity of infinities in set
theory).
However, these are only theoretical properties, assuming a
computer with unlimited (potentially infinite) available time and
resources, able to find proofs of any size. Not only the precise size of
a proof may depend on the particular formalism, but even some
relatively simple formulas may only have «existing» proofs that «cannot be
found» in practice as they would be too long, even bigger than the number of atoms
in the visible physical Universe (as illustrated by Gödel's
speedup theorem).
Within limited resources, there may be no way to distinguish whether a
formula is truly unprovable or a proof has only not yet been found.
To include their case, the universal concept of provability (existence of a proof)
has to be defined in the abstract. Namely, it can be expressed as a formula of first-order arithmetic
(the first-order theory of natural numbers with operations of
addition and multiplication), made of one existential quantifier
that is unbounded in the sense of arithmetic (∃p∈ℕ, ...),
where p is an encoding of the proof, and inside it a formula
where all quantifiers are bounded, i.e. with finite
range (∀x < (...), ...), expressing a verification of this proof.
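The shape of this definition (one unbounded ∃ over a bounded verification) can be illustrated on a toy formal system, unrelated to arithmetic's actual encoding: the axiom «I» and a single rule appending «II». All names here are hypothetical; checking a given proof is a bounded task, while provability is the unbounded search for one.

```python
# Toy illustration of "provable(s) = there exists p such that verify(p, s)":
# verify performs only bounded checks on a candidate proof p.
def verify(proof, s):
    """A proof of s is a list of strings: the axiom "I" first, each later
    step appending "II" to the previous one, ending exactly at s."""
    if not proof or proof[0] != "I" or proof[-1] != s:
        return False
    return all(proof[i + 1] == proof[i] + "II" for i in range(len(proof) - 1))

def provable_within(s, max_steps):
    """Bounded stand-in for the unbounded ∃p: search proofs up to max_steps.
    In general a negative answer here refutes nothing; a longer proof might
    still exist beyond the bound."""
    proof = ["I"]
    for _ in range(max_steps):
        if verify(proof, s):
            return True
        proof.append(proof[-1] + "II")
    return verify(proof, s)
```

Here `provable_within("IIIII", 10)` succeeds while `provable_within("II", 10)` fails; only the special linear structure of this toy system makes the failure conclusive, which is exactly what general provability lacks.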
However, once given an arithmetical formula known to be a correct
expression of the provability predicate (while all such formulas are
provably equivalent to each other), it still needs to be interpreted.
Arithmetic truths
This involves the third concept of mathematical
truth, that is the realistic
truth in first-order arithmetic. This is the ideally meant
interpretation of arithmetic: the interpretation of ground formulas
of first-order arithmetic in «the true set ℕ of all, and only all,
really finite natural numbers», called the standard model of
arithmetic. But any axiomatic formalization of arithmetic in
first-order logic is incomplete, in both following senses of the question:
- Due to the incompleteness
theorem, the set of all real truths of arithmetic (formulas
true in ℕ) cannot be exhaustively produced by any algorithm (using
unlimited resources but a finite amount of initial information). Thus,
they cannot all be logical consequences of any algorithmically produced
axiomatic theory either (since logical deduction from given axioms is
algorithmic). This first incompleteness result can be easily deduced
from truth
undefinability as we already saw it : as provability is definable, it
cannot coincide with truth.
- Even when abstractly considering to take all real truths as axioms,
it still would not suffice to determine the model, as it cannot exclude
nonstandard
models. (This last form of incompleteness does not have any
name because it equally affects the first-order description of any
infinite system, according to the Löwenheim-Skolem
theorem, and is thus unremarkable.)
This incompleteness affects the provability predicate itself, though
only on one side, as follows.
On the one hand, if the formula p(A) of provability
of a formula A is true, then it is provable: a proof of p(A)
can in principle be produced by the following method in 2
steps:
- Find a proof of A (as it is assumed to exist);
- Process it by some automatic converter able to formally
convert any proof of A into a proof that a proof of A
exists.
On the other hand, it is not always refutable when false : no matter
the time spent seeking in vain a proof of a given unprovable formula,
we might still never be able to formally refute the possibility to
finally find a proof by searching longer, because of the risk for a formula
to be only provable by unreasonably long proofs.
Lacking any possible fixed ultimate algorithm to produce all truths of
arithmetic, we can be interested in partial solutions: algorithms
producing endless lists of ground arithmetic formulas with both following
qualities:
- Infallible (stays included in the set of truths describing
the standard model of arithmetic);
- Large (compared by inclusion with other algorithmically
producible sets of truths).
A natural method to progress in the endless (nonalgorithmic) search
for better and better algorithms for the second quality without breaking
the first, consists in developing formalizations of set theory describing
larger and larger universes beyond the infinity of ℕ, where properties
of ℕ can be deduced as particular cases. Indeed, if a set theory T'
requires its universe to contain, as a set, a model U of a
set theory T, then the arithmetic formula of the consistency
of T will be provable in T' but not in T,
while all arithmetic theorems of T remain provable in T'
if T' describes U as standard.
Set theoretical truths
The above can be read as an indispensability argument for our last
concept of truth, which is the truth of set theoretical statements.
To progress beyond logical deduction from already accepted ones, more
set theoretical axioms need to be introduced, motivated by some
Platonist arguments for a real existence of some standard universes where they
are true; the validity of such arguments needs to be assessed in intuitive,
not purely formal ways, precisely in order to do better than any predefined
algorithm. Arguments for a given axiomatic set theory lead to
arithmetic conclusions of two kinds:
1. The statement of the formal consistency of this set theory;
2. The arithmetic theorems we can deduce in its framework.
These two conclusions should not be confused:
- By the second
incompleteness theorem, 1. cannot come as a particular case of 2.
(unless the theory is inconsistent), even though it must be true for 2. to be consistent and thus
of any interest. We can already see how the naive attempt to deduce 1. from 2.
fails: consistency is logically deducible from the existence of a model, but
truth undefinability prevents the theory from using its current model as an
example of an existing model in which to interpret its formulas and thus witness their consistency.
- The reason for the truth of 2. refers to the existence of a
standard model of this theory, while 1. only means that
nonstandard models exist.
But as the objects of these conclusions are mere properties of finite systems
(proofs), their meaning stays unaffected by any ontological assumptions
about infinities, including the finitist ontology (denying the reality of any
actual infinity, whatever such a philosophy might mean). It seems hard to
figure out, then, how their reliability could be meaningfully challenged by
philosophical disputes on the «reality» of abstractions beyond them
(universes), just because they were motivated by these abstractions.
But then, the statement of consistency (1.), together with the mere
existence of ℕ, suffices to let models of this theory really
exist (nonstandard ones, but working the same as standard ones).
For logical deduction from set theoretical axioms to be a good arithmetic
truth-searching algorithm, these axioms must be:
- Sound = keep accepting some standard universes (to keep the
output infallible);
- Strong = reject "small" standard universes, i.e. set a high
minimum "size" on the hierarchy of subuniverses
(to make the output large).
But for a collection of such axioms to keep these qualities when put
together in a common theory, they need to be compatible,
in the sense that their conjunction remains sound. Two such statements
might be incompatible, either if one of them limits the size of the
universe (which it shouldn't), or if each statement (using both kinds of open
quantifiers when written in prenex
form) endlessly alternates between truth and falsity as the
universe expands, in such a way that they would no longer be true together in any
standard universe beyond a certain size (their conjunction must not
limit the size of the universe either). The question is: on what sort of big standard
universes might good axioms more naturally be true together?
A standard universe U' might be axiomatically described
as very big by setting it a little bigger than another very big
one U; but the size of this U would need a
different description (as U cannot be proven to satisfy the same
axioms as U' without contradiction), of what kind then?
Describing U as also a little
bigger than a third universe, and so on, would require the axioms to keep
track of successive differences. This would rapidly run into inefficient
complications with incompatible alternatives, with no precise reason to
prefer one version over the others.
The natural solution, both for philosophical elegance and for the efficiency
and compatibility of axioms, is to focus on the opposite case: universes
described as big by how much bigger they are than any smaller one (like how
we conceived
an ultimate universe as the union of a standard multiverse).
Axioms must be
- Open = satisfied by any universe where eternity is a sufficiently long time,
especially towards the end (= open universes).
This is also convenient because such descriptions are indeed
expressible by axioms interpreted inside the universe, with no
need of any external object. Indeed, if a property were only
expressible using an external object (regarding this universe as a
set), we could replace it by describing instead our universe as
containing a subuniverse of this kind (without limiting its size
beyond it), and why not also endlessly many subuniverses of this
kind, forming a standard multiverse: stating that every object is
contained in such a subuniverse. This is axiomatically expressible
using objects outside each of these subuniverses but inside our
big one; and such axioms will fit all 3 qualities above.
Finally, the properly understood meaning of set theory is neither
axiomatic nor realistic, but some fuzzy intermediate between
both: its axioms aim to approach all 3 qualities (strong and open but still
sound), selecting universes with the corresponding 3 qualities
(big and open but still standard); but these qualities are all fuzzy,
and any specific list of axioms (resp. any specific universe) can only aim to approach them,
in a quest that can never end.
Fortunately, rather simple set theories such as ZF already satisfy these
qualities to a high degree, describing much larger realities than
usually needed. This is how a Platonic view of set theory (seeing the
universe of all mathematical objects as a fixed, exhaustive reality) can
work as a good approximation, though it cannot be an exact, absolute fact.
Alternative logical frameworks
The description we made of the foundations of mathematics
(first-order logic and set theory) is essentially just an
equivalent, clarified expression of the widely accepted ones (a
different introduction to the same mathematics). Other logical frameworks
already mentioned, to be developed later, still belong to the same family
of "classical mathematics" as first-order logic and set theory.
But other, more radically different frameworks
(concepts of logic and/or sets), called non-classical
logics, might be considered. Examples:
- Some logicians developed «intuitionistic logic», which
lets formulas keep a possible indefiniteness, as we mentioned for
open quantifiers, but treats it as a modification of pure
Boolean logic (the rejection of the excluded middle, where ¬(¬A)
does not imply A), without any special mention of
quantifiers as the source of this uncertainty. It might also be
seen as a formal confusion between truth and provability. In
this framework, {0} ∪ ]0,1] ⊂ [0,1], without equality. I could
not personally find any interest in this formalism, but only
heard that theoretical computer scientists found it useful.
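For instance, in a proof assistant based on intuitionistic logic such as Lean, A → ¬¬A is provable directly, while the converse requires an explicit appeal to the classical excluded middle (a sketch in Lean 4):

```lean
-- Intuitionistically valid: a proof of A refutes any refutation of A.
example (A : Prop) : A → ¬¬A :=
  fun a na => na a

-- The converse is not intuitionistically provable: this proof must
-- invoke the classical axiom of excluded middle (Classical.em).
example (A : Prop) : ¬¬A → A :=
  fun nna => (Classical.em A).elim id (fun na => absurd na nna)
```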
- When studying measure
theory (which mathematically defines probabilities in
infinite sets), I was inspired to interpret its results as
simpler statements about another concept of set, with the following
intuitive property. Let x be a variable randomly
taken in [0,1], by successively flipping a coin for each of its
(infinitely many) binary digits. Let E be the domain of x,
the set of all random numbers in [0,1]. It is nonempty
because such random numbers can be produced. Now another
similarly random number with the same range (y ∈ E),
but produced independently of x, has no
chance of being equal to x. Thus, ∀x ∈ E, ∀y
∈ E, x ≠ y. This way, x ∈ E
is no longer always equivalent to ∃y ∈ E, x
= y.
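The claim that two independently produced random reals differ can be simulated digit by digit: each pair of coin flips disagrees with probability 1/2, so a first differing binary digit appears almost surely, in practice almost immediately. A sketch in Python, using seeded generators to model the two independent coins (function name and cutoff are illustrative assumptions):

```python
import random

def first_differing_digit(coin_x, coin_y, max_digits=10_000):
    """Compare the binary digits of two reals in [0,1], each produced by
    its own independent sequence of coin flips. Return the position of
    the first digit where they differ; each position differs with
    probability 1/2, so the expected position is 2 and the loop
    terminates almost surely."""
    for position in range(1, max_digits + 1):
        if coin_x.getrandbits(1) != coin_y.getrandbits(1):
            return position
    return None  # probability 2**(-max_digits): astronomically unlikely

pos = first_differing_digit(random.Random(1), random.Random(2))
```

Of course the simulation only ever inspects finitely many digits; the point of the intuitive set concept above is precisely what happens in the idealized infinite limit.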
We will ignore such alternatives in all following sections.