Geometric expression of Markov chains
(Classical probabilistic processes in discrete spaces)
Let us start with an elementary expression of the most general
case of a classical probabilistic law of evolution.
Such a law is expressed in a way that conforms to the following list
of features and conditions; it will apply to any material
system at any other time and place, provided it is known to start
inside the same list of initial states, and the amount of time
Dt and the external conditions are the same:
1) We have a list A of n possible states a among which some
material system at a given time t, is assumed to be;
2) After the amount of time Dt (so, at the later time t'= t+Dt), we
have another list B of m possible states b among
which the system will be.
3) To every a in A and
every b in B,
a non-negative real number p(a,b) is assigned.
4) These numbers satisfy the condition: for every a in A,
the sum of all p(a,b) among all b in B, is equal to 1.
(Thus, p(a,b) ≤1 for
every a,b).
5) The number p(a,b) is the
probability that the system will be in the state b at time t' in the case it
happened to be in the state a
at time t.
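As a minimal sketch of the purely mathematical conditions 3) and 4), the numbers p(a,b) can be stored as a table with one row per initial state a and checked directly. The function name, the tolerance and the example table below are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def is_evolution_law(p, tol=1e-12):
    """Check conditions 3) and 4) on a table p, where p[a][b] is the
    probability of ending in state b when starting in state a."""
    for row in p:
        # Condition 3): every p(a,b) is a non-negative real number.
        if any(x < 0 for x in row):
            return False
        # Condition 4): for every a, the p(a,b) sum to 1 over all b.
        if abs(sum(row) - 1) > tol:
            return False
    return True

# Hypothetical law with n = 2 initial states and m = 3 final states.
P_EXAMPLE = [
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(0), Fraction(1, 3), Fraction(2, 3)],
]

print(is_evolution_law(P_EXAMPLE))
```

Condition 5) is not something such a check can express: it is the interpretation of the numbers, not a property of the table.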
Note that the last of these features is fundamentally different from
the first four: the first four are purely mathematical claims, while
the last one is metaphysical.
Indeed there is something irreducibly metaphysical in the very
concept of probability, expressing something from the conscious type of
existence that cannot be reduced to purely mathematical
concepts, even though it is a very incomplete account of the
conscious type of existence.
The metaphysical nature of the concept of probability can be seen by
developing the different ideas contained in it:
- We are in a reality where only the state of the system at
time t is unique and fixed, but the state of the system at time
t' is not fixed, and several future states are still
"coexisting" (mathematically) as genuine possibilities in the
expression of the probabilistic evolution law.
- There will only be one reality of what the state of the system
at time t' will be; but this future reality is not real
yet.
This succession of different versions (contents) of reality along
time comes from a feature of the conscious type of existence as
opposed to the mathematical one (see overview of
metamathematics and metaphysical preliminaries
for the details). From the conscious type of
existence, the concept of probability also inherits its
asymmetry with respect to the exchange between past and
future.
You may have noted that, while purely mathematical, the above rule
4) is not symmetrical either; however, in the case of quantum
physics, this condition (or what it will become in the context of
quantum theory) will be ensured by a time-symmetrical law.
Now, this concept of a probabilistic law of evolution is rather
awkward, by its way of mathematically ruling the features of a
"real" (beyond mere mathematical) type of existence for mathematical
objects.
It may be considered coherent from a logical viewpoint,
but does not fit all the non-mathematical aspects of the "real
existence" (conscious) it tries to describe.
There are in fact two problems here:
- No mathematical law can ensure by itself the production of a
conscious existence. Conscious perceptions can be absent from the
location of the system at time t', therefore reducing the final
state to a mere mathematical status, made of a
superposition of the different possible states at time t',
rather than a unique state; by such superpositions, the
physical system can keep evolving by itself until the final
result is eventually perceived later.
- It hardly means anything to say that the evolution of a given
system has obeyed a given probability law. All that can be
precisely checked is whether the observed final state indeed
had a non-zero probability, as resulting from the initial state
according to a given law. Any other claim requires checking a
large number of repetitions of an experiment, and hoping that
the unlikely event of large deviations of statistical
observations from "the true probability law" won't happen. (This
provides a starting concept for parapsychology: how mind
interacts with matter by making the evolution diverge from
mathematical probability laws.)
Now let us expand the first point further.
Probabilistic states
As a probabilistic superposition of states of a physical system can
be obtained as a still unobserved outcome of a probabilistic
evolution from a fixed initial state, we can consider further
evolutions of the system starting from such superpositions.
Finally, probabilistic combinations of states will be considered as
possible states of a physical system in their own right, alongside the
list of "pure states" that we initially assumed to be the possible
states of the system. Then, predictions of results from these
evolutions will be considered as depending on these more general
cases of initial states, in a way that extends and is deducible from
the probabilistic law that governed the evolution out of "pure
states".
Let us represent such probabilistic states in a geometrical form,
starting from the simplest cases.
The simplest interesting case is made of the combinations of 2
available "pure" states we shall denote here X and Y. Probabilistic
combinations of them are represented by the real numbers p and p'
which are the probabilities for the system to be respectively in the
state X or Y. These numbers are non-negative and satisfy p+p'=1.
In the plane of coordinates (p,p'), the set of all states
combining X and Y is represented by the points of the segment
whose ends are the points (1,0) and (0,1), which can be identified
as X and Y themselves (a system that has probability 1 of being in
the state X, is in the state X). Any other point S of coordinates
(p,p') inside this segment, is the barycenter of the two points X
and Y with respective weights p and p'. In other words, for any
positive numbers p and p'=1-p, if you put a ball of mass p at point
X (1,0) and a ball of mass p' at point Y (0,1), then the gravity
center of the whole will be at the point of coordinates (p,p').
The same constructions can be done with any higher number of pure
states: with 3 states, the set of points of coordinates (p,p',p")
such that p, p' and p" are all non-negative and p+p'+p"=1, is a triangle.
With 4 states we obtain a tetrahedron.
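The "balls of mass p and p'" picture can be written out as a small computation. The sketch below is only an illustration: the function name barycenter and the chosen weights are assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

def barycenter(points, weights):
    """Center of gravity of masses `weights` placed at `points`.
    The weights need not sum to 1; they are normalized by their total."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * pt[i] for pt, w in zip(points, weights)) / total
        for i in range(dim)
    )

# Two pure states X and Y in the plane of coordinates (p, p').
X, Y = (1, 0), (0, 1)
S = barycenter([X, Y], [F(1, 4), F(3, 4)])

# Three pure states: the vertices of the triangle p + p' + p" = 1.
VERTICES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
T = barycenter(VERTICES, [F(1, 2), F(1, 3), F(1, 6)])

print(S, T)
```

When the weights are probabilities summing to 1, the coordinates of the barycenter of the pure states are those probabilities themselves, which is why the point (p,p') can be read as the state.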
The general concept of a classical probabilistic state of a system,
has the following properties:
- A finite natural number n is associated to the system, that is
the "number of possible states" of the system (when it is known
to satisfy some specified conditions...).
- This number n completely determines the geometric shape of the
set S of all states of the system. This is an (n−1)-simplex, with
dimension n−1.
- Some states are "pure", others are composite; pure states are those which cannot
be obtained as a barycenter of any list of several other states with nonzero coefficients;
- The whole set S is made of all barycenters of pure states with
non-negative coefficients; in other words, S is the convex hull of
the pure states;
- The set of pure
states is made of n different points which are the vertices of this simplex;
- For every state there is a unique list of pure states, of
which it is a combination (barycenter) with strictly positive
coefficients; in other words, such that it is in the interior of
the simplex with these pure states as vertices.
Recommended readings: Barycentric coordinate system, stochastic
matrix
Now, let's come back to the evolution of a physical system, from a
list A of n possible states to a list B of m possible states. How
does it operate on the probabilistic states ? The numbers
p(a,b) form the matrix of a linear transformation from the
n-dimensional space with coordinates labeled by initial pure states,
to the m-dimensional space with coordinates labeled by final pure
states.
This linear transformation sends the simplex in the first
space (defined by the equations: sum of coordinates = 1; all
coordinates ≥ 0) into the simplex defined the same way in the
final space.
Considering the affine spaces of respective dimensions n−1 and m−1,
each defined by the equation (sum of coordinates = 1) containing
these simplexes, this transformation is an affine transformation
from the one to the other; affine transformations are the
transformations that preserve barycenters.
Let's take an example: n=m=3.
Each of these simplexes is a triangle; and the evolution defines an
affine transformation sending the first triangle into the second
triangle. The images of the vertices of the first triangle can be any 3
points inside the second triangle, and determine the whole
transformation.
The justification for the preservation of barycenters can be
understood easily, as barycenters are a fundamental structure of
these spaces: for any points X,Y,Z in one such probability space, no
matter whether they are pure or not, any time we may need to
consider a system that has respective probabilities p,p',p" to be
either in state X,Y or Z, this can be summed up by saying that the
state of this system is the barycenter of X,Y,Z with weights
p,p',p".
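The preservation of barycenters can be checked on a small case with n=m=3. This is a sketch under assumptions: the table P, the states X and Y and the weights are made-up values, and the names evolve and mix are illustrative only.

```python
from fractions import Fraction as F

def evolve(state, p):
    """Image of a probabilistic state under the law p:
    new[b] = sum over a of state[a] * p[a][b]."""
    return [
        sum(state[a] * p[a][b] for a in range(len(p)))
        for b in range(len(p[0]))
    ]

def mix(states, weights):
    """Barycenter of several states, with weights summing to 1."""
    return [
        sum(w * s[i] for s, w in zip(states, weights))
        for i in range(len(states[0]))
    ]

# Hypothetical law: row a lists p(a,b) for the three final states b.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(0), F(1, 2), F(1, 2)],
    [F(1, 4), F(1, 4), F(1, 2)],
]

X = [F(1), F(0), F(0)]            # a pure state
Y = [F(0), F(1, 2), F(1, 2)]      # an impure state
w = [F(1, 3), F(2, 3)]

image_of_mix = evolve(mix([X, Y], w), P)
mix_of_images = mix([evolve(X, P), evolve(Y, P)], w)
print(image_of_mix == mix_of_images)
```

The image of a pure state a is simply the row of P for that state, so the three rows are the images of the vertices of the first triangle.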
Measurements
After having explained how the unobserved probabilistic
evolution of systems can be geometrically expressed as a
deterministic evolution of abstract points representing
probabilistic states, let us now describe the expressions of
conscious observations of these states according to this geometric
representation.
Coming back to the definitions we started with, and considering a
system of n possible states, we can consider the case of a complete
measurement, with n possibilities of perceptions corresponding to
each of the n pure states of the system. So, if the system is in
one of the n pure states then the perception is determined
with certainty; while if the state of the system is not pure, the
perception is undetermined, with specific probabilities of giving
one result or another.
In practice, we usually don't have the chance to directly perceive
by consciousness the state of physical systems. Instead, we use
a measurement apparatus that interacts with the system, then our body
interacts with the measurement apparatus (or if we consider the
direct vision of a system, then the eyes play the role of the
measurement apparatus...). Anyway, let us assume that the
measurement apparatus is convenient enough so that once the
measurement on the system is made, the result will be ready for
reading.
Thus we can describe the measurement process as a physical evolution
from the observed system into a set of possible final states of the
measurement apparatus. We already explained how physical evolution
takes place. Then the convenience of the measurement apparatus
consists in the fact that we can distinguish its pure states
from each other with certainty (measure its state completely in the
above sense). We set aside for now what becomes of the measured
system after the measurement, and will come back to it later.
This measurement result does not tell much about the respective
probabilities of the final states. Instead, it only specifies one of
these states. What information does this give about the initial
system? It only gives a hint about the probability of having got
the result we got. This can be interpreted as an indirect
measurement of each of the previous intermediate states of the
system during its evolution before it was measured.
This is expressed in the form of a long matrix multiplication,
that gives the probability of a result as depending on the initial
state:
Proba = L M E' E S
where:
S = initial state (column)
E = matrix of a first evolution of the system
E' = matrix of a second evolution of the system (any number of
successive evolutions can be inserted)
M = matrix of the measurement process (last evolution, into a final
state of the measurement apparatus)
L = row matrix expressing the perceived result of the measurement,
with one 1 and zeros, for example (0 1 0 0).
The matrix multiplication is associative:
Proba = L(M E' E S) = (L M)(E' E S) = (L M E')(E S) = (L M E' E)S
So, the observation L of the state (M E' E S) of the measurement
apparatus, can be interpreted as the observation (L M) of the final
state (E' E S) of the system, or as the observation (L M E') of the
intermediate state (E S) of the system, or as the observation (L M
E' E) of the initial state S of the system.
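The two readings of the product (states pushed forward, perception pulled backward) can be compared numerically. In this sketch all matrices are made-up values; since S is a column, each evolution matrix is assumed to be stored with entry [b][a] = p(a,b), so that its columns sum to 1. E2 stands for E'.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B)))
         for j in range(len(B[0]))]
        for i in range(len(A))
    ]

S = [[F(1, 2)], [F(1, 2)]]                  # initial state (column)
E = [[F(1, 2), F(1, 4)], [F(1, 2), F(3, 4)]]    # first evolution
E2 = [[F(1), F(1, 3)], [F(0), F(2, 3)]]         # second evolution E'
M = [[F(9, 10), F(1, 5)], [F(1, 10), F(4, 5)]]  # measurement process
L = [[F(0), F(1)]]                          # perceived result (row)

# Forward in time: evolve the state, then read the result.
forward = matmul(L, matmul(M, matmul(E2, matmul(E, S))))

# Backward in time: pull the perception back to the initial state.
pulled_back = matmul(matmul(matmul(L, M), E2), E)
backward = matmul(pulled_back, S)

print(forward == backward, pulled_back)
```

The row pulled_back is the perception (L M E' E) of the initial state: its coefficients are the probabilities of obtaining the result starting from each pure initial state.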
While the evolution determines the successive probabilistic states of
any system from the past into the future, a final measurement
retrospectively provides measurements of previous states.
Each linear form defined by the row matrices L, L M, L M E' and L M
E' E takes values between 0 and 1 on the corresponding simplex. This
is equivalent to saying that all its coefficients are between 0 and
1 (these are the respective probabilities to obtain the result
starting from each pure state).
Finally, we have a sort of duality between probabilistic states and
perceptions (=results of measurements), where states go forward in
time while perception can be defined to apply retrospectively. While
they are very different in reality, they may be somehow considered
to be mathematically symmetric when exchanging the past and the
future, but... it depends. More precisely, this symmetry only
concerns perceptions not yet done (which are still in the future of
consciousness), when it still makes sense to wonder what will be the
probability of a result (because the probability of a result already
obtained is 1). When a measurement is done, the result becomes past
and modifies the state of the system.
Just like states, some possible perceptions are pure (with only one
nonzero component), giving full information on the system, while
others are impure (with several nonzero components).
This duality does not seem very symmetrical in the general case of
classical probabilistic evolutions we are studying here, and has
some problems.
For example, the previous states can be retroactively revised based
on the final measurement (however, we should be careful that this
is not any conscious retrocausality, as previous perceptions
remain unchanged in conscious memory).
But these revised states do not naturally evolve with certainty
towards the very measurement finally obtained.
However, we will see that in some aspects (but not all
aspects), quantum theory is more symmetrical about this.
The conscious (metaphysical) time should not be confused with the
physical time. The conscious concepts of "before" and "after" a
measurement mean before / after we know what the result of
the measurement is, and do not always fit with the physical time when
an apparatus interacts with a system to measure it. In the same way,
conscious perception should not be confused with physical perception,
defined as a measurement by physical interaction with a measuring
apparatus.
Non-disturbing measurements
We first introduced impure perceptions retroactively in a way that
destroys the system. But the same sort of impure perceptions
can be obtained by interaction with a measuring apparatus without
disturbing the system; more precisely, in a way that preserves the
pure states of the system (but the impure ones won't be preserved).
Consider a system with 3 possible states, whose probabilistic state is
given by its barycentric coordinates (p1,p2,p3). Let it interact
with a measuring apparatus that will have the respective
probabilities a,b,c to give a result "yes" while preserving the
state if the system was respectively in each of the 3 pure states
(and thus the respective probabilities 1-a, 1-b, 1-c to give
the result "no").
What will be the state of the system after the measurement (in a conscious sense) if the
result is "yes"?
The total probability to get "yes" is: p = a.p1 + b.p2 + c.p3.
Before we knew it, for each possible state, the probability to
get it together with "yes" was (for each of them) a.p1, b.p2 and
c.p3.
Once we know that we got "yes", the new probabilities are (a.p1/p),
(b.p2/p) and (c.p3/p).
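The update rule just stated can be sketched in a few lines. The function name update_on_yes and the numerical values of a, b, c are assumptions chosen for illustration.

```python
from fractions import Fraction as F

def update_on_yes(state, yes_probs):
    """Revise a probabilistic state after learning the result "yes".

    state     -- (p1, p2, p3), probabilities of the pure states
    yes_probs -- (a, b, c), probability of "yes" in each pure state
    Returns the new state and the total probability p of "yes".
    """
    # Probability of each pure state together with "yes": a.p1, b.p2, c.p3
    joint = [q * p for q, p in zip(yes_probs, state)]
    total = sum(joint)
    if total == 0:
        raise ValueError('"yes" has zero probability in this state')
    return [j / total for j in joint], total

ABC = [F(1, 2), F(1, 4), F(1, 4)]          # hypothetical a, b, c

center = [F(1, 3), F(1, 3), F(1, 3)]
new_state, p_yes = update_on_yes(center, ABC)

pure = [F(0), F(1), F(0)]
unchanged, p_yes_pure = update_on_yes(pure, ABC)

print(new_state, p_yes, unchanged)
```

A pure state comes out unchanged, as expected of a non-disturbing measurement, while an impure state such as the center is moved.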
What is the effect on the triangle of probability states? It maps
it into itself, preserving it globally, and preserving each vertex,
but the interior points are not fixed, as they are moved by a projective
transformation.
Projective transformations are familiar to our intuition as they
usually occur when a figure in a plane is viewed from space and
represented in perspective in another plane, so as to appear the
same when the latter plane is viewed from the right point. To
specify a projective transformation, all we need is to
choose the horizon line (the line that will go to infinity) in
the original plane, and once it is moved to infinity, the
remaining possible movements are affine transformations.
This horizon line is the line defined by the cancellation of
the denominator in the expression of the transformation. This
denominator is p (the probability to get "yes").
So, the above formula of what happens to the state during the
non-disturbing measurement, by the fact of finding that it gives
"yes", can be described geometrically by saying that it is the only
projective transformation which sends the zero probability line
(a.p1 + b.p2 + c.p3 = 0) to infinity, and which preserves each of
the 3 pure states.
Another characterization, is that this is the only projective
transformation which preserves each of the 3 pure states (vertices
of the triangle), and which moves the center of the triangle, to the
point that is the barycenter of vertices with weights a,b,c.
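These geometric claims can be tested on an example: a projective transformation keeps aligned points aligned, but unlike an affine one it need not send a midpoint to a midpoint. The sketch below assumes made-up values of a, b, c, and the helper names yes_map and collinear are illustrative.

```python
from fractions import Fraction as F

def yes_map(state, abc):
    """State after the result "yes": (a.p1, b.p2, c.p3) divided by p."""
    joint = [q * p for q, p in zip(abc, state)]
    total = sum(joint)
    return [j / total for j in joint]

def collinear(s1, s2, s3):
    """Three points of the plane p1+p2+p3 = 1 are aligned exactly when
    the 3x3 determinant of their coordinates vanishes."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = s1, s2, s3
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    return det == 0

ABC = [F(1), F(1, 2), F(1, 4)]              # hypothetical a, b, c

U = [F(1, 2), F(1, 2), F(0)]
V = [F(0), F(1, 2), F(1, 2)]
W = [(u + v) / 2 for u, v in zip(U, V)]     # midpoint of U and V

U2, V2, W2 = (yes_map(s, ABC) for s in (U, V, W))
midpoint_of_images = [(u + v) / 2 for u, v in zip(U2, V2)]

center_image = yes_map([F(1, 3)] * 3, ABC)

print(collinear(U2, V2, W2), W2 == midpoint_of_images, center_image)
```

The image of the center has coordinates proportional to (a,b,c), which is the second characterization given above.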
But if we both measure and disturb
the system, then this can produce any projective transformation
from the triangle of initial states into the triangle
of final states.
Before continuing, let us say more about the duality between states
and measurements.
In a state, the sum of coordinates is 1, while for a measurement,
every coordinate is ≤ 1 and the sum can be anything but in fact
it does not matter: we can multiply it all by an arbitrary positive
real number, so as to make the sum = 1 if we wish : it only
changes the whole probability of having got the measurement result
we got, but in the case we already got it, the information obtained
on the system is the same, and only depends on the zero probability
line (which remains fixed when the probability is multiplied by a
constant).
So, if by mere convention we fix the sum of components of the
perception = 1, then this perception can be also represented as a
point of a new triangle.
This new triangle T* of perceptions, represents the set of all
straight lines outside the triangle T of states. The vertices of T*,
which are the pure perceptions,
represent each of the 3 edges of T (in the role of zero-probability
lines), while each edge of T* represents the set of all lines
meeting T at precisely one given vertex.
As the set of all possible perceptions (straight lines around a
simplex) also forms a simplex, just like for states we shall call
While the evolution defines successive affine transformations for
triangles T from the past into the future, the retrospective
information given by a final measurement, successively defines (for
the dual triangles T*), projective transformations preserving the
center (from the future into the past). Indeed, the center is the
perception that does not give any information on the system, and thus
remains equally uninformative at all times.
Correlation
Consider 2 physical systems forming together a big system.
Each pure state of the big system consists of the case where
each of the subsystems is in a specified pure state. Thus, the
number of possible states of the whole system is the product of
those of the subsystems.
All other states are combinations (barycenters) of them.
For any state of the big system, we can consider the probabilistic
state of one subsystem while ignoring the other. But a
measurement of the one (and knowing the result) affects the state of
the other.
Every combination of states of the system is represented by a
matrix, which we will call the correlation
matrix, where the rows and columns correspond to the
pure states of each subsystem (and all coefficients are non-negative).
As we explained previously, matrices with non-negative coefficients
define projective transformations from a simplex into another. The
matrices of evolution we previously considered satisfied more
conditions, which forced properties on these projective
transformations (being affine in the one direction, preserving the center
in the other direction). But the correlation matrix has no such
conditions, so that the projective transformation defined by it
does not have any such further requirement to satisfy. But what does
this transformation operate on?
In fact, this is the projective transformation mapping the
perceptions simplex A* of one subsystem (dual to its states
simplex A), into the simplex B of states of the other: having got a
perception on the one gives information about the other and
thus modifies its (probabilistic) state (which represents what we
know of it); and its transpose maps the simplex B* of perceptions
of the latter, into the simplex A of states of the former.
This transformation, maps the center of B* (the uninformative
perception on the second system), into the element of A which
expresses what the first system looks like while ignoring the
second; and similarly when the roles of both systems are exchanged.
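As an illustration of how a correlation matrix turns a perception on one subsystem into a state of the other, here is a small sketch. The matrix C, the function name and the sample perceptions are assumptions; C[i][j] is taken to be the probability that the first system is in its pure state i and the second in its pure state j.

```python
from fractions import Fraction as F

def state_of_B_given_perception_on_A(C, perception):
    """Map a perception on A (a point of A*) to a state of B.

    perception[i] is the probability of the observed result when A is
    in its pure state i; only its proportions matter.
    """
    weights = [
        sum(perception[i] * C[i][j] for i in range(len(C)))
        for j in range(len(C[0]))
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("this perception has zero probability")
    return [w / total for w in weights]

# Hypothetical correlation matrix of two 2-state systems (entries sum to 1).
C = [
    [F(1, 2), F(0)],
    [F(1, 4), F(1, 4)],
]

# Uninformative perception on A (center of A*): B seen while ignoring A.
b_ignoring_a = state_of_B_given_perception_on_A(C, [F(1, 2), F(1, 2)])

# Pure perceptions on A: A was found in its first or its second state.
b_if_a_first = state_of_B_given_perception_on_A(C, [F(1), F(0)])
b_if_a_second = state_of_B_given_perception_on_A(C, [F(0), F(1)])

print(b_ignoring_a, b_if_a_first, b_if_a_second)
```

Applying the same function to the transpose of C gives the map from perceptions on the second system to states of the first.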
Thus in correlations between two or more systems, a measurement
result of one affects the statistical state of all others. This
change may be seen as both affecting the future of the
measured system, and of the other ones, as
related by going "backwards" in time, from the measurement
through the past "common cause" of the states of all systems (by
retrospective revisions of the initial state), and from that point
then forward in time. But this is merely an interpretation: a
possible mathematical representation of things where the question of
"what is real" need not make much sense.