Geometric expression of Markov chains

(Classical probabilistic processes in discrete spaces)

Let us start with an elementary expression of the most general case of a classical probabilistic law of evolution.
Such a law is expressed in a way that conforms to the following list of features and conditions; it will apply to any material system at any other time and place, provided it is known to start inside the same list of initial states, and the amount of time Dt and the external conditions are the same:

1) We have a list A of n possible states a among which some material system at a given time t, is assumed to be;

2) After the amount of time Dt (so, at the later time t'= t+Dt), we have another list B of m possible states b among which the system will be.

3) To every a in A and every b in B, a non-negative real number p(a,b) is assigned.

4) These numbers satisfy the condition: for every a in A, the sum of all p(a,b) among all b in B, is equal to 1. (Thus, p(a,b) ≤1 for every a,b).

5) The number p(a,b) is the probability that the system will be in the state b at time t' in the case it happened to be in the state a at time t.
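The purely mathematical conditions 3) and 4) can be checked mechanically. The following is only a minimal sketch: the state names, the dictionary layout and the function name is_evolution_law are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction as F

def is_evolution_law(p, states_a, states_b):
    """Check conditions 3) and 4): every p(a,b) is non-negative
    and, for every a, the p(a,b) sum to 1 over all b."""
    for a in states_a:
        if any(p[(a, b)] < 0 for b in states_b):
            return False
        if sum(p[(a, b)] for b in states_b) != 1:
            return False
    return True

# Hypothetical example with n = 2 initial states and m = 3 final states.
A = ["a1", "a2"]
B = ["b1", "b2", "b3"]
p = {
    ("a1", "b1"): F(1, 2), ("a1", "b2"): F(1, 2), ("a1", "b3"): F(0),
    ("a2", "b1"): F(1, 4), ("a2", "b2"): F(1, 4), ("a2", "b3"): F(1, 2),
}

print(is_evolution_law(p, A, B))  # True
```

Exact fractions are used here instead of floating-point numbers so that the test "sum equal to 1" is not disturbed by rounding.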

Note that the last of these features is fundamentally different from the first four: the first four are purely mathematical claims, while the last one is metaphysical. Indeed there is something irreducibly metaphysical in the very concept of probability, expressing something from the conscious type of existence that cannot be reduced to purely mathematical concepts, even though it is a very incomplete account of the conscious type of existence.

The metaphysical nature of the concept of probability can be seen by developing the different ideas contained in it:

We are in a reality where only the state of the system at time t is unique and fixed, but the state of the system at time t' is not fixed, and several future states are still "coexisting" (mathematically) as genuine possibilities in the expression of the probabilistic evolution law.
There will only be one reality of what the state of the system at time t' will be; but this future reality is not real yet.

This succession of different versions (contents) of reality along time, comes from a feature of the conscious type of existence as opposed to the mathematical one (see overview of metamathematics and metaphysical preliminaries for the details). From the conscious type of existence, the concept of probability also inherits its dissymmetry with respect to the exchange between past and future.
You may have noted that, while purely mathematical, the above rule 4) is not symmetrical either; however, in the case of quantum physics, this condition (or what it will become in the context of quantum theory) will be ensured by a time-symmetrical law.

Now, this concept of a probabilistic law of evolution is rather awkward, by its way of mathematically ruling the features of a "real" (beyond merely mathematical) type of existence for mathematical objects.
It may be considered coherent from a logical viewpoint, but does not fit all the non-mathematical aspects of the "real existence" (conscious) it tries to describe.
There are in fact two problems here:

No mathematical law can ensure by itself the production of a conscious existence. Conscious perceptions can be absent from the location of the system at time t', therefore reducing the final state to a mere mathematical status, made of a superposition of the different possible states at time t', rather than a unique state; by such superpositions, the physical system can keep evolving by itself until the final result is eventually perceived later.
It hardly means anything to say that the evolution of a given system has obeyed a given probability law. All that can precisely be checked is whether the observed final state indeed had a non-zero probability, as resulting from the initial state according to a given law. Any other claim requires checking a large number of repetitions of an experiment, and hoping that the unlikely event of large deviations of statistical observations from "the true probability law" won't happen. (This provides a starting concept for parapsychology, how mind interacts with matter by making the evolution diverge from mathematical probability laws.)

Now let us expand the first point further.

Probabilistic states

As a probabilistic superposition of states of a physical system can be obtained as a still unobserved outcome of a probabilistic evolution from a fixed initial state, we can consider further evolutions of the system starting from such superpositions.

Finally, probabilistic combinations of states will be considered as possible states of a physical system in their own right, alongside the list of "pure states" that we initially assumed to be the possible states of the system. Then, predictions of results from these evolutions will be considered as depending on these more general cases of initial states, in a way that extends and is deducible from the probabilistic law that governed the evolution out of "pure states".

Let us represent such probabilistic states in a geometrical form, starting from the simplest cases.

The simplest interesting case is made of the combinations of 2 available "pure" states we shall denote here X and Y. Probabilistic combinations of them are represented by the real numbers p and p' which are the probabilities for the system to be respectively in the state X or Y. These numbers are positive and satisfy p+p'=1.

In the plane of coordinates (p,p'), the set of all states combining X and Y is represented by the points of the segment whose ends are the points (1,0) and (0,1), which can be identified as X and Y themselves (a system that has probability 1 of being in the state X, is in the state X). Any other point S of coordinates (p,p') inside this segment, is the barycenter of the two points X and Y with respective weights p and p'. In other words, for any positive numbers p and p'=1-p, if you put a ball of mass p at point X (1,0) and a ball of mass p' at point Y (0,1), then the gravity center of the whole will be at the point of coordinates (p,p').

The same constructions can be done with any higher number of pure states: with 3 states, the set of points of coordinates (p,p',p") such that all p,p' and p" are positive and p+p'+p"=1, is a triangle. With 4 states we obtain a tetrahedron.
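The identification of a probabilistic state with a barycenter of the pure states can be illustrated numerically. This is a small sketch under the assumption that pure states are encoded as the unit coordinate vectors; the helper name barycenter is hypothetical.

```python
from fractions import Fraction as F

def barycenter(points, weights):
    """Weighted average of points (gravity center of masses 'weights')."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * pt[i] for pt, w in zip(points, weights)) / total
        for i in range(dim)
    )

# Two pure states X and Y in the plane of coordinates (p, p').
X, Y = (F(1), F(0)), (F(0), F(1))
p = F(1, 4)
S = barycenter([X, Y], [p, 1 - p])
print(S == (p, 1 - p))  # True

# Three pure states: the vertices of a triangle.
V = [(F(1), F(0), F(0)), (F(0), F(1), F(0)), (F(0), F(0), F(1))]
T = barycenter(V, [F(1, 2), F(1, 3), F(1, 6)])
print(T == (F(1, 2), F(1, 3), F(1, 6)))  # True
```

In both cases the coordinates of the barycenter are exactly the probabilities used as weights, which is why the weights themselves can serve as coordinates of the state.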

The general concept of a classical probabilistic state of a system, has the following properties:

A finite natural number n is associated to the system, that is the "number of possible states" of the system (when it is known to satisfy some specified conditions...).
This number n completely determines the geometric shape of the set S of all states of the system. This is an (n−1)-simplex, with dimension n−1.
Some states are "pure", others are composite; pure states are those which cannot be obtained as a barycenter of any list of several other states with nonzero coefficients.
The whole set S is made of all barycenters of pure states with positive coefficients; in other words, S is a convex set.
The set of pure states is made of n different points which are the vertices of this simplex.
For every state there is a unique list of pure states, of which it is a combination (barycenter) with strictly positive coefficients; in other words, such that it is in the interior of the simplex with these pure states as vertices.

Recommended reading: Barycentric coordinate system - stochastic matrix

Now, let's come back to the evolution of a physical system, from a list A of n possible states to a list B of m possible states. How does it operate on the probabilistic states ? The numbers p(a,b) form the matrix of a linear transformation from the n-dimensional space with coordinates labeled by initial pure states, to the m-dimensional space with coordinates labeled by final pure states.
This linear transformation sends the simplex in the first space (defined by the equations: sum of coordinates = 1; all coordinates ≥ 0) into the simplex defined the same way in the final space.

Considering the affine spaces of respective dimensions n−1 and m−1, each defined by the equation (sum of coordinates = 1) and containing these simplexes, this transformation is an affine transformation from the one to the other; affine transformations are the transformations that preserve barycenters.

Let's take an example : n=m=3.
Each of these simplexes is a triangle, and the evolution defines an affine transformation sending the first triangle into the second triangle. The images of the vertices of the first triangle are any 3 points inside the second triangle, and determine the whole transformation.

The justification for the preservation of barycenters can be understood easily, as barycenters are a fundamental structure of these spaces: for any points X,Y,Z in one such probability space, no matter whether they are pure or not, any time we may need to consider a system that has respective probabilities p,p',p" to be either in state X,Y or Z, this can be summed up by saying that the state of this system is the barycenter of X,Y,Z with weights p,p',p".
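The preservation of barycenters can be checked on a small case with n=m=3. The sketch below is only an illustration: the numerical values and the names evolve and mix are assumptions chosen for the example, with p[a][b] standing for the number p(a,b).

```python
from fractions import Fraction as F

def evolve(p, state):
    """Image of a probabilistic state under the law p,
    where p[a][b] is the probability of going from a to b."""
    m = len(p[0])
    return [sum(state[a] * p[a][b] for a in range(len(p))) for b in range(m)]

def mix(s1, s2, w):
    """Barycenter of the states s1 and s2 with weights w and 1 - w."""
    return [w * x + (1 - w) * y for x, y in zip(s1, s2)]

# Hypothetical law: each row is the image of one vertex of the first triangle.
p = [
    [F(1, 2), F(1, 2), F(0)],
    [F(0), F(1, 2), F(1, 2)],
    [F(1, 3), F(1, 3), F(1, 3)],
]
s1 = [F(1), F(0), F(0)]          # a pure state
s2 = [F(0), F(1, 2), F(1, 2)]    # a composite state
w = F(1, 4)

lhs = evolve(p, mix(s1, s2, w))               # evolve the barycenter
rhs = mix(evolve(p, s1), evolve(p, s2), w)    # barycenter of the images
print(lhs == rhs)  # True
print(sum(lhs))    # 1
```

The image of a pure state is simply the corresponding row of p, which is the sense in which the images of the 3 vertices determine the whole transformation.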

Measurements

After having explained how the unobserved probabilistic evolution of systems can be geometrically expressed as a deterministic evolution of abstract points representing probabilistic states, let us now describe the expressions of conscious observations of these states according to this geometric representation.

Coming back to the definitions we started with, and considering a system of n possible states, we can consider the case of a complete measurement, with n possibilities of perceptions corresponding to each of the n pure states of the system. So, if the system was in one of the n pure states then the perception would be determined with certainty; while it is undetermined with specific probabilities to give one or another result if the state of the system is not pure.

In practice, we usually don't have the chance to directly perceive by consciousness the state of physical systems. Instead, we use measurement apparatus that interact with the systems, then our body interacts with the measurement apparatus (or if we consider the direct vision of a system, then the eyes play the role of the measurement apparatus...). Anyway, let us assume that the measurement apparatus are convenient enough so that once the measurement on the system is made, the result will be ready for reading.

Thus we can describe the measurement process as a physical evolution from the observed system into a set of possible final states of the measurement apparatus. We already explained how physical evolution takes place. Then the convenience of the measurement apparatus consists in the fact that we can distinguish its pure states from each other with certainty (measure its state completely in the above sense). We forget for now what becomes of the measured system after measurement, and will come back to it later.

This measurement result does not tell much about the respective probabilities of the final states. Instead, it only specifies one of these states. What information does this give about the initial system ? It only gives a hint about the probability to have got the result we got. This can be interpreted as an indirect measurement of each of the previous intermediate states of the system during its evolution before it was measured.

This is expressed in the form of a long matrix multiplication, that gives the probability of a result as depending on the initial state:

Proba = L M E' E S
where:
S = initial state (column)
E = matrix of a first evolution of the system
E' = matrix of a second evolution of the system (any number of successive evolutions can be inserted)
M = matrix of the measurement process (last evolution, into a final state of the measurement apparatus)
L = row matrix expressing the perceived result of the measurement, with one 1 and zeros, for example (0 1 0 0).

The matrix multiplication is associative:

Proba = L(M E' E S) = (L M)(E' E S) = (L M E') (E S) = (L M E' E)S

So, the observation L of the state (M E' E S) of the measurement apparatus, can be interpreted as the observation (L M) of the final state (E' E S) of the system, or as the observation (L M E') of the intermediate state (E S) of the system, or as the observation (L M E' E) of the initial state S of the system.
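The two readings of the product, forward on states and backward on perceptions, can be compared on a toy example. This is a sketch with made-up 2×2 matrices; since S is a column, it assumes that each matrix has its columns labeled by initial pure states and its rows by final ones, so that each column sums to 1. The name E2 stands for E'.

```python
from fractions import Fraction as F

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]

S = [[F(1, 3)], [F(2, 3)]]                      # initial state (column)
E = [[F(1, 2), F(1, 4)], [F(1, 2), F(3, 4)]]    # first evolution
E2 = [[F(1), F(1, 2)], [F(0), F(1, 2)]]         # second evolution (E')
M = [[F(3, 4), F(1, 4)], [F(1, 4), F(3, 4)]]    # measurement process
L = [[F(0), F(1)]]                              # perceived result (row)

# Forward reading: evolve the state, then observe the apparatus.
final_state = matmul(M, matmul(E2, matmul(E, S)))
proba_forward = matmul(L, final_state)[0][0]

# Backward reading: pull the perception back to the initial state.
retro = matmul(matmul(matmul(L, M), E2), E)
proba_backward = matmul(retro, S)[0][0]

print(proba_forward == proba_backward)  # True
print(all(0 <= c <= 1 for c in retro[0]))  # True
```

The row retro plays the role of (L M E' E): its coefficients are the probabilities of obtaining the perceived result starting from each initial pure state.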

While the evolution determines the successive probabilistic state of any system from the past into the future, a final measurement retrospectively provides measurements of previous states.
Each linear form defined by the row matrices L, L M, L M E' and L M E' E takes values between 0 and 1 on the corresponding simplex. This is equivalent to saying that all its coefficients are between 0 and 1 (these are the respective probabilities to obtain the result starting from each pure state).

Finally, we have a sort of duality between probabilistic states and perceptions (=results of measurements), where states go forward in time while perception can be defined to apply retrospectively. While they are very different in reality, they may be somehow considered to be mathematically symmetric when exchanging the past and the future, but... it depends. More precisely, this symmetry only concerns perceptions not yet done (which are still in the future of consciousness), when it still makes sense to wonder what will be the probability of a result (because the probability of a result already obtained is 1). When a measurement is done, the result becomes past and modifies the state of the system.

Just like states, some possible perceptions are pure (with only one nonzero component), giving full information on the system, while others are impure (with several nonzero components).

This duality does not seem very symmetrical in the general case of classical probabilistic evolutions we are studying here, and has some problems.
For example, the previous states can be retroactively revised based on the final measurement (however, we should be careful that these are not conscious retrocausalities, as previous perceptions remain unchanged in conscious memory).
But these revised states do not naturally evolve with certainty towards this very measurement finally obtained.
However, we will see that in some aspects (but not all aspects), quantum theory is more symmetrical about this.

The conscious (metaphysical) time should not be confused with the physical time. The conscious concepts of "before" and "after" a measurement, mean before / after we know what is the result of the measurement, and do not always fit with the physical time when an apparatus interacts with a system to measure it. In the same way, conscious perception should not be confused with physical perception (defined as a measurement by physical interaction with a measuring apparatus).

Non-disturbing measurements

We first introduced impure perceptions retroactively in a way that destroys the system. But the same sort of impure perceptions can be operated by interaction with a measuring apparatus without disturbing the system. More precisely, in a way that preserves the pure states of the system (but the impure ones won't be preserved).

Consider a system with 3 possible states, whose probabilistic state is given by its barycentric coordinates (p1,p2,p3). Let it interact with a measuring apparatus that will have the respective probabilities a,b,c to give a result "yes" while preserving the state if the system was respectively in each of the 3 pure states (and thus the respective probabilities 1-a, 1-b, 1-c to give the result "no").

What will be the state of the system after the measurement (in a conscious sense) if the result is "yes" ?
The total probability to get "yes" is: p = a.p1 + b.p2 + c.p3.
Before we knew it, for each possible state, the probability to get it together with "yes" was (for each of them) a.p1, b.p2 and c.p3.
Once we know that we got "yes", the new probabilities are (a.p1/p), (b.p2/p) and (c.p3/p).
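These three steps can be written as a short computation. The following is a minimal sketch with illustrative numbers; the function name update_on_yes is an assumption made for this example.

```python
from fractions import Fraction as F

def update_on_yes(state, yes_probs):
    """Return the total probability p of "yes" and the state after "yes".

    state     : (p1, p2, p3), barycentric coordinates of the state
    yes_probs : (a, b, c), probability of "yes" from each pure state
    """
    joint = [w * q for w, q in zip(yes_probs, state)]   # a.p1, b.p2, c.p3
    p = sum(joint)
    if p == 0:
        raise ValueError('the result "yes" has probability zero for this state')
    return p, [j / p for j in joint]

state = [F(1, 2), F(1, 4), F(1, 4)]
yes_probs = [F(1), F(1, 2), F(0)]

p, new_state = update_on_yes(state, yes_probs)
print(p)                    # 5/8
print(sum(new_state) == 1)  # True
```

With these values the new state is (4/5, 1/5, 0): the third pure state, which can never give "yes", is excluded once "yes" has been obtained.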

What is the effect on the triangle of probability states ? It maps it into itself, preserving it globally, and preserving each vertex, but the interior points are not fixed, as they are moved by a projective transformation.

Projective transformations are familiar to our intuition as they usually occur when a figure in a plane is viewed from space and represented in perspective in another plane, so as to appear the same when the latter plane is viewed from the right point. To specify a projective transformation, all we need is to choose the horizon line (the line that will go to infinity) in the original plane, and once it is moved to infinity, the remaining possible movements are affine transformations.
This horizon line is the line defined by the cancellation of the denominator in the expression of the transformation. This denominator is p (the probability to get "yes").

So, the above formula of what happens to the state during the non-disturbing measurement, by the fact of finding that it gives "yes", can be described geometrically by saying that it is the only projective transformation which sends the zero probability line (a.p1 + b.p2 + c.p3 = 0) to infinity, and which preserves each of the 3 pure states.

Another characterization, is that this is the only projective transformation which preserves each of the 3 pure states (vertices of the triangle), and which moves the center of the triangle, to the point that is the barycenter of vertices with weights a,b,c.
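Both properties named in this characterization can be verified on an example. This is only a sketch with arbitrarily chosen values of a,b,c; the function name after_yes is hypothetical.

```python
from fractions import Fraction as F

def after_yes(state, yes_probs):
    """State obtained once the non-disturbing measurement has given "yes"."""
    joint = [w * q for w, q in zip(yes_probs, state)]
    total = sum(joint)
    return [j / total for j in joint]

a, b, c = F(3, 4), F(1, 2), F(1, 4)   # illustrative values, all nonzero
vertices = [
    [F(1), F(0), F(0)],
    [F(0), F(1), F(0)],
    [F(0), F(0), F(1)],
]
center = [F(1, 3), F(1, 3), F(1, 3)]

# Each pure state is preserved.
print(all(after_yes(v, (a, b, c)) == v for v in vertices))  # True

# The center goes to the barycenter of the vertices with weights a, b, c.
total = a + b + c
print(after_yes(center, (a, b, c)) == [a / total, b / total, c / total])  # True
```

Here a,b,c are taken nonzero so that every vertex has a nonzero probability of giving "yes"; a vertex lying on the zero probability line would have no image.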

But if we both measure and disturb the system, then this can produce any projective transformation from the triangle of initial states into the triangle of final states.

Before continuing, let us tell more about the duality between states and measurements.

In a state, the sum of coordinates is 1, while for a measurement, every coordinate is ≤ 1 and the sum can be anything but in fact it does not matter: we can multiply it all by an arbitrary positive real number, so as to make the sum = 1 if we wish : it only changes the whole probability of having got the measurement result we got, but in the case we already got it, the information obtained on the system is the same, and only depends on the zero probability line (which remains fixed when the probability is multiplied by a constant).
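The claim that rescaling a perception changes only the probability of the result, and not the information obtained, can be checked directly. A small sketch with illustrative values follows; the names posterior and normalize are assumptions for this example.

```python
from fractions import Fraction as F

def proba(state, perception):
    """Probability of the result described by the perception."""
    return sum(w * q for w, q in zip(perception, state))

def posterior(state, perception):
    """State of the system once the result has been obtained."""
    total = proba(state, perception)
    return [w * q / total for w, q in zip(perception, state)]

def normalize(perception):
    """Rescale a perception so that its components sum to 1."""
    total = sum(perception)
    return [w / total for w in perception]

state = [F(1, 2), F(1, 4), F(1, 4)]
perception = [F(3, 4), F(1, 2), F(1, 4)]
scaled = normalize(perception)

print(proba(state, perception) == proba(state, scaled))          # False
print(posterior(state, perception) == posterior(state, scaled))  # True
```

The probability of the result is divided by the same constant as the perception, while the resulting state is unchanged.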

So, if by mere convention we fix the sum of components of the perception = 1, then this perception can be also represented as a point of a new triangle.

This new triangle T* of perceptions, represents the set of all straight lines outside the triangle T of states. The vertices of T*, which are the pure perceptions, represent each of the 3 edges of T (in the role of zero-probability lines), while each edge of T* represents the set of all lines meeting T at precisely one given vertex.

As the set of all possible perceptions (straight lines around a simplex) also forms a simplex, just like for states we shall call

While the evolution defines successive affine transformations for triangles T from the past into the future, the retrospective information given by a final measurement successively defines (for the dual triangles T*) projective transformations preserving the center (from the future into the past), because the center is the perception that does not give any information on the system, and thus is equally uninformative at all times.

Correlation

Consider 2 physical systems forming together a big system.
Each pure state of the big system consists in the case when each of the subsystems is in a specified pure state. Thus, the number of possible states of the whole system is the product of those of the subsystems.
All other states are combinations (barycenters) of them.

For any state of the big system, we can consider the probabilistic state of one subsystem while ignoring the other. But a measurement of the one (and knowing the result) affects the state of the other.

Every combination of states of the system is represented by a matrix, which we will call the correlation matrix, where the rows and columns correspond to the pure states of each subsystem (and all coefficients are positive).

As we explained previously, matrices with positive coefficients define projective transformations from a simplex into another. The matrices of evolution we previously considered satisfied more conditions, which forced properties on these projective transformations (being affine in the one direction, preserving the center in the other). But the correlation matrix has no such conditions, so that the projective transformation defined by it does not have any such further requirement to satisfy. But what does this transformation operate on ?

In fact, this is the projective transformation mapping the perceptions simplex A* of one subsystem (dual to its states simplex A), into the simplex B of states of the other: having got a perception on the one, gives an information about the other and thus modifies its (probabilistic) state (which represents what we know of it); and its transpose, maps the simplex B* of perceptions of the latter, into the simplex A of states of the former.

This transformation, maps the center of B* (the uninformative perception on the second system), into the element of A which expresses what the first system looks like while ignoring the second; and similarly when the roles of both systems are exchanged.
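The action of the correlation matrix can be illustrated with two subsystems of 2 states each. This is a sketch under the assumption that the coefficient c[i][j] is the probability that the first subsystem is in its pure state i and the second in its pure state j; the numbers and function names are made up for the example.

```python
from fractions import Fraction as F

def state_of_other(c, perception):
    """Map a perception on the subsystem labeling the rows of c
    to the resulting state of the subsystem labeling the columns."""
    weights = [
        sum(perception[i] * c[i][j] for i in range(len(c)))
        for j in range(len(c[0]))
    ]
    total = sum(weights)
    return [w / total for w in weights]

def transpose(c):
    """Exchange the roles of the two subsystems."""
    return [list(col) for col in zip(*c)]

# Hypothetical correlation matrix (rows: first system, columns: second).
c = [
    [F(1, 4), F(1, 4)],
    [F(0), F(1, 2)],
]
uniform = [F(1, 2), F(1, 2)]   # center: the uninformative perception

# Pure perceptions on the first system change the state of the second.
print(state_of_other(c, [F(1), F(0)]) == [F(1, 2), F(1, 2)])  # True
print(state_of_other(c, [F(0), F(1)]) == [F(0), F(1)])        # True

# The center is mapped to the state of one system ignoring the other.
print(state_of_other(c, uniform) == [F(1, 4), F(3, 4)])             # True
print(state_of_other(transpose(c), uniform) == [F(1, 2), F(1, 2)])  # True
```

The last two lines recover the sums of the columns and of the rows of the matrix, that is, the state of each subsystem when the other one is ignored.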

Thus in correlations between two or more systems, a measurement result of one affects the statistical state of all others. This change may be seen as both affecting the future of the measured system, and of the other ones, as related by going "backwards" in time, from the measurement through the past "common cause" of the states of all systems (by retrospective revisions of the initial state), and from that point then forward in time. But this is merely an interpretation: a possible mathematical representation of things where the question of "what is real" need not make much sense.
