Introduction to quantum theory

(Concepts of quantum states and measurements)

"Cling to the brush, I remove the ladder !"

Introduction and preliminaries

The laws of quantum physics are fundamentally probabilistic. Thus, to understand them, we need to express probabilistic laws of evolution. The geometric expression of Markov processes (the general case of classical probabilistic law of evolution for material systems) is a prerequisite here, as it provides the language of this presentation.
We are going to express the concepts of quantum states and measurements in a simplified but still mathematically accurate manner (compared to usual courses of quantum physics) as a variant of that mathematical description of Markov processes (in other words, in terms of how quantum probability differs from classical probability). This will describe and show the coherence of some famous "paradoxes" of quantum physics, making them almost seem natural and intuitive. It includes the concepts of indetermination, the role of the observer, and the description of the quantum system involved in the double slit experiment.
We shall also explain the details of the measurement process as a particular case of quantum evolution, by the concept of decoherence ; but that only works "for all practical purposes", not completely in ontological terms (the question of what is real). The remaining discrepancy between the role of measurement in the principles of the theory, and its explanation as a physical process, is the subject of the debate on the interpretations of quantum physics.

See also:

Some notes on the articulations between quantum and classical physics : wave/particle duality ; what is energy in quantum physics and why it is conserved ; how is the indetermination of energy compatible with its conservation.
Interpretations of quantum physics with views and arguments on how to make sense of quantum states and measurement : "what is this ? what is the connection between these mathematical concepts and the physical reality ? where is reality supposed to be represented there ?"

Axioms and fundamental properties of quantum states

We can obtain the basic principles of quantum theory, from the classical probabilistic concepts we just developed, by slight modifications and specifications, as follows (you may ask "Why are things this way ?" well, we just know that they are this way because experience has confirmed it countless times) :

To every physical system considered within finite limits of available space and energy, is associated a natural number n called the "number of possible states" of the system; this number can become arbitrarily large when more space or energy is offered. (Of course there are no walls in space, and the concept of energy remains to be defined, so that these limitation conditions are not well-defined here, but this strange claim happens to be true in practice anyway).
This number n completely determines the geometric shape of the set B of all states of the system; it is a volume in an affine space with dimension n²-1, which we will call a "quantum n-states shape". (This is the set of positive semi-definite Hermitian matrices with trace 1, but we shall not make use of this definition in this presentation).
Some states are "pure" (those of rank 1), others are composite; the set of pure states forms a (2n-2)-dimensional surface S enveloping B. In the case n=2, S is a sphere in a Euclidean 3-dimensional space.
The whole set B is made of all barycenters of pure states with positive coefficients; in other words, B is a convex set (thus for n=2, this is a ball, that is the volume inside the sphere);
No pure state can be obtained as a barycenter of any list of several other states (you can check this for the sphere !)
The natural evolution of a physical system staying "inside its box" without external interactions, consists in a rotation of this volume which occurs in a continuous way along time, letting every pure state remain among pure states. Any pure state can be sent to any other pure state by such rotations.
Only if a system interacts with the environment (exchanging disturbance with it), then the evolution can take the form of some other affine transformation, shrinking it into itself (sending pure states into composite states).
For every state, we can define the number k of states it is made of (the rank of this Hermitian matrix), that is, the minimum number of pure states necessary for obtaining it as a barycenter (with positive coefficients, as always). The list of these pure states is not unique, but all the pure states from all possible decompositions of our state as a barycenter of k pure states, form the (2k-2)-dimensional surface of the pure states enveloping the only quantum k-states shape containing our state in its interior. (It can be obtained by evolution from a system of k possible states)
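For readers who want to connect the list above with the matrix definition mentioned in passing, here is a minimal sketch for n=2 only. It assumes the standard parametrization of a 2×2 density matrix by a point (x, y, z) of the unit ball; the function names are hypothetical, chosen for this illustration.

```python
import math

def density_matrix(r):
    """2x2 Hermitian matrix with trace 1 for a point r = (x, y, z) of the ball.
    Assumed parametrization: rho = (I + x*sx + y*sy + z*sz) / 2 (Pauli matrices)."""
    x, y, z = r
    return [[complex((1 + z) / 2), complex(x, -y) / 2],
            [complex(x, y) / 2, complex((1 - z) / 2)]]

def eigenvalues(r):
    """Eigenvalues of density_matrix(r), which are (1 + |r|)/2 and (1 - |r|)/2."""
    norm = math.sqrt(sum(c * c for c in r))
    return ((1 + norm) / 2, (1 - norm) / 2)

def rank(r, tol=1e-12):
    """Number k of pure states needed to obtain the state as a barycenter."""
    return sum(1 for ev in eigenvalues(r) if ev > tol)

def state_space_dimension(n):
    """Dimension n^2 - 1 of the affine space containing the n-states shape."""
    return n * n - 1

def pure_surface_dimension(n):
    """Dimension 2n - 2 of the surface S of pure states."""
    return 2 * n - 2

rho = density_matrix((0.0, 0.0, 0.5))
trace = rho[0][0] + rho[1][1]        # always 1
k_surface = rank((0.0, 0.0, 1.0))    # a point of the sphere: pure state
k_inside = rank((0.0, 0.0, 0.5))     # a point inside the ball: composite state
```

For n=2 the two dimension formulas give 3 and 2, that is, a ball in 3-dimensional space bounded by a sphere.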

Indeed consider the simplest example, n=2 : any point inside the sphere can be obtained as a barycenter of 2 "pure states", that is, points on the sphere. These are the 2 intersections of the sphere with any line going through the point. You can see that any point of the sphere fits with one of these possible decompositions.
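The decomposition along a line can be computed explicitly. The sketch below assumes the sphere has radius 1 and is centered at the origin; `chord_decomposition` is a hypothetical helper name.

```python
import math

def chord_decomposition(p, d):
    """Write the interior point p of the unit ball as a barycenter of the two
    points where the line through p with direction d meets the unit sphere.
    Returns ((s1, w1), (s2, w2)) with w1 + w2 = 1 and w1*s1 + w2*s2 = p."""
    dn = math.sqrt(sum(c * c for c in d))
    u = [c / dn for c in d]                      # unit direction of the line
    pu = sum(a * b for a, b in zip(p, u))
    pp = sum(a * a for a in p)
    # Solve |p + t*u|^2 = 1, i.e. t^2 + 2*(p.u)*t + (|p|^2 - 1) = 0
    root = math.sqrt(pu * pu - (pp - 1))
    t1, t2 = -pu + root, -pu - root              # t1 > 0 > t2 for interior p
    s1 = [a + t1 * b for a, b in zip(p, u)]
    s2 = [a + t2 * b for a, b in zip(p, u)]
    w1 = -t2 / (t1 - t2)                         # positive weights
    w2 = t1 / (t1 - t2)
    return (s1, w1), (s2, w2)

# The same state, decomposed along two different lines:
(a1, wa1), (a2, wa2) = chord_decomposition((0.0, 0.0, 0.5), (0.0, 0.0, 1.0))
(b1, wb1), (b2, wb2) = chord_decomposition((0.0, 0.0, 0.5), (1.0, 0.0, 0.0))
```

Along the vertical line the weights are 0.75 and 0.25 on the two poles; along the horizontal line they are 0.5 each. Both lists of pure states describe the same state, which illustrates that the decomposition is not unique.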

Let us focus on this case n=2 to examine how things work there in more detail.

The spin 1/2

The most natural case of a physical system with "2 possible states" is the spin 1/2 of a particle. The simplest and most common example of a particle with 2 states due to its spin 1/2 is the electron, so we will fix the discussion on it, but some other particles such as the proton (hydrogen nucleus), the neutron, and some other atoms, nuclei or ions, have this property too (it does not matter whether a particle is elementary or not).

What is a spin ? The first idea for describing a spin would be that of a rotating ball that must keep rotating because of the conservation of angular momentum. However a rotating ball has too many details that an electron does not have: we can draw a mark on the ball and see it moving around; the ball may stop spinning and come to rest, or spin at different speeds.
The electron, on the other hand, has no such details: it cannot stop spinning, and has no mark on its face that can be seen moving around. Its spin state only consists in the data of its angular momentum, and thus remains constant in time as long as it is not modified by interaction with the environment (namely, by the magnetic field). For any system, the angular momentum along a given axis can only vary by integer multiples of the reduced Planck constant. The electron has only 2 possible values of this angular momentum, ± 1/2 in units of the reduced Planck constant, thus with a difference of 1.
In order to measure the spin of an electron and get one of the two possibilities (clockwise vs. counterclockwise), we first need to choose the direction of the axis around which this spin will be measured. The probabilities of the results will of course depend on the axis chosen (as a continuous change of the chosen axis that ends up exchanging both ends will of course exchange both probabilities).

Before choosing an axis, any electron's spin is naturally in some state. Like any angular momentum, it is a pseudo-vector. This means it belongs to a 3-dimensional vector space, but its representation as a vector in our space depends on a convention of orientation of space, and is reversed when we change this convention. For example, the angular momentum of the Earth can be represented by a vector towards the North pole, but a representation by a vector towards the South pole would be an equally possible convention. We just have to fix the convention once and for all.

So, once this space orientation convention is fixed, the ball B of all spin states of an electron, whose surface is the sphere S of pure states, is figured by a ball in space.

Spin measurement

Let us describe measurements of this spin.
As before, each possible measurement result goes with a probability calculated as an affine function from B to real numbers, and more precisely into [0,1]. It can be any such function. So, it can be represented geometrically by the data of both parallel planes P₀ and P₁ where this affine function, extended to the whole space, would take the values 0 and 1 (so, outside B, and having B between them).
In the case of a binary (yes/no) measurement, the other possible result has the complementary probability (so that the sum is 1), represented with P₀ and P₁ exchanging their roles.
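As an illustration of this geometric description, the sketch below builds the affine probability function from the two parallel planes. It assumes B is the unit ball centered at the origin and that each plane is given by a common unit normal u and a signed distance from the center; the function name and its parameters are hypothetical.

```python
def probability_from_planes(u, d0, d1):
    """Affine function equal to 0 on the plane u.r = d0 and to 1 on u.r = d1.
    u is a unit vector. For the values to stay in [0, 1] over the whole unit
    ball, both planes must be outside it with the ball between them:
    d0 <= -1 and d1 >= 1."""
    if not (d0 <= -1 and d1 >= 1):
        raise ValueError("the planes must enclose the unit ball")

    def p(r):
        ur = sum(a * b for a, b in zip(u, r))
        return (ur - d0) / (d1 - d0)

    return p

# One result of a binary measurement: P0 at distance 1 (tangent), P1 at distance 3.
p_yes = probability_from_planes((0.0, 0.0, 1.0), -1.0, 3.0)
# The other result: same two planes with their roles exchanged.
p_no = probability_from_planes((0.0, 0.0, -1.0), -3.0, 1.0)

state = (0.3, 0.0, 0.4)
total = p_yes(state) + p_no(state)   # the two probabilities sum to 1
```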

Now that we have specified "what is to be measured" (the probabilities of measurement results as depending on the initial state), what can be the state of the system after the measurement ?

Contrary to classical physics, quantum physics does not admit a measurement of an elementary system without a physical interaction that disturbs it (there is no longer such a thing as a non-disturbing measurement). We need a measuring apparatus to interact with the system and let the result of the measurement appear in a macroscopic way, where its description can be summed up (approximated) in the form of the classical probabilities that we first presented, thanks to the process of decoherence.
We shall precisely describe the form of this necessary disturbance happening during measurement (effect of the physical processes making the measurement).

Instead of a non-disturbing measurement, we have the concept of a least-disturbing measurement. Let us describe its effects geometrically, for the spin of the electron.

The simplest case is the case of a complete measurement, that is where the probability 0 and 1 planes are tangent to S at two opposite points. This measurement collapses the spin onto the point of tangency which is the only pure state having the probability 1 of giving the observed result. As the two possible measurement results collapse the spin onto 2 opposite points, this is why we say that the "number of possible states" of the spin is 2.
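A complete measurement along an axis can be sketched as follows, assuming the ball has radius 1 so that the probability of a result is (1 + r·a)/2 for a state r and a unit axis a. The helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def complete_measurement(r, axis):
    """Possible results of a complete spin measurement along the unit vector
    `axis`, for a state r of the unit ball. Returns one
    (probability, collapsed_state) pair per result: the two collapsed
    states are the two opposite points of tangency."""
    a = tuple(axis)
    p_plus = (1 + dot(r, a)) / 2
    return [(p_plus, a), (1 - p_plus, tuple(-c for c in a))]

UP = (0.0, 0.0, 1.0)
RIGHT = (1.0, 0.0, 0.0)

sure = complete_measurement(UP, UP)          # spin up, measured up/down
even = complete_measurement(RIGHT, UP)       # spin rightwards, measured up/down
tilted_axis = (math.sin(math.pi / 3), 0.0, math.cos(math.pi / 3))
tilted = complete_measurement(UP, tilted_axis)   # axis tilted by 60 degrees
```

For the tilted axis the probability of the first result is 0.75, which is the square of the cosine of half the angle between the state and the axis.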

This collapsing effect works more generally for any case when P₀ is tangent to the sphere, disregarding whether P₁ is also tangent or not, and so collapses the spin onto the maximum probability point (opposite to the 0 probability one).

Indeed, we already explained with classical probabilities, that the meaning of a measurement, and thus its effect on the state of the system, does not change if the function that gave the probability of reaching it, was multiplied by a constant.

In this sense, just like in the classical case, the set of all measurements has the same geometrical shape (a ball) as the set of all states (and this correspondence also works for any other "number of states"). To see this, you just need to divide the probability function of a measurement, by its value at the center of S, which will thus become 1 (and divide again the result by 2 if you want it to give a meaningful probability, with values in [0,1] over the sphere). In a Cartesian coordinates system (for the 3-dimensional space containing the sphere), you just need to reinterpret the coefficients (a,b,c) of this function (x,y,z) -> ax+by+cz+1, as the coordinates of the measurement in a space of measurements.

In other words, a measurement, as specified by its zero-probability plane outside the sphere, will be represented by the point inside the sphere, on the line from the center and orthogonal to the plane, and at a distance from the center which is the inverse of the distance of the center to the plane (if the sphere has radius 1), and on the opposite side.
This way, each measurement is represented by the point where it sends the center of the sphere (the totally undetermined state) according to its least-disturbing effect.
Each pure measurement is figured as the element of the sphere where it has its maximal probability, while others are figured inside it.
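The correspondence between measurements and points of the ball can be checked numerically. This is a minimal sketch assuming a sphere of radius 1 centered at the origin, and an affine probability function written as f(r) = f_center + gradient·r; all names are hypothetical.

```python
import math

def measurement_point(f_center, gradient):
    """Point (a, b, c) representing a measurement result: the coefficients
    left after dividing the affine function by its value at the center."""
    if f_center <= 0:
        raise ValueError("the probability at the center must be positive")
    return tuple(g / f_center for g in gradient)

def zero_plane_distance(m):
    """Distance from the center to the plane where ax + by + cz + 1 = 0."""
    norm = math.sqrt(sum(c * c for c in m))
    return math.inf if norm == 0 else 1 / norm

def normalized_probability(m, r):
    """(ax + by + cz + 1) / 2, with values in [0, 1] when |m| <= 1."""
    return (1 + sum(a * b for a, b in zip(m, r))) / 2

# Example: the probability function f(x, y, z) = 0.3 + 0.1 z
m = measurement_point(0.3, (0.0, 0.0, 0.1))   # about (0, 0, 1/3)
d = zero_plane_distance(m)                    # about 3: the plane z = -3
```

Here the zero-probability plane z = -3 lies below the sphere while the representing point (0, 0, 1/3) lies above the center, at the inverse distance, as stated in the text.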

So, there are many other possible sorts of least-disturbing measurements: binary measurements where one possible result collapses the spin while the other doesn't; or where none does; measurements with arbitrary numbers of possible results, with arbitrary respective probability functions on B, provided that they are positive, affine, and that their sum is 1 all over B.

Weak measurements

Now let us describe other cases, when the measurement result does not provide certainty on the state of the system, i.e. where P₀ is not tangent to S but away from it. Then the effect is that of a projective transformation of the space that sends P₀ to infinity, and globally preserves B and S : each pure state becomes another pure state.
Only two pure states remain fixed (in the least-disturbing case): those that were nearest to and furthest from P₀.

(These projective transformations of the 3-dimensional space that preserve a sphere, are also those acting on the set of speeds considered as relatively to different observers according to Special Relativity theory: the elements of the sphere define the speed vectors whose length correspond to the speed of light, thus expressing the fact that going at the speed of light, is a property that does not depend on the movement of the observer that measures this speed.)
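The projective transformation can be written in coordinates. The sketch below assumes that the least-disturbing effect is the usual update rule in which the state is conjugated by the positive square root of the measurement operator (an assumption: the text does not name the rule). The measurement result is represented by a point m of the unit ball, with its zero-probability plane at m·r = -1. The function name is hypothetical.

```python
import math

def least_disturbing_update(r, m):
    """State after observing the result represented by the point m (|m| <= 1),
    starting from the state r of the unit ball."""
    mu = math.sqrt(sum(c * c for c in m))
    if mu == 0:
        return tuple(r)                    # this result carries no information
    u = [c / mu for c in m]
    r_par = sum(a * b for a, b in zip(r, u))            # component along m
    r_perp = [a - r_par * b for a, b in zip(r, u)]      # transverse component
    denom = 1 + mu * r_par                 # vanishes on the plane P0
    if denom == 0:
        raise ValueError("this result has probability 0 for this state")
    k = math.sqrt(max(0.0, 1 - mu * mu))
    return tuple(((mu + r_par) * b + k * c) / denom for b, c in zip(u, r_perp))

m = (0.0, 0.0, 0.5)
from_center = least_disturbing_update((0.0, 0.0, 0.0), m)   # the center goes to m
from_side = least_disturbing_update((1.0, 0.0, 0.0), m)     # pure stays pure
top = least_disturbing_update((0.0, 0.0, 1.0), m)           # fixed point
bottom = least_disturbing_update((0.0, 0.0, -1.0), m)       # fixed point
collapsed = least_disturbing_update((1.0, 0.0, 0.0), (0.0, 0.0, 1.0))  # tangent case
```

Along the axis of m, the formula (mu + r_par) / (1 + mu * r_par) has the same form as the relativistic composition of parallel speeds in units where the speed of light is 1, in line with the remark on Special Relativity.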

More remarks

We can see here that the concept of non-perturbing measurement cannot make sense in general: not all pure states (points of the sphere) can be preserved in such a projective transformation. Only two can, and so must be specified to make sense of the "non-perturbing" claim.

Popular accounts of quantum physics mention the Heisenberg inequalities. One of these inequalities says that the position and the momentum of a particle cannot both be determined: the more precisely one of these quantities is known, the less precisely the other is.
What we just explained about the spin already presents such an indetermination: it is neither possible to measure nor to predict the spin of the electron along several axes at the same time.

Does the observer actually affect what is being observed?

What we can say may depend on some specific aspects of the experiment considered. If we consider just one closed physical system, then we can say yes: physically, observing its state requires interacting with it, and thus affecting it. For example, if you have one electron and want to measure its spin in the up/down direction, you must interact with it. If its spin was previously rightwards (which means a 100% probability of finding it so if you measure it in the left/right direction), then the interaction required by the up/down measurement has the physical effect of destroying the left/right component of the spin, leaving equal chances of finding it leftwards or rightwards in a subsequent left/right measurement. This is also the case in the double-slit experiment : observing which slit a particle goes through requires some sort of physical interaction with this particle as it goes through. Such an explanation may turn out to be unsatisfactory for experiments where the effect on the probability of the final result is greater than what seems to be the "probability of physically affecting the system", for example when one slit is bigger than the other and you detect the particle going through the small slit. Other cases may be ambiguous, as quantum mechanics somehow mixes the change of state by interaction with its change as known probability, which can work without physical interaction, such as how the act of watching the weather forecast modifies the probability that it will rain tomorrow...
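The electron example can be followed step by step on the ball of states. This sketch assumes a unit ball, with the up/down and left/right directions taken as two perpendicular unit vectors; when the result of the up/down measurement is not read, the average of the two collapsed states is the orthogonal projection of the initial state onto the up/down axis. The names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prob_plus(r, axis):
    """Probability of finding the spin along +axis for the state r."""
    return (1 + dot(r, axis)) / 2

def after_unread_measurement(r, axis):
    """Average state after a complete measurement along `axis` whose result
    is ignored: p_plus * axis + p_minus * (-axis) = (r.axis) * axis."""
    s = dot(r, axis)
    return tuple(s * c for c in axis)

RIGHT = (1.0, 0.0, 0.0)
UP = (0.0, 0.0, 1.0)

before = prob_plus(RIGHT, RIGHT)                # 1.0: surely rightwards
state = after_unread_measurement(RIGHT, UP)     # the center of the ball
after = prob_plus(state, RIGHT)                 # 0.5: equal chances
```

The same value 0.5 is found if the up/down result is read, since each of the two collapsed states (up or down) is perpendicular to the left/right axis.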

Energy and evolution

The evolution of a physical system is determined by the energy differences between its possible states.
We will describe the situation in the case of the spin of the electron, but the same law applies to any other system as well. The explanation will be based on some concepts of classical mechanics. Many concepts of classical mechanics are no longer valid in quantum theory; however some properties, like those we will mention here, still apply somehow and can help to understand the situation intuitively.

The electron has a magnetic moment associated with its spin. This means that it behaves like a little magnet with the same orientation as its spin. Like any magnet, its interaction with an external magnetic field gives it a potential energy that is minimal when the magnetic moment is aligned with the magnetic field, and maximal when they are opposite. When the magnetic moment is not aligned with the magnetic field, the magnetic field exerts a torque on the magnet, which in the case of ordinary magnets pushes them towards the minimum energy configuration, aligned with the field. But the axis of the electron's spin is not like a fixed object that is turned in the way forces push to turn it. Instead, as it is defined by the angular momentum, the torque exerted by the magnetic field produces a gyroscopic precession of this spin around the direction of the magnetic field.

Now let us express the situation in the terms of quantum physics.

One of the Heisenberg inequalities says that the energy and the time cannot be both determined. Thus, whenever the energy of a system has an exact well-defined value, nothing can happen to it along time.

The spin has two possible states, and thus two possible values of the energy (when the environment is classically fixed). Each of the two pure states of the spin along the direction of the magnetic field has a well-defined value of the energy. For any other state of the spin, the energy is undetermined.
The measurement of the energy of the electron, coincides with the measurement of its spin along the direction of the magnetic field.
These two pure states of well-defined energy remain fixed in time, and give the axis of the rotation of the set B of all spin states along time.
The frequency of this rotation is proportional to the difference between the two possible values of the energy. This rotational movement of the spin, being also a rotation of the magnetic moment of the electron which affects the surrounding magnetic field, generates an electromagnetic wave. This is the frequency of the photon emitted by the electron, by which it will lose its energy in the long term, and reach its state of lowest energy.
But to say this, means that we don't consider the spin of the electron as an isolated system anymore.
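For the isolated spin, the rotation of the ball of states can be sketched as below. It assumes the rotation frequency is the energy difference divided by the Planck constant, and it takes the sense of rotation as a convention. The numerical example (a field of 1 tesla) and the names are illustrative assumptions; the constants are the standard reference values.

```python
import math

PLANCK_H = 6.62607015e-34          # Planck constant, J.s
BOHR_MAGNETON = 9.2740100783e-24   # J/T
ELECTRON_G = 2.0023                # magnitude of the electron g-factor, rounded

def rotation_frequency(delta_e):
    """Frequency (Hz) of the rotation of the ball of states, for an energy
    difference delta_e (J) between the two states of well-defined energy."""
    return delta_e / PLANCK_H

def rotate(r, axis, angle):
    """Rotate the state r around the unit vector `axis` by `angle` radians
    (Rodrigues formula). The two pure states +axis and -axis stay fixed."""
    kx, ky, kz = axis
    rx, ry, rz = r
    cross = (ky * rz - kz * ry, kz * rx - kx * rz, kx * ry - ky * rx)
    kr = kx * rx + ky * ry + kz * rz
    c, s = math.cos(angle), math.sin(angle)
    return tuple(a * c + b * s + k * kr * (1 - c)
                 for a, b, k in zip(r, cross, axis))

def state_at_time(r0, field_axis, delta_e, t):
    """State at time t of a spin starting at r0, in a fixed magnetic field."""
    return rotate(r0, field_axis, 2 * math.pi * rotation_frequency(delta_e) * t)

# Illustrative numbers: energy difference of the electron spin in a 1 T field.
delta_e = ELECTRON_G * BOHR_MAGNETON * 1.0
nu = rotation_frequency(delta_e)                 # roughly 28 GHz
quarter_turn = state_at_time((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), delta_e, 0.25 / nu)
pole = state_at_time((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), delta_e, 0.25 / nu)
```

A state aligned with the field does not move, while a state perpendicular to it turns around the field axis, which is the precession described above.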

Other comments on energy in quantum physics

The photon

The quantum theory of electromagnetism is very complex with strange properties, but here we will focus on the simple case of a single photon with a well-defined frequency and propagating in a unique direction.

Like the electron, the photon has a spin, also called polarization, whose number of possible states is 2, even though the two values of its angular momentum are no longer ±1/2 but ±1. Unlike the electron, whose spin could be measured along any axis in space, the spin of the photon is only defined with respect to the axis which is the direction of propagation. Still, it is possible to measure this spin along any other direction of its abstract sphere of states, but the (below described) correspondence between these abstract directions and our usual space-time differs from the spin 1/2 case; while the angular momentum that a photon may carry with respect to other directions takes the form of the spatial configuration of the wave (position and direction of propagation) and will not be discussed here.

We can first understand the polarization in the case of a classical electromagnetic wave: this is a transverse wave, which means that the oscillation of the electric field is perpendicular to the direction of propagation (and the magnetic field too, which at every point of space-time, coincides with the electric field turned 90° around the direction of propagation).

On the abstract sphere of states of the photon's polarization, let us mark 6 points, configured like the centers of faces of a cube containing this abstract sphere; as a cube defines a coordinates system, so these points are expressed by their 3 coordinates.
Imagine that the photon propagates horizontally, so that the oscillation of the field happens in a vertical plane.

Let us also represent in the last column of the following table, another case of a 2-states system: the two possible states of the electromagnetic field that correspond to the undetermined presence of a given circularly polarized photon.

Abstract Position	Coordinates	Polarization type for a photon	Possibly absent circular photon
Left	(-1,0,0)	Linear, horizontal	Electric field to the left
Right	(1,0,0)	Linear, vertical	Electric field to the right
Front	(0,-1,0)	Linear, diagonal	Electric field to the top
Back	(0,1,0)	Linear, other diagonal	Electric field to the bottom
Top	(0,0,1)	Circular clockwise	One circular photon
Bottom	(0,0,-1)	Circular counter-clockwise	Zero photon
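The polarization column of this table can be reproduced from the two complex components of the transverse electric field. The sketch below uses the usual normalized Stokes parameters as coordinates on the abstract sphere. This is an assumption: the sign of each axis, and which circular polarization is called clockwise, are conventions, so the six points come out the same as in the table only up to such relabelling. The function name is hypothetical.

```python
import math

def poincare_point(ex, ey):
    """Point of the abstract sphere for a field with complex horizontal and
    vertical components (ex, ey), using normalized Stokes parameters."""
    ex, ey = complex(ex), complex(ey)
    n = abs(ex) ** 2 + abs(ey) ** 2
    c = ex.conjugate() * ey
    return ((abs(ex) ** 2 - abs(ey) ** 2) / n, 2 * c.real / n, 2 * c.imag / n)

horizontal = poincare_point(1, 0)
vertical = poincare_point(0, 1)
diagonal = poincare_point(1, 1)
other_diagonal = poincare_point(1, -1)
circular = poincare_point(1, 1j)
other_circular = poincare_point(1, -1j)

# A linear polarization at 30 degrees from the horizontal:
theta = math.pi / 6
tilted = poincare_point(math.cos(theta), math.sin(theta))
```

Horizontal and vertical polarizations, which are at 90° from each other in space, come out as opposite points of the sphere, and the diagonal ones are at 90° from them on the sphere: a linear polarization at angle theta is at angle 2·theta on the equator. This doubling of angles is how the correspondence with ordinary space differs from the spin 1/2 case.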

(The situation would be the same for the presence/absence of an electron as here with a photon, except that there is no direct measurement possible for this system in any other direction of that sphere than the presence/absence direction, in contrast with the case of the photon where such a measurement can be done in terms of the electric field. In other words, unlike the photon, it is not possible to "see" any oscillation in the electron, despite the fact that such an oscillation somehow exists relatively to some contexts such as the double-slit experiment, see below)

Note that in the case of the possibly absent photon, the electric field oscillates circularly at the frequency usually said to be the frequency of the photon, because each of both poles of the sphere (one photon/zero photon) has a different well-defined energy, which makes the sphere of states rotate around this axis at the frequency defined by the energy difference, which is the energy of the photon.
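This rotation can be made explicit for a pure state of the possibly absent photon. The notation is an assumption made for this sketch: c_0 and c_1 are the amplitudes of the zero-photon and one-photon states, and (x, y, z) are the coordinates on the sphere defined by writing the state as (I + x σ_x + y σ_y + z σ_z)/2 with the one-photon state at the top. With another choice of conventions the sense of rotation is reversed.

```latex
|\psi(t)\rangle = c_0\,|0\rangle + c_1\,e^{-i\omega t}\,|1\rangle,
\qquad \omega = \frac{E_1 - E_0}{\hbar}

z = |c_1|^2 - |c_0|^2 \quad \text{(constant in time)},
\qquad
x(t) - i\,y(t) = 2\,c_1\,\overline{c_0}\;e^{-i\omega t}
```

The height z does not change, while the horizontal coordinates turn around the vertical axis at the angular frequency ω fixed by the energy difference, which is the energy of the photon.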

Also note that we have a nice correspondence between the sphere of spin states of the electron precessing in the magnetic field, and the sphere of states for the undetermined presence of a photon: this is the way the electron comes down to its minimum energy level by emitting a photon and thus transferring its state to it.
We described the case of the circularly polarized photon. It is what would be emitted by the spin of the electron in the direction of the magnetic field, in the case the photon would be detected in this direction, as the rotation of the electric field follows the rotation of the spin.

But the photon is emitted in all directions, so that if we only try to detect it in one direction, we may not get it as it may be going to another direction instead. In other words, the detection of the photon in a direction is correlated to its non-detection in another direction.
So, let us consider a photon detector all around the electron, with a way out in some angular area around the direction of the magnetic field.
The fact that no photon is detected around, defines a partial measurement with respect to the initial spin state of the electron: it is the sure outcome if the electron was already in its minimal energy level, but it also has a chance to be so if it was in the maximal energy level, as the photon can go by the exit (circularly polarized). Thus this case of absence of any photon emitted in other directions, makes a physical evolution defined by a projective transformation from the initial spin state of the electron to the final state of presence/absence of the circular photon emitted in the direction of the magnetic field; this transformation maps pure states into pure states.

Or, if we don't wait enough time to let the electron come down to its minimum energy level for sure, then the presence of an emitted photon will be correlated with the remaining spin state of the electron.

