Let us introduce the features of this paradox, well known to quantum physicists, for readers not familiar with the subject.

### No faster-than-light communication

Special Relativity forbids any possibility of a (physical) causal influence faster than light (of a kind that would allow faster-than-light communication), for the following reason.
Let us call any 2 events (points of space-time) "independent" if they are separated by a space-like interval, which means that no imaginary moving point starting at one event and going "no faster than light in a vacuum" can ever reach the other (this concept of "no faster than light in a vacuum" is physically meaningful independently of a definition of speed, as it is measurable in principle by sending a beam of light through an evacuated tube laid in the same direction). Then:
• According to the geometry of space-time in Relativity, between any two independent events, it makes no sense to say that one physically precedes the other, because the laws of physics work the same when both events are exchanged. Precisely, any relation R between events a and b that is a candidate to be called a "time order" relation would have to be either symmetric between independent events (aRb ⇔ bRa), or only defined relative to a "viewpoint" (any kind of observer or measurement system by which, for example, the "times" of events might be defined and compared). In the latter case, any 2 independent events can be switched (replacing aRb by bRa) by taking another "viewpoint" that is equally valid for physics, in the sense that all laws of physics work the same for both viewpoints.
• Thus, if a law of physics could ever provide any way for (what happens at) an event a to causally influence (send a signal to) an independent event b, then b (or an event coming just after it and thus able to use information from it) could influence a as well by the same method (taking the relation "a can causally influence b" in the role of time order).
• But this is logically impossible, as it would lead to time contradictions (as with the idea of trying to change the past).
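
The frame dependence of time order between independent events can be illustrated numerically. The following is only a minimal sketch: it assumes one space dimension, units where c = 1, and the standard Lorentz boost formula t' = γ(t − vx). The events `a`, `b`, `c` and the helper names are hypothetical choices made for this illustration.

```python
import math

def boost_time(t, x, v):
    """Time coordinate of the event (t, x) in a frame moving at velocity v (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)

def time_order(p, q, v):
    """Return t_q - t_p as measured in the frame moving at velocity v."""
    return boost_time(*q, v) - boost_time(*p, v)

# Hypothetical events, written as (t, x).
a = (0.0, 0.0)
b = (1.0, 2.0)   # space-like separated from a, since |dx| > |dt|
c = (2.0, 1.0)   # time-like separated from a, since |dt| > |dx|

print(time_order(a, b, 0.0))   # positive: a comes before b in this frame
print(time_order(a, b, 0.8))   # negative: b comes before a in this frame
print(time_order(a, c, 0.8))   # positive: the order of a and c does not flip
```

For the space-like pair the sign of the time difference depends on the chosen frame, while for the time-like pair it stays positive for every velocity below that of light.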

### Description of a quantum correlation experiment

Let us give the simplest example of a scenario involving a quantum correlation, which is possible in principle according to quantum theory, and which shows how the physical universe as we know it cannot be explained by ideas of classical (locally causal) realism with its classical correlations. Of course the experiment has not actually been done in this particular form (since nobody has gone to Mars yet!), but rather similar experiments have been done in diverse ways (with measuring devices instead of human observers), always precisely confirming the predictions of quantum theory. We thus cannot find any trace of a limit of applicability of quantum theory that would prevent it from working in this scenario as well.

From a quantum process (such as spatially separating 2 electrons from a pair) we can produce a pair of 2 crosses with the following property of pure quantum correlation. At first, both crosses are attached together, oriented like × and +, so that they form a star with 8 vertices. On each of these 8 vertices is a bulb.
One experimenter keeps the + on Earth; another one takes the × with him to Mars.
Each experimenter is free to wait any amount of time; then, at any time, he can select one of the two axes of his cross. As soon as he selects one axis, one and only one of the two bulbs at the ends of this axis lights up.

Whatever axis is chosen, both of its ends have the same 50% probability of lighting up. But whatever the experimenters choose, there is about an 85% probability that the two vertices that light up (one on Earth, the other on Mars) were neighbors before separation. (The exact maximum value of this probability that quantum theory permits in perfect experimental conditions is (1+1/√2)/2 ≈ 0.854.)
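
These numbers can be reproduced with a small sketch. It assumes a singlet-type correlation in which the joint probability that Earth vertex u and Mars vertex v both light up is (1 + cos(u − v))/4, with vertices labelled by their angle on the original star. That formula, the angle labelling, and the function names are assumptions made for this illustration, chosen to agree with the 50% and 85% figures quoted above.

```python
import math

# Vertex angles in degrees on the original 8-pointed star.
EARTH_AXES = {"horizontal": (0, 180), "vertical": (90, 270)}      # the + cross
MARS_AXES = {"diagonal_1": (45, 225), "diagonal_2": (135, 315)}   # the x cross

def joint_probability(u, v):
    """Assumed probability that Earth vertex u and Mars vertex v both light up."""
    return (1.0 + math.cos(math.radians(u - v))) / 4.0

def are_neighbors(u, v):
    """True if the two vertices were 45 degrees apart on the star."""
    d = abs(u - v) % 360
    return min(d, 360 - d) == 45

def neighbor_probability(earth_axis, mars_axis):
    """Probability that the two lit vertices were neighbors, for a given pair of axes."""
    return sum(joint_probability(u, v)
               for u in EARTH_AXES[earth_axis]
               for v in MARS_AXES[mars_axis]
               if are_neighbors(u, v))

for e in EARTH_AXES:
    for m in MARS_AXES:
        print(e, m, round(neighbor_probability(e, m), 4))

# The closed form quoted in the text, for comparison.
print(round((1 + 1 / math.sqrt(2)) / 2, 4))
```

Under this assumed model the neighbor probability is the same for all four combinations of axes and coincides with cos²(π/8), which is another way of writing (1+1/√2)/2.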

This remains true whether or not the experimenters follow any strategy regarding the time and choice of the axis they select. It does not depend on whether one experimenter selects his axis before or after the other, or before or after he could "see" (get information about) the other experimenter's selection, or on whether they do it at exactly "the same time" (relative to whatever frame of reference).

### Discussion

This can equally well be interpreted by considering the two measurements as happening in either order:
Assuming one cross is measured first along one of its axes, it has a 50% chance of giving either result. After this, the chances for the other cross (for either possible choice of direction) switch to 85% or 15% depending on the result of the first measurement. But we should point out that the same experiments, with the same effective chances of results, can equally be interpreted by saying that the crosses are measured in the other order: both order assumptions give the same effective predictions, so that no experiment can distinguish between them.
We can compare this with the case of classical correlation, where both objects are prepared by picking at random (with given probabilities) one of a given list of ways of preparing both objects together to react in a deterministic manner. Then we can simply say that the observation of one system affects not the other system, but only what we know of the other system (thus the probabilities we can tell), making obvious the absence of faster-than-light communication, and the fact that both possible orders between measurements are mere interpretations, not a part of (and thus not affecting) the physical reality.
In fact, as we made clear by the way we introduced quantum processes by a mathematical structure similar to that of classical probabilistic processes, the mathematical language of quantum physics does not see any "fundamental difference" between classical correlations and quantum correlations (entanglement). In other words, the effective change that "occurs" (the content of the "action at a distance"), that is, the change of the quantum state, looks mathematically similar to the subjective change of expectation of future measurement results, for an observer learning the result of the first observation. Except that... it would be mathematically incoherent to consider this information, which induces a change of expectation, as the discovery of a reality that existed before the measurement was actually done.

Namely, in the above scenario of measurements, for any process (method of preparation) produced by a classical correlation (i.e. if the pair of crosses was prepared by a random choice of specific combinations of prepared reactions to each selection, which observations simply discover), you cannot do better than a 75% probability of having neighboring vertices light up. This 75% is obtained by preparing the crosses so that the vertices that would light up if their axis is selected are 4 consecutive ones, with the series of 4 consecutive vertices chosen at random among the 8 possible series.
But the 85% chance, which can be obtained from a quantum process, cannot be explained saying that each experimenter discovers the answer locally determined by some hidden reality (a local hidden cause that is ready to answer either choice of measurement in a determined way).
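
The 75% classical limit can be checked by brute force. The sketch below assumes that each local hidden "plan" fixes in advance which vertex lights up on each axis, that the two axes on each side are selected with equal frequency, and that vertices are labelled by their angle on the original star; the names are hypothetical. Since a random mixture of plans cannot beat the best single plan, it is enough to enumerate the 16 deterministic ones.

```python
from itertools import product

# Each axis is a pair of opposite vertices, labelled by angle in degrees.
EARTH_AXES = [(0, 180), (90, 270)]     # the + cross
MARS_AXES = [(45, 225), (135, 315)]    # the x cross

def are_neighbors(u, v):
    """True if the two vertices were 45 degrees apart on the star."""
    d = abs(u - v) % 360
    return min(d, 360 - d) == 45

def success_rate(earth_plan, mars_plan):
    """Fraction of the 4 axis pairs for which the predetermined lit vertices are neighbors.

    A plan holds one predetermined vertex per axis of the corresponding cross.
    """
    hits = sum(are_neighbors(u, v) for u in earth_plan for v in mars_plan)
    return hits / 4

def best_classical_rate():
    """Maximum success rate over all 4 x 4 deterministic plans."""
    plans = product(product(*EARTH_AXES), product(*MARS_AXES))
    return max(success_rate(e, m) for e, m in plans)

print(best_classical_rate())              # prints 0.75
print(success_rate((0, 90), (45, 135)))   # 4 consecutive vertices: prints 0.75
```

No plan reaches 4 hits out of 4: requiring all four axis pairs to give neighbors would force the vertices 0°, 45°, 90°, 135° and then 180° on the same axis as 0°, which a single predetermined answer per axis cannot satisfy.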

Other examples of similar experiments, elsewhere on the web, with the advantage of working with some 100% certainties (instead of arguing on precise values of probabilities of uncertain results):

### "Spooky action at a distance"

This experiment does not provide any means to send information faster than light: one experimenter's choice of axis has no power to choose the result on this axis, and thus leaves the other experimenter with the same 50% probability for each possible result. Since these 50% probabilities for the second experimenter remain unchanged by whatever the first experimenter chooses (the choice of axis), this first choice of axis cannot constitute a means to send any message (even mixed with noise) to the second observer.
However, despite the strict absence of any effective means to send information faster than light, this phenomenon has some paradoxical properties: a sort of conspiracy, or game of hide and seek, around the question of whether or not any hidden signal goes faster than light, even while this cannot be used to effectively send a signal in practice.
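
The absence of signalling can be made concrete with a short sketch. It assumes the same kind of singlet-type joint probability, (1 + cos(u − v))/4 for Earth vertex u and Mars vertex v labelled by angle; this formula and the function names are assumptions made for illustration, consistent with the 50% and 85% figures of the text.

```python
import math

EARTH_AXES = {"horizontal": (0, 180), "vertical": (90, 270)}
MARS_AXES = {"diagonal_1": (45, 225), "diagonal_2": (135, 315)}

def joint_probability(u, v):
    """Assumed probability that Earth vertex u and Mars vertex v both light up."""
    return (1.0 + math.cos(math.radians(u - v))) / 4.0

def mars_marginal(v, earth_axis):
    """Chance that Mars vertex v lights up when the Earth result is unknown."""
    return sum(joint_probability(u, v) for u in EARTH_AXES[earth_axis])

def mars_conditional(v, u):
    """Chance that Mars vertex v lights up once Earth vertex u is known to have lit up."""
    opposite = (v + 180) % 360   # the other end of the same Mars axis
    total = joint_probability(u, v) + joint_probability(u, opposite)
    return joint_probability(u, v) / total

# Marginals on Mars for each possible Earth axis: the axis choice changes nothing.
for axis in MARS_AXES.values():
    for v in axis:
        print(v, [round(mars_marginal(v, e), 3) for e in EARTH_AXES])

# Conditional chances only become available once the Earth result is communicated.
print(round(mars_conditional(45, 0), 3), round(mars_conditional(225, 0), 3))
```

In this model the marginal on Mars is one half whichever axis is selected on Earth, while the conditional chances move to roughly 85% and 15% only for someone who already knows the Earth result, which has to travel by ordinary means.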

If we consider things from the viewpoint that there is a hidden definite relation of simultaneity (a division of space-time into space-like slices, telling among any two events which event happens before which, even if they are separated by a space-like interval), then the EPR paradox can be interpreted as follows:
The first measurement done produces its random result with probabilities given by the state of the local system as it "globally" was (in the above example, it is probability 0.5 each), but, immediately (in the sense of this absolute simultaneity) this result affects the states of any other systems (probabilities of results if measured) that were correlated to the first.
How does it affect the other system? Assume the first experimenter chose the vertical axis. Then the second one, no matter which diagonal he chooses, has only a 15% chance for his result to differ from the first in terms of the up/down description. But a 15% chance to differ if choosing one diagonal, plus a 15% chance to differ if choosing the other, gives at most a 30% chance for the result to differ from one choice of diagonal to the other. If instead the first experimenter chose the horizontal axis, then the same argument applies in terms of right/left, which means at least a 70% chance to differ in terms of up/down, in contradiction with the previous case. It is as if the first experimenter actually acted at a distance, on ... the correlation between what the second measurement would be for one choice of direction, and what it would be for the other. Then, what makes this "not really an action at a distance" is the fact that the first observer could not choose the result of his observation, which was instead given "at random" strictly following its probability law, so that the probabilities of results for the second measurement remain unchanged subjectively to an observer ignoring the result of the first measurement. Namely, as long as the result of the measurement is not specified, the "effect" on the states of other systems at a distance is neither the effect coming from one possible result, nor the one coming from the other possible result, but the average between them (barycenter weighted by probabilities), which is, in fact, the preservation of the state they had before the measurement.
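
The arithmetic behind the 30% and 70% bounds can be spelled out in a short sketch. It assumes that the 15% figure is 1 − cos²(π/8), and it labels the Mars vertices by angle with hypothetical (up/down, right/left) descriptions; the bounds themselves use only the elementary fact that the chance of "A or B" is at most the sum of the chances of A and of B.

```python
import math

P_DIFFER = 1.0 - math.cos(math.pi / 8) ** 2   # about 0.15

# Mars vertices (angle in degrees) with their (up/down, right/left) descriptions.
DIAGONAL_1 = {45: ("up", "right"), 225: ("down", "left")}
DIAGONAL_2 = {135: ("up", "left"), 315: ("down", "right")}

def differ(v1, v2, index):
    """True if the two hypothetical results differ in description number `index`."""
    return DIAGONAL_1[v1][index] != DIAGONAL_2[v2][index]

# For every pair of hypothetical results on the two diagonals,
# differing in up/down is the same as agreeing in right/left.
complementary = all(differ(v1, v2, 0) != differ(v1, v2, 1)
                    for v1 in DIAGONAL_1 for v2 in DIAGONAL_2)

# Vertical axis measured first: each diagonal disagrees with it in up/down with
# chance P_DIFFER, so the two diagonals differ from each other with at most twice that.
upper_bound_if_vertical = 2 * P_DIFFER

# Horizontal axis measured first: same bound in right/left terms, which by the
# complementarity above is a lower bound on differing in up/down.
lower_bound_if_horizontal = 1 - 2 * P_DIFFER

print(complementary)
print(round(upper_bound_if_vertical, 3), round(lower_bound_if_horizontal, 3))
```

The two bounds do not overlap, so a single predetermined pair of answers for the two diagonals cannot satisfy both cases at once.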
As for the "effect" that is "transmitted at a distance", i.e. "the chances for the result to differ from one choice of diagonal to the other", it is not really an effect because it is not measurable (as only one diagonal can be measured). So, the quantum correlation, while suggesting a sort of interconnectedness between distant places, does not provide any means to transmit information faster than light, because the formalism of quantum physics expresses it in the same language (as if it had the same nature) as classical correlation.

### The notion of counterfactual definiteness

Counterfactual definiteness is the idea that there would be a (hidden) reality of what would have been the result of a measurement other than (and physically no more doable after) the one actually made.

The EPR paradox shows that the probabilities of behavior of quantum-correlated systems cannot be explained as a hidden (random choice of) predetermination of all measurement results that was fixed once and for all, prior to both measurements. This holds at least if we admit the "no-conspiracy" hypothesis, that is: this determination was not prepared by an entity that could either control or predict the choices made by experimenters. We shall admit this no-conspiracy hypothesis for the rest of this discussion. Thus, the idea of counterfactual definiteness (predetermination of results) cannot be valid in a global manner (independently of choices and of time order between them).
Also, it is remarkable that the mathematical formalism of quantum theory does not itself contain any trace of counterfactual definiteness (as its predictions are probabilistic, where probabilities are computed as numbers).

Now we are going to enter into some more logical details: is it still possible to defend (find a logical possibility for) a partial applicability of the concept of counterfactual definiteness? For example, it might hold not as definite but as probabilistic information, or relative to the time order between choices, or relative to the choice made by the other experimenter, or relative to either measurement result.

First thought experiment : imagine that the × is measured first along one axis, but that the counterfactual measurement (what would have been found along the other axis) was made too; and that both results are "top". Knowing these 2 data, what probabilities of expectation can we give to the measurement results on the other system ?

(to be continued)

(Wikipedia : Counterfactual definiteness)

Related pages:
Introduction to quantum physics: states, correlations and measurements (describes the mathematical structure of quantum physics, and how it coherently provides such predictions)