Special Relativity theory made intuitive


0. Introduction

Special Relativity and General Relativity are two theories of physics that explain our usual experience of (3-dimensional) "space" and (1-dimensional) "time" as dimensions that a given observer perceives differently, but which belong to one physical reality: a 4-dimensional geometrical space called "space-time", whose points are the events. Roughly speaking, space-time has "4 dimensions = 3 space dimensions + 1 time dimension", as each event can be specified by 4 coordinates = 3 of "space position" + one of "time". But, as in usual space geometry, such a split of dimensions into a list of coordinates is not given by nature: it depends on a conventional choice of coordinate system, among several possible ones that are each convenient in one way or another. Different methods of observation (choices of observers, measuring instruments and protocols) or of theoretical representation may lead to different ways of attributing a "position" and a "time" to each event, putting space-time in correspondence with a Cartesian product of a 3-dimensional space with a time line.

Independently of any such artifacts, Relativity theories describe space-time in itself as a mathematical system very similar to a 4-dimensional Euclidean space (the straightforward generalization of our usual space to 4D), with a few differences. Because of these differences, it is regarded as another geometry: the geometry of Minkowski. Still, space intuition (thinking of physical objects as timeless, i.e. fixed) is our best natural intuition to represent space-time and express physics there.
Like Euclidean spaces, Minkowski spaces can be similarly conceived and formalized with any dimension n+1 (for n "space dimensions" + 1 time dimension), starting with n=1 (a Minkowski plane), and visually imagined in cases n= 1 or 2 (which suffice to understand most situations in higher dimensional spaces, where some dimensions do not intervene and may be forgotten, so that the 4-dimensionality of space-time is not really a problem).

Similarities and differences between lengths and times

As for any coordinate of a coordinate system, Relativity theories reject the classical idea of time as an invariant function from space-time to a "time line". (Only quantum physics provides a questionable metaphysical argument for reintroducing a universal time parameter, which, in the current state of science, is not measurable anyway.)
Instead, times have essentially the same nature as distances: amounts of time (durations) are distances (or lengths) in the time dimension of space-time. Indeed in physics, values of space distances and time intervals are convertible into each other by the constant coefficient c = 299,792,458 m/s, which is "the speed of light in vacuum". However we shall still formally treat times (durations) as a different kind of quantity from space distances, because

Time "feels different" from space ;
Usual time intervals are much larger than usual space widths (as converted by c), so the explicit use of c helps to keep track of the articulation between predictions of Relativity and the Galilean space-time, which approximates the predictions of Relativity in the limit case of small speeds;
This formal distinction of space and time quantities, will provide a formal tool for "magically" switching between the geometries of Euclid and Minkowski.
In Minkowski geometry, the entanglement of time with space dimensions is not as complete as in Euclidean geometry.

Indeed, the geometry of Minkowski makes an absolute distinction between types of directions (intervals separating 2 events): between space intervals (in space-like direction, intuitively "faster than light") and time intervals (in time-like direction, "slower than light"). So, the measures of space and time intervals must be defined separately. This makes it possible to fix a time order, that is an order relation preserved by rotations, which orients both time intervals and the limit case of light intervals ("at the speed of light"). This order physically defines the causal dependence between events.

Theories list

General Relativity extends the concepts of Special Relativity to include the description of gravitation.

Or, rather than exactly being theories, they are combinations of theories and concepts which can be classified as follows (part of a larger classification of physical theories):

	Usual space	Special Relativity	General Relativity
Geometry	3D Euclidean geometry = Affine geometry + positive dot product	4D Minkowski geometry: = Affine geometry (= flat) + dot product with signature (3,1)	4D Lorentzian manifold: space with variable curvature, gives Minkowski geometry in local approximations
Perception	Split as: 2D vision + 1D depth	Split as: 3D space + 1D time. Time measures are lengths of lines. Small speeds (v << c) approximately give Galilean space-time.
Mechanics	Law of equilibrium : minimal potential energy	Relativistic mechanics: Least Action Principle, conservation laws, E=mc²... v << c : classical mechanics.	Einstein field equation deduced from Einstein–Hilbert action. v << c : Newton's gravitation.

A possible path to the Minkowski geometry consists in rigorous mathematics (an axiomatic theory), ignoring any connection with physics and time perceptions there.

But let us present another path that is shorter and intuitive (somehow "magical"), where Minkowski geometry is obtained by "reversing" the "effects" of Euclidean geometry. Such a reversion can be done either independently (by pure geometry), or inspired by the split of space-time into space and time.
The steps will be the following.

How an (n+1)-dimensional space can be perceived as an n-dimensional space where things evolve in time
The Relativity principle
Relativistic "effects" vs. the classical (Galilean) approximation;
The Minkowski geometry, which can be introduced in 3 ways (formally, by figures or in a magical analytic way) giving the same "relativistic effects" on space and time experience (away from classical space-time) as those from a space-time with Euclidean geometry, but with opposite sign;
The qualitatively different properties of Minkowski geometry (causality order, direct involvement of light or light speed) and their effects on actual space and time perceptions (splitting space and time otherwise than in 1.);
Law of equilibrium and relativistic mechanics.

1. How perceptions can split a space into space and time.

Articulating intuitive representations

In this section, we shall express the logical articulation between

space-time geometry (as a mathematical theory aiming to describe the physical world in itself)
the subjective perceptions of time and space by individual observers.

We are already used to perceiving the ordinary 3-dimensional Euclidean space, as split between the following dimensions:

Vision (2-dimensional)
Depth (1-dimensional)

We still understand it as one space, through our skill of articulating these 2 perceptions in our imagination when objects are rotated, reflecting the change of direction of view to an object which divides its dimensions between both perceptions.
In principle, similar articulations of the intuitions of space and time representing a given phenomenon may form an understanding of space-time. The work of training our intuition to properly figure out such transformations may ultimately be useful, but for lack of daily experience to support it, it would be too hard a challenge for a first learning step: an awkward choice unfortunately followed by traditional courses on the topic.

Instead, let us represent space-time by our usual intuition of space: imagine physical reality as a "timeless" world extended in an (n+1)-dimensional space (playing the role of "space-time"). Then, time will be introduced as a feeling of observers living "outside" that reality, which they perceive as an n-dimensional world evolving along their subjective (one-dimensional) "time".

This is not meant as a philosophical proposition on the nature of time (any ontological question is off topic), but is chosen for the following reasons:

The geometric view forms a clearer, more intuitive initiation to the mathematical structure of the theory, in better conformity with how Special Relativity effectively serves as a foundation for the next steps of theoretical physics (relativistic mechanics, electrodynamics, general relativity and quantum field theory).
Our form of introduction of time is the more precise way to give Relativity its quality of a physical theory, in the logical positivist sense of a prediction tool about the objects of our perceptions: a theory needs articulations between its mathematical content and the familiar language of perception, in order to express predictions about perceptions, with clear criteria of experimental verification.

Basic statements in geometric terms

As a trick to start understanding Relativity while bypassing the mathematical effort of learning a different geometry, let us first focus on a restricted set of concepts and phenomena using aspects of space-time geometry equally present in Euclidean geometry (starting with arc lengths, then curvature...). For these aspects, the description becomes intuitive but remains conceptually correct (despite its oddity) if, in the role of space-time, we imagine a Euclidean space instead of the actual Minkowski geometry, and we ignore any experiments involving light itself (since light has special properties in Minkowski geometry with no Euclidean analogue).
Here are the core ideas, from which others will be derived:

Any "particle", and generally any "small persisting object" (as described by usual space and time perceptions), is, in space-time reality, extended as a curve in the time dimension, called its world line.

For example, the space-time extension of a human with height 1.8 meters, who can perceive time with a minimal resolution of about 0.06 second, can be seen as a mere line in space-time to a very good approximation: this time resolution (the length resolution of his world line) is larger than his size (the width of his world line) by a factor of about

c × 0.06 s / 1.8 m = c/(108 km/h) ≈ 10⁷.

Even the Earth can be seen as "small", as its size is less than one tenth of a light-second.
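The two orders of magnitude above can be checked with a few lines of arithmetic. The following is only a sketch: the function names are illustrative, and the mean diameter of the Earth (about 12,742 km) is a value assumed here, not given in the text.

```python
# Compare the "length resolution" of a world line (a time resolution converted
# into a distance by c) with its "width" (the spatial size of the object).

C = 299_792_458.0  # speed of light in m/s, the conversion factor quoted in the text


def thinness_ratio(time_resolution_s: float, size_m: float) -> float:
    """Ratio of (c * time resolution) to the spatial size; dimensionless."""
    return C * time_resolution_s / size_m


def light_travel_time(size_m: float) -> float:
    """Time in seconds that light needs to cross the distance size_m."""
    return size_m / C


# Human: 0.06 s of time resolution, 1.8 m of height (values from the text).
human_ratio = thinness_ratio(0.06, 1.8)

# Earth: mean diameter, an assumed value of about 12,742 km.
EARTH_DIAMETER_M = 1.2742e7
earth_light_seconds = light_travel_time(EARTH_DIAMETER_M)

print(f"human world line: length/width ratio = {human_ratio:.3e}")
print(f"Earth: {earth_light_seconds:.4f} light-seconds across")
```

The first number comes out just below 10⁷, and the second stays well below 0.1 light-second.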

A clock is such an object, whose world line displays a measure of its structure as (a segment of) an oriented affine line (= 1-dimensional affine space) whose lengths are geometrically defined as the arc lengths of this line in space-time, typed as amounts of time (= counted in seconds or any other "time unit"; this measure may consist in successive marks counting regular intervals of arc lengths).
The segment of affine line made of the world line of an observer, is identified with the segment of his personal (subjective) time line. The time lines of different observers are thus basically independent of each other (they may only compare their lengths and be related by geometric structures of space-time).

To describe this intuitively: "during his subjective time", the observer's consciousness is traveling through space-time (which is "timeless"), along his world line, at a constant "speed", like a train would follow a railway at constant speed or like a light impulse follows an optical fiber. Thus, when an observer keeps a clock with him (their world lines coincide), they agree about "time" (giving it the same affine line structure) : he sees his clock proceed at the "normal" regular rhythm.
This "speed" is the conversion factor between subjective times and space-time distances, that (for a technical reason which will appear later) should not be identified as c but will have another name c'. It is a mere immaterial constant that cannot be measured, since the subjective feeling of time by human observers is not a physical quantity. Instead, it can only be concretely fixed by a definition of "time" from clocks, thus actually coming from the physical properties of objects with respect to the lengths of their world lines: ultimately, the value of the constant c' is purely conventional.

Curvature of world lines and acceleration

Both geometries of Euclid and Minkowski have a clear concept of (extrinsic) curvature of curves such as world lines. In our intuitive figuration of subjective time, the curvature of a world line defines an acceleration, in an orthogonal direction, of the mind of an observer who "follows it at a constant speed" c'.
As both quantities (curvature and acceleration) are of different kinds, they are not formally confused but proportional, with a factor determined by c' :

Acceleration = c'².curvature
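As a numerical illustration of this proportionality in the Euclidean model, consider a world line that is an arc of a circle of radius R (so its curvature is 1/R), followed at constant speed c'. The sketch below differentiates the position twice with respect to subjective time and compares the result with c'²/R. The circle, the values of c' and R, and the function names are all illustrative assumptions.

```python
import math


def position(T, c_prime, R):
    """Point (u, x) reached at subjective time T on a circle of radius R
    followed at constant speed c_prime. Here u is the time dimension of the
    Euclidean model, measured as a length, and x is a space dimension."""
    angle = c_prime * T / R  # arc length c_prime * T divided by the radius
    return (R * math.sin(angle), R * (1.0 - math.cos(angle)))


def acceleration_norm(T, c_prime, R, h=1e-3):
    """Norm of the second derivative of the position with respect to T,
    estimated by central finite differences with step h."""
    p_minus, p_zero, p_plus = (position(T + k * h, c_prime, R) for k in (-1, 0, 1))
    second = [(a - 2.0 * b + c) / h**2 for a, b, c in zip(p_minus, p_zero, p_plus)]
    return math.hypot(second[0], second[1])


c_prime, R = 2.0, 5.0                        # arbitrary illustrative values
numeric = acceleration_norm(0.7, c_prime, R)
expected = c_prime**2 * (1.0 / R)            # c'^2 times the curvature 1/R

print(numeric, expected)
```

The two printed values agree up to the finite-difference error.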

World lines are straight where their curvature is constantly zero, i.e. the object remains non-accelerated (if the cancellation of the curvature is not persistent, but just happens at a point, where it usually changes sign, this point is an inflection point).
Physically, acceleration is indeed a measurable intrinsic property: accelerated objects or observers perceive their own acceleration in the form of a fictitious force, precisely measured using an accelerometer.

The laws of mechanics will imply that objects are non-accelerated as soon as they are isolated, i.e. not interacting with any external object or field (mainly the electromagnetic field; the gravitational field as expressed in General Relativity will be an exception to this rule).

The inconvenience of traditional Special Relativity courses

Traditional Relativity courses unfortunately follow a naive but awkward path: as if it were natural or necessary, they undertake formulating the theory by expressing space-time in the language and intuition of distinct space and time, as naively imagined from perceptions (and as they were assumed before the discovery of the theory), and directly identified with the use of coordinates x,y,z of space, t of time, and speed v defined by x = vt. They keep following this way with its historical roots, in the name of

A flawed sense of "physicality", as if space and time had to be a priori admitted as "fundamental substances", or as if the relativity principle and the constancy of the speed of light had to be seen as very "physical" and a priori privileged as axioms to start with while expressed in terms of distinct intuitions of space and time, rather than some more abstract, mathematical formulation of the same or other concepts (such as that of "automorphisms"). No matter that, ironically, the main lesson that will come from the theory (its real meaning) will precisely dismiss the relevance of these initial formulations (binding space with time as inseparable dimensions, and even dismissing the familiar, 1-dimensional, intuition of time as a non-physical attribute of space-time). This flawed trend looks similar to the usual bad habits of philosophers lost in vanity metaphysics and essentialism mistaken for ways to stick to reality. This traditional style of metaphysics had to be rejected in favor of logical positivism as a better epistemological basis for modern science. We need not feel concerned that many science philosophers still didn't get it, but it is such a pity to see physics teachers infected as well.
"Pedagogical" approaches of "rediscovery" of theories from the same naive irrelevant viewpoints by which they were initially discovered, with a strange focus on the care to prove the uniqueness of the kind of world satisfying some specially given axioms which mysteriously got a favorite status of "physical principles" (why not care to prove its existence as well?). Such a concern to provide a demonstration of pseudo-rediscovery, coming here as if it was the right and necessary way to share a scientific mindset, is anyway off topic.

This is incidentally harmful in 2 ways:

By diverting us from the primary (on-topic) concern of providing an intuitive, understandable initiation to unfamiliar concepts which would show their deep elegance, it wastes the efforts of students by polluting their first understanding works with irrelevant approaches and big, obscure formulas: the Lorentz transformation formulas by which the theory is so expressed are mostly irrelevant (the higher theories of physics based on Special Relativity do not make any explicit use of them!), and wastefully complicated as they hide the similarity with Euclidean geometry. In fact, almost the same formulas could be used as an accurate but ridiculously complicated expression of Euclidean geometry, to be only taken seriously by a caricature of a blind being trying to practice geometry by force of complications without a natural intuition of it.
Presenting the few old initial arguments hides the huge extent of the scientific confirmations which have accumulated since that time, and which are no longer well summed up in that way. This left some cranks in the illusion that scientists are just dogmatically trusting the initially proposed answers to old enigmas, and that all our current confidence in our theories was still just based on those steps, so that they could still be defeated just by "re-explaining" or relativizing the strength of these century-old arguments. Such impressions are of course wrong, but clarifying this requires improving pedagogy first.

All this gave the theory a reputation of opposing all intuition (a counter-intuitiveness that some teachers can even feel proud of "demonstrating"!). Indeed, this way of heavily using a language while undertaking to prove its irrelevance, a way of decidedly sitting on the branch that you are undertaking to saw, results in making it uncomfortable, once done, to figure out where you end up... unless another, more relevant understanding is provided to rely on.
But anyway it is no teacher's job in the academic system, to ever care to develop clear explanations, updating the lectures to better reflect the current professional understanding of things.

2. The Relativity principle

Cartesian and affine coordinate systems

A conventional split of an n-dimensional space E between n individual dimensions means a choice of a coordinate system. The most usual kind of coordinate system for Euclidean spaces is that of Cartesian coordinates. A more general concept, that of affine coordinate system, can be formalized in 2 ways:

As a list of n coordinates, which are affine functions from E into the set ℝ of real numbers (or more generally affine lines): this defines a bijection from the space into ℝⁿ;
As a frame, made of an origin O of E and a basis of n vectors b₁,..., b_n of its vector space: this defines the inverse bijection, from ℝⁿ into E, as (x₁,...,x_n)↦ O + x₁b₁+...+ x_nb_n. Grouping O with b₁, they together define the first axis of that frame as the straight line {O + x₁b₁|x₁∈ℝ}. Then the real number x₁ may be replaced by the point p = O + x₁b₁ of the first axis: (p, x₂,...,x_n)↦ p + x₂b₂+...+ x_nb_n
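The two views (coordinates as functions on E, and a frame as a map into E) can be sketched as a pair of mutually inverse functions. For simplicity the inverse is only written for n = 2, and points are represented by tuples of numbers; both choices, as well as the function names, are assumptions of this illustration.

```python
def from_coordinates(origin, basis, coords):
    """Frame view: (x1, ..., xn) -> O + x1*b1 + ... + xn*bn."""
    point = list(origin)
    for x_i, b_i in zip(coords, basis):
        for k in range(len(point)):
            point[k] += x_i * b_i[k]
    return tuple(point)


def to_coordinates_2d(origin, basis, point):
    """Coordinate view for n = 2: solve point - O = x1*b1 + x2*b2
    for (x1, x2) by Cramer's rule."""
    (a, c), (b, d) = basis  # b1 = (a, c), b2 = (b, d)
    det = a * d - b * c
    if det == 0:
        raise ValueError("b1 and b2 do not form a basis")
    px, py = point[0] - origin[0], point[1] - origin[1]
    return ((px * d - b * py) / det, (a * py - c * px) / det)


# An affine (non-Cartesian) frame of the plane, chosen arbitrarily.
O = (1.0, 2.0)
frame_basis = ((2.0, 0.0), (1.0, 1.0))

p = from_coordinates(O, frame_basis, (3.0, -1.0))
print(p)                                     # the point O + 3*b1 - 1*b2
print(to_coordinates_2d(O, frame_basis, p))  # recovers the coordinates
```

Each coordinate returned by to_coordinates_2d is an affine function of the point, as required by the first formalization.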

Cartesian coordinates have 2 further requirements:

Pairwise orthogonality of coordinates, which is equivalent to the pairwise orthogonality of the basis vectors;
All coordinates are faithful to the same unit of distance: this can be expressed in 2 ways which are equivalent if the pairwise orthogonality condition holds (but generally not otherwise):

Viewed in terms of coordinates, this means that all coordinates, as functions from E to ℝ, measure by the same unit the lengths in the lines defined as quotients of E by these functions, and seen as inheriting the Euclidean structure of E with the quantity types of its distances; in other words, the linear parts of these functions have the same norm (in the Euclidean vector space of linear forms).
Viewed in terms of frame, this means the norms of all vectors b₁,..., b_n are equal.
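In terms of frames, the two requirements amount to a simple test on the basis vectors. A minimal sketch, assuming vectors are given as tuples of components in some reference Cartesian system (the function names and the tolerance are illustrative):

```python
import math


def dot(u, v):
    """Usual Euclidean dot product of two vectors given by components."""
    return sum(a * b for a, b in zip(u, v))


def is_cartesian(basis, tol=1e-12):
    """True if the basis vectors are pairwise orthogonal and all have
    the same norm (same unit of distance on every axis)."""
    norms = [math.sqrt(dot(b, b)) for b in basis]
    orthogonal = all(
        abs(dot(basis[i], basis[j])) <= tol
        for i in range(len(basis))
        for j in range(i + 1, len(basis))
    )
    same_unit = all(abs(n - norms[0]) <= tol for n in norms)
    return orthogonal and same_unit


print(is_cartesian([(3.0, 4.0), (-4.0, 3.0)]))  # orthogonal, equal norms
print(is_cartesian([(2.0, 0.0), (1.0, 1.0)]))   # not orthogonal
print(is_cartesian([(1.0, 0.0), (0.0, 2.0)]))   # orthogonal, different units
```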

Inertial frames

Now let us examine another conventional split of dimensions, not between individual dimensions, but between "time" (1-dimensional) and "space" (3-dimensional, Euclidean), in a sort of compromise between an observer's split between "time" and "space" perceptions, and the mathematically convenient properties of affine or Cartesian coordinate systems.
What an observer with his perceptions may have in common with a coordinate system, is that if his world line is straight (he is non-accelerated), then it may be taken as "time axis" of a coordinate system. This axis has no preferred origin, but has a well-defined "time vector" with orientation, and length reflecting the conventional choice of a time unit. This length measures, as time quantities, any space-time segment in a time direction (parallel to the world line of the observer...or not), by clocks whose world lines contain such a segment. Time quantities will be treated as a distinct quantity type from space lengths, but are clearly measurable from physical phenomena which equally work for any observer.
To split space-time into "space and time", the data of the time axis must be completed by a choice of 3D subspace of the space-time vectors (playing a role similar to the choice of other basis vectors of a frame), to be called "space vectors" and used for defining a conventional relation of "simultaneity". The natural choice of such a space, is the space of vectors orthogonal to the given time axis.
Let us call inertial frame the split of space-time into space and time so defined (it will not coincide with the direct perceptions of space and time by this observer, but only approach it in the Galilean approximation). Its inverse correspondence is the pair of projections:

The "time" coordinate, defined as the projection on the time axis, parallel to the space of "space vectors" (thus an orthogonal projection if the space vectors are taken as orthogonal to the time axis): this combines a time measurement with a simultaneity relation.
The "position" of events, which is the quotient of space-time by the direction of the time axis. It can be seen as projection to the space of space vectors, parallel to the time axis; but the definitions of the Euclidean structure on this vector space when seen as subspace or as quotient, only coincide when this subspace is the orthogonal to the time axis.

Relative speed

An object B is said to be at rest relatively to (= in the frame of) another non-accelerated object A, if B is also non-accelerated, and its (straight) world line is parallel to that of A; equivalently, the position of B in A's frame is constant. Otherwise B is moving relatively to A.
The traditional way in physics to quantify the movement of an object B relatively to an object A (the "difference of direction" between their world lines), is its speed, defined in A's inertial frame with components (t,x) (where t is the time and x is the space position), as the derivative (vector) of the position x of B seen as a function of t.
Then B is non-accelerated when it is "seen" by A's frame as a point "going at a constant speed" (in the sense of a speed vector: constant amplitude and constant direction).
In A's frame, the movement of a non-accelerated object B with constant speed v (space vector of that frame), is written as (the equation of its world line)

x = a + tv

for some fixed space position a; the speed is zero when objects are at rest relatively to each other.
In geometry, the "difference of direction" between lines is more usually defined by their angle. The relative speed v between objects is related to the angle α between the directions of their world lines, by

|v|= c'.tan α.

Thus, the speed of B relatively to A and vice-versa, have the same norm.
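To see this relation at work in the Euclidean model, one may represent the direction of a world line by a vector (u, x), where u = c't is the time dimension measured as a length. The sketch below restricts itself to one space dimension; this representation and the function names are assumptions of the illustration.

```python
import math


def direction_of_speed(v, c_prime):
    """Direction (u, x) of the world line x = a + t*v in A's frame,
    with the time dimension measured as a length u = c_prime * t."""
    return (c_prime, v)


def angle_between(d1, d2):
    """Euclidean angle between two world-line directions in the (u, x) plane."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    inner = d1[0] * d2[0] + d1[1] * d2[1]
    return math.atan2(abs(cross), inner)


def relative_speed(d1, d2, c_prime):
    """Norm of the relative speed, |v| = c' * tan(alpha)."""
    return c_prime * math.tan(angle_between(d1, d2))


c_prime = 3.0                              # arbitrary illustrative value
d_A = direction_of_speed(0.0, c_prime)     # A is at rest in its own frame
d_B = direction_of_speed(1.2, c_prime)     # B moves at speed 1.2 in A's frame

print(relative_speed(d_A, d_B, c_prime))   # speed of B relative to A
print(relative_speed(d_B, d_A, c_prime))   # speed of A relative to B: same norm
```

Since the angle between two lines does not depend on the order in which they are taken, both calls return the same value.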
The above definition of the acceleration of an object at a given event by the curvature of its world line, coincides with the one as derivative of the speed defined in the inertial frame whose time axis is the tangent straight line to this world line at that event.

The Relativity principle

The Relativity principle can be expressed in 2 ways, depending on the language used to describe space-time:

In the language of space and time, the laws of physics appear the same for all non-accelerated observers; speed only qualifies an object relatively to another object.
In the language of Minkowski or Euclidean geometry, all time directions (possible directions of world lines) are similar, i.e. any one can be moved to any other one by some "rotation" which is an automorphism of space-time geometry.

Moreover, for any inertial frame whose simultaneity is defined by orthogonality to the time axis, "space" simply appears as a 3-dimensional Euclidean space, with no privileged direction (its rotations form a subgroup of the group of automorphisms of space-time that preserve the direction of the time axis).

3. Relativistic effects vs the Galilean approximation

The goal of this section will be to deduce from the above construction with the Euclidean model of space-time, that

When the relative speeds of objects are small compared to c', properties of "space" and "time" approach a limit behavior, that of a space-time system (geometry) called a classical or Galilean space-time (which we shall not formalize here);
Away from this limit, some differences in behaviors appear, called "relativistic effects" or "paradoxes". They are mere familiar features of geometry, but become strange-looking when applied to space-time and re-expressed in terms of space and time perceptions. For relative speeds v << c', the amplitude of these effects is roughly proportional to a factor of (v/c')².

Rapidity

The advantage of expressing movement by angles instead of any other function such as speed, is that angles are additive (which is what the concept of angles is based on): for any 3 directions A,B,C in the same plane, the angle from A to C is the sum of angles from A to B and from B to C. But we still have to choose a unit to express them numerically.
Let us call rapidity the measure of angles in the unit such that it approximately coincides with speeds when they are small, i.e. in the approximation of the Galilean space-time: in the framework of a Euclidean space-time, this unit is (1/c') radians. Namely:

An accelerated body towards a fixed space direction, gains a rapidity equal to the integral of its acceleration over its subjective (clock) time;
Rapidity φ and speed v are related by v = c'. tan(φ/c') ≈ φ (1+φ²/3c'²)

(This definition of "rapidity" differs from its use by others, according to Wikipedia, which gives it a unit similar to radians for Minkowski geometry.)
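The following sketch illustrates, in the Euclidean model only, how additivity of rapidities translates into a composition rule for speeds, and how close speed and rapidity are when both are small. The closed form used in the comparison is just the addition formula for the tangent; it is not stated in the text above, and the function names are illustrative.

```python
import math


def speed_from_rapidity(phi, c_prime):
    """v = c' * tan(phi / c') in the Euclidean model of space-time."""
    return c_prime * math.tan(phi / c_prime)


def rapidity_from_speed(v, c_prime):
    """Inverse relation: phi = c' * atan(v / c')."""
    return c_prime * math.atan(v / c_prime)


def compose_speeds(v1, v2, c_prime):
    """Speed of C relative to A, given the speed v1 of B relative to A and
    the speed v2 of C relative to B (all in the same plane): rapidities add."""
    phi = rapidity_from_speed(v1, c_prime) + rapidity_from_speed(v2, c_prime)
    return speed_from_rapidity(phi, c_prime)


def small_rapidity_speed(phi, c_prime):
    """Approximation v ≈ phi * (1 + phi^2 / (3 c'^2)) for phi << c'."""
    return phi * (1.0 + phi**2 / (3.0 * c_prime**2))


c_prime = 10.0                                  # arbitrary illustrative value
print(compose_speeds(1.0, 2.0, c_prime))        # from adding rapidities
print((1.0 + 2.0) / (1.0 - 1.0 * 2.0 / c_prime**2))  # tangent addition formula
print(speed_from_rapidity(0.1, c_prime), small_rapidity_speed(0.1, c_prime))
```

For speeds small compared to c', the composed speed is close to the plain sum v1 + v2, which is the Galilean behavior.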

The twin paradox

If 2 twins, initially together with the same age (or, equivalently, 2 synchronized clocks), separate to follow different trips and then meet again, they may not have the same age anymore. They may have spent different amounts of (subjective) time to cross space-time from one point (event) to another, following their different paths in space-time between the events of separation and reunion, while always following them at the same speed c'.

This is because in geometry, between 2 given points, different lines have different lengths. Between 2 successive events, the measure of time length by the object with straight world line (segment) equals the time distance between them. In a Euclidean space-time, this length would be the shortest, while other curves can be arbitrarily longer.

A generally simple way to calculate how the time length of a curved world line between 2 events differs from that of the straight line, consists in analyzing it in the inertial frame of that straight line: the time coordinate t of that frame is related to the subjective time T of the (accelerated) traveler and his rapidity φ with respect to that frame at that time, by the differential equation

dt/dT = cos(φ/c') ≈ 1 − φ²/2c'².

So, this "effect of movement" on time remains small when rapidities are small compared to c'.
As an "effect" of Euclidean geometry in pure space experience, when a long vehicle is observed progressing laterally (for example from left to right), it seems to become shorter when turning to the depth dimension (towards or away from the observer).
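The differential equation dt/dT = cos(φ/c') can be integrated numerically for any rapidity profile. The sketch below uses a deliberately crude, hypothetical trip (constant rapidity out, then the opposite rapidity back, with instantaneous turnarounds) in the Euclidean model; the profile, the numerical values and the function names are assumptions of this illustration.

```python
import math


def frame_time(rapidity_profile, total_T, c_prime, steps=10_000):
    """Integrate dt/dT = cos(phi(T) / c') over the traveler's subjective
    time T in [0, total_T], using the midpoint rule."""
    dT = total_T / steps
    total = sum(
        math.cos(rapidity_profile((k + 0.5) * dT) / c_prime) for k in range(steps)
    )
    return total * dT


def out_and_back(T, total_T=10.0, phi_max=0.5):
    """Hypothetical trip: rapidity +phi_max during the first half of the
    subjective time, then -phi_max, so that the traveler comes back."""
    return phi_max if T < total_T / 2 else -phi_max


c_prime = 1.0        # units chosen so that c' = 1
traveler_T = 10.0    # subjective time of the traveler between both events
stay_at_home_t = frame_time(out_and_back, traveler_T, c_prime)

print(stay_at_home_t, traveler_T)
```

In this Euclidean model the time t of the straight world line comes out smaller than the traveler's time T, in agreement with the straight line being the shortest path.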

Classical time vs. relativity of simultaneity

The classical concept of time is that all inertial frames have the same time coordinate. Between two inertial frames with rapidity φ between their time axes, let us express the time coordinate t' of one in terms of the coordinates (t,x) of the other, where x, orthogonal to t, is the space coordinate in the direction of movement (so that they suffice to express t' while the rest of dimensions stay unaffected):

t' = cos(φ/c').t + (sin(φ/c')/c').x

where the coefficients cos(φ/c') and sin(φ/c')/c' are respectively the t' measure of the time and space basis vectors of the (t,x) frame.

Time coordinates of both frames are close to each other when φ << c' as the second coefficient is also small, sin(φ/c')/c' ≈ φ/c'². Away from this limit, they differ; while the effect on the first coefficient represents the previously mentioned time distortion, the second represents the effect of relativity of simultaneity.

Correspondingly in the above pure space experience, both wheels of the vehicle which were hiding each other (such as both front wheels, or both back wheels), then appear offset from each other.
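The formula for t' in the Euclidean model can be evaluated directly to see both effects. In the sketch below, two events that are simultaneous in the (t,x) frame get different values of t', and the small-rapidity approximation φ/c'² of the second coefficient is compared with its exact value. The numerical values and the function name are arbitrary illustrations.

```python
import math


def t_prime(t, x, phi, c_prime):
    """Time coordinate of the second frame, Euclidean model:
    t' = cos(phi/c') * t + (sin(phi/c') / c') * x."""
    angle = phi / c_prime
    return math.cos(angle) * t + (math.sin(angle) / c_prime) * x


c_prime, phi = 1.0, 0.3

# Two events simultaneous in the (t, x) frame (same t, different x).
event_1 = (0.0, 0.0)
event_2 = (0.0, 2.0)
offset = t_prime(*event_2, phi, c_prime) - t_prime(*event_1, phi, c_prime)
print(offset)  # nonzero: the events are not simultaneous for the other frame

# Small-rapidity regime: the space coefficient is close to phi / c'^2.
exact = t_prime(0.0, 50.0, 1.0, 100.0)
approx = (1.0 / 100.0**2) * 50.0
print(exact, approx)
```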

Length contraction

As an "effect" of Euclidean geometry, defining the width of a sausage by the size of its sections, the sausage becomes wider when cut along a plane that is not orthogonal to its axis. The corresponding (opposite) relativistic effect is the famous effect of "length contraction" of moving objects. However this is only what comes from some abstract calculations, with hardly any practical case of experiment where this plays a role, since the act of simultaneously recording the positions of both sides of a moving object in a given frame, is usually quite unnatural. In fact, in the most natural way we might consider doing so, that is "taking a picture" of a moving object as seen from the direction perpendicular to its trajectory, this object does not appear contracted but turned, in such a way that the considered "length contraction" looks like the perspective effect of looking at the object's length from an angle. This will be explained with the actual study of the geometric properties of light and visual appearances.

4. Minkowski geometry

The study of space-time with its geometry of Minkowski may be alternatively (and still equivalently) expressed in 3 ways:

A rigorous approach as a mathematical theory (modified from a formalization of Euclidean geometry).
A geometric approach based on false figures
An analytic approach based on converting formulas from the Euclidean case

The geometric approach

A famous quote is "Geometry is the science of correct reasoning on incorrect figures". This interestingly applies to the understanding of the Minkowski geometry. Indeed, the similarities between geometries of Euclid and Minkowski and the fact that affine geometry is the same underlying both, means that the representations of (intended figures in) space-time by our ordinary intuition of space (drawn figures on paper), just behave like studies of Euclidean geometry on figures that are distorted as resulting from an affine transformation (between the intended space to study and the space of representation). Namely, the representation (drawing) is faithful concerning all affine structures (properties of figures expressed in the language of affine geometry), but wrong as concerns Euclidean structures (circles, angles etc).

Understand affine geometry, which underlies both geometries of Euclid and Minkowski: how it differs from Euclidean geometry. You can see it in (the green parts of) this introduction to the foundations of geometry.
Study how the remaining structures are related together (definable from each other) by the same or similar statements between both geometries of Euclid and Minkowski, with just some differences in resulting properties.

So, to see how it works, we can first consider the exercise of studying Euclidean geometry as represented by such distorted figures, that correspond to the intended ones through an affine transformation.
Then, let us redefine the distorted structures (but without the trick of drawing another figure that is directly faithful), starting with the notion of circle (for the 2-dimensional case). Namely, imagine that an ellipse is given, with the information "This is a circle" (for 3-dimensional geometry we need to take an ellipsoid and declare it to be a "sphere"). This suffices to know what all the other circles are: they are those obtained from this one by translations and dilations.
Other structures are defined from it, by the formulas that relate the different Euclidean structures together.

Now, the geometry of Minkowski can be obtained by the same method, with almost the same list of structures (language) as the Euclidean ones, related together (definable from each other) by the same rules, with a difference expressible in this way: there, circles are no longer particular ellipses (for the affine notion of ellipse), but particular hyperbolas. So, instead of giving an equivalence class of ellipses (or ellipsoids for dimensions beyond 2) for the equivalence relation of correspondence by translations and dilations, we must give a class of hyperbolas (hyperboloids) to be declared the "circles" (spheres). Not all things work the same (for example, circles no longer have a finite area nor a finite length), but almost all do: we just need to adapt things a little bit.
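As a rough illustration of "the same rules with circles replaced by hyperbolas", here is a minimal Python sketch. It assumes coordinates in which the inner product is diagonal, and the names inner, circle_point and circle_tangent are hypothetical helpers introduced only for this example. It checks that, in both geometries, the points of the unit "circle" have square 1 and that the tangent is orthogonal to the radius.

```python
import math

def inner(u, v, sign):
    """Inner product on the plane: u0*v0 + sign*u1*v1.

    sign=+1 gives the Euclidean plane, sign=-1 a Minkowski plane
    (this choice of coordinates and of sign convention is an assumption).
    """
    return u[0] * v[0] + sign * u[1] * v[1]

def circle_point(a, sign):
    """Point of the unit 'circle' at parameter a."""
    if sign > 0:
        return (math.cos(a), math.sin(a))   # ordinary circle
    return (math.cosh(a), math.sinh(a))     # one branch of a hyperbola

def circle_tangent(a, sign):
    """Derivative of circle_point with respect to the parameter a."""
    if sign > 0:
        return (-math.sin(a), math.cos(a))
    return (math.sinh(a), math.cosh(a))

for sign in (+1, -1):
    for a in (0.0, 0.5, 1.2):
        p = circle_point(a, sign)
        t = circle_tangent(a, sign)
        # First value stays at 1 (point on the "circle"),
        # second stays at 0 (tangent orthogonal to the radius).
        print(sign, a, round(inner(p, p, sign), 12), round(inner(p, t, sign), 12))
```

In a drawing distorted by an affine transformation, the same relations hold for the declared ellipse or hyperbola, only with a non-diagonal expression of the inner product.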

The analytic approach

Let us present a simple and efficient method to convert the predictions and "effects" from one geometry to the other.
Switching between the geometries of Euclid and Minkowski for space-time keeps the same list of resulting "relativistic effects" (deviations from the Galilean space-time), whose order of magnitude is that of c'⁻² (where c' is the ratio between space and time units), but reverses their signs. For example, in the twin paradox, while the traveler would become older than his brother in a Euclidean space-time, he becomes younger in a Minkowski space-time: in the Minkowski geometry, the straight world line (of the non-accelerated twin, staying on Earth) is the longest possible world line (time-like curve, measured in time length) between 2 successive events, with time length equal to their distance.
In fact, not only does the first approximation of the effects have the opposite amplitude (−c⁻² instead of c'⁻²), but even the exact formulas of "space and time" behavior can be converted between the cases of the geometries of Euclid and Minkowski for space-time, by the formal substitution

c'² = −c²

Applying this to trigonometric functions of the rapidities in the sense of their analytic continuation (as functions of c'⁻²), changes them into hyperbolic functions :

cos(φ/c') = ch(φ/c)
c' sin(φ/c') = c sh(φ/c)
c' tan(φ/c') = c th(φ/c)

In particular, as ch(φ/c)>1 the effect on time of a moving object is time dilation. The mathematical definitions of the hyperbolic functions will come in the detailed study of the Minkowski plane, with tools interestingly similar with the way complex numbers can be used to study Euclidean plane geometry and define trigonometric functions.
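The three conversion formulas above can be checked numerically. The following Python sketch takes c' = i·c as one solution of c'² = −c² (the other root gives the same results, since each expression only depends on c'²). The value c = 2 and the function names are arbitrary choices made for this example.

```python
import cmath
import math

c = 2.0            # Minkowski space/time unit ratio (arbitrary value for the check)
c_prime = 1j * c   # one solution of c'^2 = -c^2

def euclid_side(phi):
    """Left-hand sides: trigonometric functions with the complex ratio c'."""
    x = phi / c_prime
    return (cmath.cos(x), c_prime * cmath.sin(x), c_prime * cmath.tan(x))

def minkowski_side(phi):
    """Right-hand sides: hyperbolic functions with the real ratio c."""
    x = phi / c
    return (math.cosh(x), c * math.sinh(x), c * math.tanh(x))

for phi in (0.1, 1.0, 2.5):
    for lhs, rhs in zip(euclid_side(phi), minkowski_side(phi)):
        # lhs is complex with negligible imaginary part; rhs is real
        print(phi, lhs, rhs)
```

Each printed pair agrees up to rounding errors, with the imaginary parts of the left-hand sides vanishing.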

The rigorous approach

For people already familiar with linear algebra, the definition goes as follows: space-time is a 4-dimensional affine space whose space of vectors is endowed with a symmetric bilinear form with signature (3,1).

For others, I started to offer a long path of preliminaries through different aspects of the foundations of mathematics, the foundations of geometry, a detailed study of affine geometry, and vector spaces (sorry, I have not yet completed the long path I wish to write to get there).

On the basis of affine geometry, the remaining structures of either geometry of Euclid or Minkowski come as defined from the one which is the most fundamental, as it is the one actually involved in most of the fundamental formulas of higher level theoretical physics: the inner product. (A few other physics formulas use another structure which is somehow even more fundamental than the inner product, but is not familiar : the spinor spaces).
Its study takes the following steps:

Introduce the notion of inner product on a vector space, that is a non-degenerate, symmetric bilinear form. A quadratic space is a vector space with a structure of inner product.
Inner products in an n-dimensional vector space are classified by their signature, that is an ordered pair of integers (p,q) with p+q=n. This also classifies quadratic spaces, except that those with signatures (p,q) and (q,p) are essentially equivalent.
An n-dimensional affine space (n>1) whose space of vectors is quadratic is a Euclidean space if the signature is (n,0); it is a Minkowski space if the signature is (n-1,1) (or (1,n-1)), to mean that there are "(n-1) space dimensions and 1 time dimension".
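The classification in the steps above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: the inner product is given in a basis where it is diagonal, so that the signature is read from the signs of the diagonal entries. The function names signature and classify are hypothetical.

```python
def signature(diagonal):
    """Signature (p, q) of an inner product given by its diagonal entries."""
    if any(d == 0 for d in diagonal):
        raise ValueError("degenerate form: not an inner product")
    p = sum(1 for d in diagonal if d > 0)
    q = sum(1 for d in diagonal if d < 0)
    return (p, q)

def classify(sig):
    """Name the geometry of an n-dimensional space from its signature, n > 1."""
    p, q = sig
    n = p + q
    if n < 2:
        raise ValueError("the classification is stated for n > 1")
    if (p, q) == (n, 0):
        return "Euclidean"
    if q == 1 or p == 1:
        # (n-1, 1) and (1, n-1) are essentially equivalent
        return "Minkowski"
    return "other"

print(classify(signature([1, 1, 1])))      # Euclidean
print(classify(signature([1, 1, 1, -1])))  # Minkowski (space-time)
print(classify(signature([1, -1])))        # Minkowski plane
```

The sketch follows the text literally: only (n,0) is named Euclidean here, even though (0,n) is essentially equivalent to it.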

We can express Euclidean geometry as an axiomatic theory, then modify the axiom specifying the signature, to formally obtain the geometry of Minkowski.

Infinitesimal rotations

An important part I still have to write...
Main ideas:
As the metric is a symmetric bilinear form on either the space of vectors or inversely on its dual (which become non-equivalent in the degenerate case), infinitesimal rotations can be expressed as derived from antisymmetric bilinear forms (if the metric was antisymmetric, i.e. in symplectic geometry, infinitesimal rotations would be derived from symmetric bilinear forms).
In the vectorial 2-dimensional case there is only 1 dimension for antisymmetric forms, thus for rotations. This transforms the metric into the vector field of rotational speeds.
Thus, in the Euclidean vector plane, we get the number i with square −1, the expression of rotations by a small angle a as exp(ia), and the differential equations relating cos and sin.
What this becomes with a different metric: geometric presentation, and differential equations relating ch and sh. But the hyperbolic case has something more: it can also be viewed as split into 2 separate operations on reals, by taking the light coordinates system.
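The last remark can be illustrated by a small Python sketch. It assumes units where c = 1 and takes "light coordinates" to mean u = t + x and v = t − x, which is one common convention and not necessarily the one intended here. The helper names boost and light_coordinates are hypothetical. In these coordinates a hyperbolic rotation by φ acts as two independent multiplications of reals, by exp(φ) and exp(−φ).

```python
import math

def boost(t, x, phi):
    """Hyperbolic rotation of the Minkowski plane by rapidity phi (c = 1)."""
    return (math.cosh(phi) * t + math.sinh(phi) * x,
            math.sinh(phi) * t + math.cosh(phi) * x)

def light_coordinates(t, x):
    """Coordinates along the two light directions (assumed convention)."""
    return (t + x, t - x)

t, x, phi = 2.0, 0.5, 0.8
u, v = light_coordinates(t, x)
u2, v2 = light_coordinates(*boost(t, x, phi))

print(u2 / u, math.exp(phi))    # u is multiplied by exp(phi)
print(v2 / v, math.exp(-phi))   # v is multiplied by exp(-phi)
```

The product u·v = t² − x² is therefore preserved, which is the hyperbolic counterpart of a rotation preserving x² + y².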

Example of relativity of simultaneity

We cannot exactly synchronize clocks all along the equator of the Earth: starting from one point and synchronizing each clock with its neighbor all the way around, where both ends of the line meet, the Western clock of the line (synchronized with its East side) has an advance over the one synchronized with its West side, with offset vL/c², where v = 465.1 m/s is the equatorial rotation velocity and L = 40,075 km is the length of the equator.
Result: (465.1 m/s) × (40,075 km)/c² = 2.074×10⁻⁷ s.
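The arithmetic can be reproduced with a few lines of Python. The values of v and L are the ones given above; the value of c is the exact SI value, which the text does not state explicitly.

```python
C = 299_792_458.0   # speed of light in m/s (exact SI value)
v = 465.1           # equatorial rotation velocity in m/s
L = 40_075e3        # length of the equator in m

offset = v * L / C**2   # synchronization offset in seconds
print(f"{offset:.3e} s")  # 2.074e-07 s
```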

5. The specific properties of space and time perceptions in Minkowski geometry

Describe effects directly involving light or light speed, with amplitudes in the order of magnitude of v/c (instead of (v/c)²), with no Euclidean analogue.
The distinction between space-like and time-like directions, added to the affine geometry, is an equivalent presentation of the structure of the geometry of Minkowski itself.

Study in detail the 2-dimensional Minkowski geometry (one space dimension with time): describe its rotations and the amplitudes of the Doppler effects; introduce the hyperbolic functions that play the same role there as the trigonometric functions play in Euclidean geometry.
All relative speeds of objects are lower than c... while this "speed" of light in the void is not relative but the same for all observers (fixed by the laws of physics).
Even information cannot travel "faster than" c. Concretely, from a given source (event) to a given destination, light signals straight in the void (if possible) always arrive first (together with all electromagnetic and gravitational waves).
Between observers meeting on an event with relative speed, visually perceiving space by light, the correspondence between their spheres of vision is an isomorphism of inversive geometry, while the behavior of depth (distance) is linked to the Doppler effect.

Rapidity and speed

Attributing to a given physical movement some numerical value of its "speed" requires specifying how quantities are physically measured and how the value of the speed is computed from these measurements.

The speed of light c, like any speed, can be physically defined as a ratio of a distance to a time (the values of the space and time coordinates in a given frame), by the following procedure:

When a beam of light makes a round trip between two objects A and B at relative rest and with distance d from each other (from A to B and then back from B to A, for example the Earth and the Moon), a clock that remained on A measures the time interval 2d/c between sending the beam and getting it back.

In the geometry of Minkowski, the speed v is related to the rapidity φ by v = c th(φ/c). So, while values of rapidity can grow without limit (for example in the case of a constant acceleration, up to values much higher than c), v only approaches c (exponentially): the "speed of light" somehow works as a horizon (like an infinity) that can be approached but not reached.
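A short Python sketch of the relation v = c th(φ/c) may help. It also compares the sum of two rapidities with the usual composition formula of collinear speeds, (u + v)/(1 + uv/c²). That formula is not stated in this text; it is brought in here as the standard one, and it follows from the addition formula of th. The function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def speed_from_rapidity(phi, c=C):
    """v = c th(phi / c)."""
    return c * math.tanh(phi / c)

def rapidity_from_speed(v, c=C):
    """Inverse relation, defined for |v| < c."""
    return c * math.atanh(v / c)

def compose_speeds(u, v, c=C):
    """Standard relativistic composition of collinear speeds."""
    return (u + v) / (1 + u * v / c**2)

# Rapidities far beyond c still give speeds below c
for k in (0.5, 1, 2, 5):
    print(k, speed_from_rapidity(k * C) / C)

# Adding rapidities matches composing speeds
u, v = 0.6 * C, 0.7 * C
total = rapidity_from_speed(u) + rapidity_from_speed(v)
print(speed_from_rapidity(total) / C, compose_speeds(u, v) / C)
```

With floating-point numbers, th rounds to exactly 1 for very large rapidities, so the sketch stays with moderate values.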

If we were to discuss the relative movement between 2 galaxies in the universal expansion taking them at the same age (instead of how the one is actually seen by the other), it would be better expressed by their relative rapidity, also abstractly defined as approached by the sum of relative speeds between neighbors in a chain of aligned galaxies, all taken at the same age. This can take any value without limit as long as the cosmological principle holds. In this sense, we might say that "the visible Universe expands much faster than the speed of light".

Twin paradox

Not all curves of space-time can be world lines of massive particles, but only those whose tangent direction at each point is time-like; moreover, only these have a well-defined length, which is positive. Such world lines connecting 2 given successive events can be arbitrarily shorter than the time distance between the ends, with possible time lengths down to zero.
The zero length is approached by world lines of massive particles with high rapidities, i.e. with speeds approaching c; it can only be reached by the world lines of particles "going at the speed of light" away and then back. The directions with speed c (also with a time orientation, from past to future) are called the light directions in space-time, a concept which is independent of the inertial frame in which speeds are measured. Such particles are the massless particles.
The most common of these are the photons (particles of light). As photons have no electric charge, they have speed c exactly when they travel in the void (possibly through an electromagnetic field, but through no electric charges), in which case they have no interactions (except being emitted, reflected and absorbed), so that their world line is straight there (or a succession of segments separated by reflection events).
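To see how the time length of the traveler's world line goes down to zero, here is a minimal Python sketch. It rests on simplifying assumptions that are not part of the text: a symmetric trip at constant speed v away and back, with instantaneous turnaround, in units where c = 1. Under these assumptions the time length is T·sqrt(1 − v²/c²), that is T/ch(φ/c). The function name traveler_time is hypothetical.

```python
import math

def traveler_time(T, v, c=1.0):
    """Time length of a world line going away and back at constant speed v,
    between two events separated by the time T on the straight world line."""
    return T * math.sqrt(1.0 - (v / c) ** 2)

T = 10.0  # time distance between both events, for the twin staying at rest
for v in (0.0, 0.5, 0.9, 0.99, 0.9999, 1.0):
    print(v, traveler_time(T, v))
```

The value at v = 1 is exactly zero, which corresponds to the limit case of particles going at the speed of light away and then back.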

Visual appearances

Actually, direct perceptions do not split space-time into 2 sensations (nor 4 coordinates), but 3 components:

Time (1-dimensional)
Vision (2-dimensional)
Depth (1-dimensional)

The most directly available way for an observer to define the "time" of an event, is given by the time on his own clock when he visually perceives it. But then, observers "at different places" may perceive the same events in different orders, since, roughly speaking, "an event is perceived earlier by the nearest observer", due to the finite speed c of light propagation.

The concept of inertial frames we first introduced (where different observers agree on the time of events = their clocks are properly synchronized, if they are at rest relatively to each other), amounts to first taking this visual perception to label events by space and time coordinates (labeling events by the time of perception by an observer), then correcting it by taking account of the delay of perception due to the distance to the observer.
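The two steps just described (labeling events by their time of perception, then correcting for the delay) can be sketched in Python. The setting is an assumption made for the example: one space dimension, two observers at rest in the same frame, and units where c = 1. The names perceived_time and frame_time are hypothetical.

```python
def perceived_time(event, observer_x, c=1.0):
    """Time on the observer's clock when the light from the event arrives."""
    t, x = event
    return t + abs(x - observer_x) / c

def frame_time(t_perceived, distance, c=1.0):
    """Correct the time of perception by the light delay over the distance."""
    return t_perceived - distance / c

e1 = (0.0, 0.0)    # event 1: time 0 at position 0
e2 = (1.0, 10.0)   # event 2: time 1 at position 10

# Observer A at x = 0 sees event 1 first; observer B at x = 10 sees event 2 first
for name, pos in (("A", 0.0), ("B", 10.0)):
    print(name, perceived_time(e1, pos), perceived_time(e2, pos))

# After correction, both observers agree on the frame time of event 2
print(frame_time(perceived_time(e2, 0.0), 10.0), frame_time(perceived_time(e2, 10.0), 0.0))
```

Observers A and B perceive the two events in opposite orders, yet recover the same times once the delay is subtracted.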

6. Relativistic mechanics

Finally, Special Relativity is also the best framework to introduce the Least Action Principle, that is the foundation of relativistic mechanics, which then reduces to the laws of classical mechanics by approximation to the Galilean space-time. Here we get the famous E=mc².

The role of straight lines in relativistic mechanics

The Minkowski geometry of space-time is involved in all laws of physics (expressed as mathematical theories) that accept it as framework, that is, when gravitation is absent (neglected): in relativistic mechanics and QFT.
But there is an aspect of relativistic mechanics, that can be expressed only using the affine geometry of space-time.
There, the notion of straight line, that is the basic structure of affine geometry, appears in this way :
Any world line of an isolated particle (i.e. not subject to any external force) is a straight line of space-time, i.e. is non-accelerated.
Through the axioms of affine geometry, this determines the larger set of all straight lines (there are other straight lines that cannot be the world lines of any particles because they are "faster than light", as will be explained later).

The old .pdf text to download

Sorry this is just a draft written long ago, unfinished translation from the French version, and not recently updated. It focused too much on philosophical preliminaries before entering the theory (spending pages to say obvious useless things - so I practiced myself the bad habits of philosophers, and don't like to remember this... even if some of these remarks may also have a legitimate place but only elsewhere), as compared to the way I would present things now. Here it is (.pdf, 210 kb, 18 pages).

(It was first written in French in 2002, with further parts written in 2003; this English translation of the beginning of the text was then done in August 2004.)

1. Introduction
2. The strangeness of Relativity theory
3. New presentation of special relativity theory

3.1. The core logic of the theory
3.2. Link with the experience
3.3. Deformed study of the Euclidean plane geometry

Further parts in the French version, not translated :

3.4. Visions of the relativity of simultaneity with one space dimension
3.5. Relativistic transformations of images

4. The mechanics of equilibrium, classical and relativistic mechanics (including E=mc²), phase space
5. Foundations of statistical mechanics
6. Introduction to quantum physics (including the EPR paradox)

Contact : trustforum at gmail.com