Special Relativity theory made intuitive
0. Introduction
Special Relativity and General Relativity
are two theories of
physics that explain our usual experience of (3-dimensional)
"space" and (1-dimensional) "time", as differently perceived
dimensions for a given observer, of one physical reality of a
4-dimensional geometrical
space called "space-time", whose points are the events.
Roughly speaking, space-time has "4 dimensions = 3 space dimensions
+ 1 time dimension", as each event can be specified by 4 coordinates
= 3 of "space position" + one of "time". But, like in usual space
geometry, such a split of dimensions into a list of coordinates is
not given by nature: it depends on a conventional choice of
coordinate system among several possible ones, each convenient
in one way or another. Different methods of observation (choices
of observers, measuring instruments and protocols) or of theoretical
representation may lead to different ways of attributing a "position"
and a "time" to each event, putting space-time in correspondence
with a Cartesian
product of a 3-dimensional space with a time line.
Independently of any such artifacts, Relativity theories
describe space-time in itself as a mathematical system very similar
to a 4-dimensional Euclidean space (the straightforward
generalization of our usual space to 4D), with a few differences.
Because of these differences, it is regarded as another geometry: the
geometry of Minkowski. Still, space intuition (thinking of
physical objects as timeless, i.e. fixed) is our best natural intuition to
represent space-time and express physics there.
Like Euclidean spaces, Minkowski spaces can be similarly
conceived and formalized with any dimension n+1 (for n
"space dimensions" + 1 time dimension), starting with n=1 (a
Minkowski plane), and visually imagined in cases n= 1 or 2
(which suffice to understand most situations in higher dimensional
spaces, where some dimensions do not intervene and may be
forgotten, so that the 4-dimensionality of space-time is not really a
problem).
Similarities and differences between lengths and times
As is usual for any single coordinate of a coordinate system,
Relativity theories reject the classical idea of time as an invariant
function from space-time to a "time line". (Only
quantum
physics provides a questionable metaphysical argument for
reintroducing a universal time parameter, which, in the
current state of science, is not measurable anyway.)
Instead, times have essentially the same nature as distances :
amounts of time (durations) are distances (or lengths) in the time
dimension of space-time. Indeed in physics, values of space
distances and time intervals are convertible into each other by the
constant coefficient c =
299,792,458 m/s, which is "the speed of light in vacuum".
However, we shall still formally treat times (durations) as a different
kind of quantity from space distances, because:
- Time "feels different" from space;
- Usual time intervals are much larger than usual space widths
(as converted by c), so the explicit use of c
helps to keep track of the articulation between the predictions of Relativity and
those of the Galilean space-time, which approximates them
in the limit case of small speeds;
- This formal distinction between space and time quantities will provide a
formal tool for "magically" switching between the geometries of
Euclid and Minkowski;
- In Minkowski geometry, the entanglement of time with space
dimensions is not as complete as in Euclidean geometry.
Indeed, the geometry of Minkowski makes an absolute distinction
between types of directions (intervals separating 2 events):
between space intervals (in space-like direction, intuitively "faster
than light") and time intervals (in time-like direction, "slower than light").
So, the measures of space and time intervals must
be defined separately. This makes it possible to fix a time order,
that is, an order
relation preserved by rotations, which orients both time intervals
and the limit case of light intervals ("at the speed of light"). This order
physically defines the causal dependence between events.
Theories list
General Relativity
extends the concepts of Special Relativity to include the
description of gravitation.
Or, rather than exactly being theories, they are combinations of
theories and concepts which can be classified as follows (part of a
larger classification
of physical theories):
| | Usual space | Special Relativity | General Relativity |
| Geometry | 3D Euclidean geometry = Affine geometry + positive dot product | 4D Minkowski geometry = Affine geometry (= flat) + dot product with signature (3,1) | 4D Lorentzian manifold: space with variable curvature, gives Minkowski geometry in local approximations |
| Perception | Split as: 2D vision + 1D depth | Split as 3D space + 1D time. Time measures are lengths of lines. Small speeds (v << c) approx. give Galilean space-time. | |
| Mechanics | Law of equilibrium: minimal potential energy | Relativistic mechanics: Least Action Principle, conservation laws, E = mc²... v << c: classical mechanics. | Einstein field equation deduced from Einstein–Hilbert action. v << c: Newton's gravitation. |
A possible path to the Minkowski geometry consists in rigorous
mathematics (an axiomatic theory),
ignoring any connection with physics and time
perceptions there.
But let us present another path that is shorter and intuitive
(somehow "magical"), where Minkowski geometry is obtained by
"reversing" the "effects" of Euclidean geometry. Such a reversion
can be done either independently (by pure geometry), or inspired by
the split of space-time into space and time.
The steps will be the following.
- How an (n+1)-dimensional space can be perceived as an
n-dimensional space where things evolve in time;
- The Relativity principle
- Relativistic "effects" vs. the classical (Galilean) approximation;
- The Minkowski geometry, that can be introduced in 3 ways (formally, by
figures or in a magical analytic way) giving the same "relativistic effects"
on space and time experience (away from classical space-time) as those
from a space-time with Euclidean geometry, but with opposite sign;
- The qualitatively different properties of Minkowski geometry
(causality order, direct involvement of light or light speed)
and their effects on actual space and time perceptions
(splitting space and time otherwise than in 1.);
- Law of equilibrium and relativistic mechanics.
1. How perceptions can split a space into space and time.
Articulating intuitive representations
In this section, we shall express the logical articulation between
- space-time geometry (as a mathematical theory aiming to
describe the physical world in itself)
- the subjective perceptions of time and space by individual
observers.
We are already used to perceiving the ordinary 3-dimensional
Euclidean space, as split between the following dimensions:
- Vision (2-dimensional)
- Depth (1-dimensional)
We still understand it as one space, through our skill of
articulating these 2 perceptions in our imagination when objects are
rotated, reflecting the change of direction of view to an object
which divides its dimensions between both perceptions.
In principle, similar articulations of the intuitions of space and
time representing a given phenomenon, may form an understanding of
space-time. The work of training our intuition to properly figure
out such transformations may ultimately be useful but, for lack of
daily experience to support it, it would be too hard a challenge for
a first learning step: an awkward choice unfortunately followed by traditional courses on the topic.
Instead, let us represent space-time by our usual intuition of
space: imagine physical reality as a "timeless" world extended in
an (n+1)-dimensional space (playing the role of
"space-time"). Then, time will be introduced as a feeling of
observers living "outside" that reality, which they perceive as an
n-dimensional world evolving along their subjective
(one-dimensional) "time".
This is not meant as any philosophical proposition on the nature
of time (any ontological question is off topic here), but:
- The geometric view forms a clearer, more intuitive initiation to
the mathematical structure of the theory, in better conformity with how
Special Relativity effectively serves as a foundation for the next steps
of theoretical physics
(relativistic mechanics, electrodynamics, general relativity and quantum
field theory).
- Our form of introduction of time is the more precise way to give Relativity
its quality of a physical
theory, in the logical positivist sense of a prediction tool about the
objects of our perceptions: a theory needs articulations between
its mathematical content and the familiar language of perception,
in order to express predictions about perceptions, with clear
criteria of experimental verification.
Basic statements in geometric terms
As a trick to start understanding Relativity bypassing the
mathematical effort of learning a different geometry, let us first
focus on a restricted set of concepts and phenomena using aspects of
space-time geometry equally present in Euclidean geometry (starting with
arc lengths, then curvature... ). For these aspects, the
description becomes intuitive but remains conceptually correct
(despite its oddity) if, in the guise of space-time, we imagine a
Euclidean space instead of the actual Minkowski geometry,
and we ignore any experiments involving light itself (which have
no equivalent in Euclidean geometry, since light has special properties
in Minkowski geometry with no Euclidean analogue).
Here are the core ideas, from which others will be derived:
- Any "particle", and generally any "small persisting object"
(as described by usual space and time perceptions), is, in
space-time reality, extended as a curve in the time dimension,
called its world line.
For example, the space-time extension of a human with height 1.8
meters, who can perceive time with a minimal resolution of about
0.06 second, can be seen as a mere line in space-time to a very
good approximation: this time resolution (the length resolution of
his world line) is larger than his size (the width of his world
line) by a factor of about
c × 0.06 s / 1.8 m = c / (108 km/h) ≈ 10⁷.
Even the Earth can be seen as "small", as its size is less than one
tenth of a light-second.
- A clock is such an object, whose world line displays a
measure of its structure as (a segment of) an oriented affine line
(= 1-dimensional affine space) whose lengths are geometrically
defined as the arc lengths
of this line in space-time, typed as amounts of time (= counted
in seconds or any other "time unit"; this measure may consist in
successive marks counting regular intervals of arc lengths).
- The segment of affine line made of the world line of an
observer, is identified with the segment of his personal
(subjective) time line. The time lines of
different observers are thus basically independent of each other
(they may only compare their lengths and be related by geometric
structures of space-time).
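The order-of-magnitude estimate given for the first core idea can be checked with a few lines of arithmetic. The following is only a sketch: the function name is hypothetical, and the Earth's mean diameter (about 12,742 km) is a value assumed here, not one given in the text.

```python
# Speed of light in vacuum, in metres per second.
C = 299_792_458.0

def world_line_aspect_ratio(time_resolution_s, size_m):
    """Length resolution of a world line (time resolution converted to
    metres with C) divided by its width (the spatial size of the object)."""
    return C * time_resolution_s / size_m

# Human of height 1.8 m with a time resolution of 0.06 s (values from the text).
human_ratio = world_line_aspect_ratio(0.06, 1.8)

# Earth: mean diameter of about 12,742 km (assumed value), in light-seconds.
earth_light_seconds = 12_742_000.0 / C

print(f"human world line ratio: {human_ratio:.2e}")
print(f"Earth size in light-seconds: {earth_light_seconds:.4f}")
```

The first number comes out of order 10⁷, and the second stays below one tenth, in line with the two claims above.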
To describe this intuitively: "during his subjective time", the
observer's consciousness is traveling through space-time (which is
"timeless"), along his world line, at a constant "speed",
like a train would follow a railway at constant speed or like a
light impulse follows an optical fiber. Thus, when an observer keeps
a clock with him (their world lines coincide), they agree about
"time" (giving it the same affine line structure) : he sees his
clock proceed at the "normal" regular rhythm.
This "speed" is the conversion factor between subjective times and
space-time distances, that (for a technical reason which will appear
later) should not be identified as c but will have another
name c'. It is a mere immaterial constant that cannot be
measured, since the subjective feeling of time by human
observers is not a physical quantity. Instead, it can only be
concretely fixed by a definition of "time" from clocks, thus
actually coming from the physical properties of objects with respect
to the lengths of their world lines: ultimately, the value of the
constant c' is purely conventional.
Curvature of world lines and acceleration
Both geometries of Euclid and Minkowski have a clear concept of
(extrinsic) curvature of
curves such as world lines. In our intuitive figuration of
subjective time, the curvature of a world line defines an
acceleration, in an orthogonal direction, of the mind of an
observer who "follows it at a constant speed" c'.
As both quantities (curvature and acceleration) are of different
kinds, they are not formally confused but proportional, with a
factor determined by c' :
Acceleration = c'² × curvature
World lines are straight where their curvature is constantly
zero, i.e. the object remains non-accelerated (if the
cancellation of the curvature is not persistent, but just happens at a point,
where it usually changes sign, this point is an inflection
point).
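As a numerical illustration of the proportionality between acceleration and curvature, here is a minimal sketch in the Euclidean model of space-time. It assumes a world line that is an arc of a circle of radius R (so its curvature is 1/R), followed at the constant "speed" c'; the function names and the sample values of c' and R are illustrative assumptions.

```python
import math

def position(T, c_prime, R):
    """Point of a circular world line of radius R reached at subjective
    time T. Both coordinates (u, x) are lengths; arc length s = c_prime * T."""
    s = c_prime * T
    return (R * math.sin(s / R), R * (1.0 - math.cos(s / R)))

def acceleration_norm(T, c_prime, R, h=1e-4):
    """Norm of the second derivative of the position with respect to T,
    estimated by a central finite difference with step h."""
    p_minus, p0, p_plus = (position(T + k * h, c_prime, R) for k in (-1, 0, 1))
    second = [(a - 2.0 * b + c) / h**2 for a, b, c in zip(p_minus, p0, p_plus)]
    return math.hypot(*second)

c_prime, R = 3.0, 50.0          # arbitrary sample values
curvature = 1.0 / R
estimate = acceleration_norm(0.7, c_prime, R)
expected = c_prime**2 * curvature

print(estimate, expected)       # the two values should agree closely
```

The finite-difference estimate is only approximate, so the comparison should be made with a tolerance rather than by exact equality.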
Physically, acceleration is indeed a measurable intrinsic property:
accelerated objects or observers perceive their own acceleration in the form
of a fictitious
force, precisely measured using an accelerometer.
The laws of mechanics will imply that objects are non-accelerated
as soon as they are isolated, i.e. not interacting with any
external object or field (mainly the electromagnetic field; the
gravitational field as expressed in General Relativity will be an
exception to this rule).
The inconvenience of traditional Special
Relativity courses
Traditional Relativity courses unfortunately follow a naive but awkward
path: as if it were natural or necessary, they undertake to formulate the
theory by expressing space-time in the language and intuition
of distinct space and time, as naively imagined from perceptions (and
as they were assumed to be before the discovery of the theory), and
directly identified with the use of coordinates x,y,z of space,
t of time, and speed v defined by x = vt.
They keep following this way with its historical roots, in the name of:
- A flawed sense of "physicality" as if space and time had to be a priori
admitted as "fundamental substances", or as if the relativity principle and
the constancy of the speed of light, had to be seen very "physical" and a
priori privileged as axioms to start with while expressed in terms of distinct
intuitions of space and time, rather than some more abstract, mathematical
formulation of the same or other concepts (such as that of "automorphisms"). No matter
that, ironically, the main lesson that will come from the theory (its real
meaning) will precisely dismiss the relevance of these initial formulations
(binding space with time as inseparable dimensions, and even dismissing
the familiar, 1-dimensional intuition of time as a non-physical attribute of
space-time). This flawed trend resembles the usual bad habits of philosophers lost in vain metaphysics and essentialism,
mistaken for ways to stick to reality. This traditional style of metaphysics
had to be rejected in favor of logical positivism as a better
epistemological basis for modern science. We need not feel
concerned that many philosophers of science still have not understood this, but it is
a pity to see physics teachers affected as well.
- "Pedagogical" approaches of "rediscovery" of theories from the
same naive irrelevant viewpoints by which they were initially
discovered, with a strange focus on the care to prove the uniqueness of
the kind of world satisfying some specially given axioms which mysteriously got a
favorite status of "physical principles" (why not care to prove its existence as well?).
Such a concern to provide a demonstration of
pseudo-rediscovery, presented as if it were the right and necessary way to share a
scientific mindset, is off topic anyway.
This is incidentally harmful in 2 ways:
- By diverting us from the primary (on-topic) concern
of providing an intuitive, understandable initiation to unfamiliar
concepts which would show their deep elegance, it wastes the efforts of students
by polluting their first efforts at understanding with irrelevant approaches and big, obscure formulas:
the Lorentz transformation formulas by which the theory is so expressed are
mostly irrelevant (the higher theories of physics based on Special Relativity
do not make any explicit use of them!), and wastefully complicated as they hide the
similarity with Euclidean geometry. In fact, almost the same formulas could be used
as an accurate but ridiculously complicated expression of Euclidean geometry, to be
only taken seriously by a caricature of a blind being trying to practice geometry
by force of complications without a natural intuition of it.
- Presenting the few old initial arguments hides the huge extent of the
scientific confirmations which have accumulated
since that time, and which can no longer be well summed up in that way.
This has left some cranks under the illusion that scientists are just dogmatically
trusting the initially proposed answers to old enigmas, and that all our
current confidence in our theories is still just based on those
steps, so that they could still be defeated just by "re-explaining" or relativizing
the strength of these century-old arguments. Such impressions are of course
wrong, but clarifying this requires improving pedagogy first.
All this gave the theory a reputation of opposing all intuition (a counter-intuitiveness
that some teachers can even feel proud of "demonstrating" !). Indeed,
this way of heavily using a language while
undertaking to prove its irrelevance, a way of decidedly sitting on the
branch that you are undertaking to saw off, results in
making it uncomfortable, once done, to figure out where you end
up... unless another, more relevant understanding is provided to
rely on.
But anyway it is no
teacher's job in the academic system, to ever care to develop
clear explanations, updating the lectures to better reflect
the current professional understanding of things.
2. The Relativity principle
Cartesian and affine coordinate systems
A conventional split of an n-dimensional space E between
n individual dimensions means a choice of a coordinate system.
The most usual kind of coordinate systems for Euclidean spaces, is
the Cartesian coordinates. A more general concept, that of affine
coordinate system, can be formalized in 2 ways:
- As a list of n coordinates, which are affine functions
from E into the set ℝ of real numbers (or more generally
affine lines): this defines a bijection from the space into ℝⁿ;
- As a frame, made of an origin O of E and a
basis of n vectors b₁, ..., bₙ
of its vector space: this defines the inverse bijection, from ℝⁿ
into E, as (x₁, ..., xₙ) ↦ O + x₁b₁ + ... + xₙbₙ.
Grouping O with b₁, they together define the first
axis of that frame as the straight line {O + x₁b₁ | x₁ ∈ ℝ}.
Then the real number x₁ may be replaced by the point
p = O + x₁b₁ of the first axis:
(p, x₂, ..., xₙ) ↦ p + x₂b₂ + ... + xₙbₙ.
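The two descriptions of an affine coordinate system, as a frame and as a list of coordinate functions, are inverse to each other. The following minimal sketch illustrates this in plain Python; the function names are hypothetical, and the inverse map is written only for the 2-dimensional case (using Cramer's rule) to keep the example short.

```python
def frame_to_point(origin, basis, coords):
    """Map coordinates (x1, ..., xn) to the point O + x1*b1 + ... + xn*bn."""
    point = list(origin)
    for x, b in zip(coords, basis):
        for i, component in enumerate(b):
            point[i] += x * component
    return tuple(point)

def point_to_coords_2d(origin, basis, point):
    """Inverse map in dimension 2: solve x1*b1 + x2*b2 = point - O."""
    (a, c), (b, d) = basis              # b1 = (a, c), b2 = (b, d)
    px, py = point[0] - origin[0], point[1] - origin[1]
    det = a * d - b * c                 # non-zero exactly when (b1, b2) is a basis
    return ((px * d - b * py) / det, (a * py - c * px) / det)

# A non-Cartesian affine frame: the basis is neither orthogonal nor normalized.
O = (1.0, 2.0)
B = [(2.0, 0.0), (1.0, 1.0)]

p = frame_to_point(O, B, (3.0, -1.0))
print(p)                                # (6.0, 1.0)
print(point_to_coords_2d(O, B, p))      # (3.0, -1.0)
```

Nothing here uses distances or angles: only the affine structure (an origin and a basis) is involved.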
Cartesian coordinates have 2 further requirements:
- Pairwise orthogonality of coordinates, which is equivalent to
the pairwise orthogonality of the basis vectors;
- All coordinates are faithful to the same unit of distance:
this can be expressed in 2 ways which are equivalent if the
pairwise orthogonality condition holds (but generally not
otherwise):
- Viewed in terms of coordinates, this means that all
coordinates, as functions from E to ℝ, measure by the
same unit the lengths in the lines defined as quotients of E
by these functions, and seen as inheriting the Euclidean
structure of E with the quantity types of its
distances; in other words, the linear parts of these functions
have the same norm (in the Euclidean vector space of linear
forms).
- Viewed in terms of frame, this means the norms of all
vectors b₁, ..., bₙ
are equal.
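In terms of frames, the two requirements for Cartesian coordinates can be tested directly on the basis vectors. This is a minimal sketch assuming the standard Euclidean dot product; the names `dot` and `is_cartesian` and the tolerance are illustrative choices.

```python
import math

def dot(u, v):
    """Standard Euclidean dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def is_cartesian(basis, tol=1e-12):
    """True when the basis vectors are pairwise orthogonal and share one norm."""
    norms = [math.sqrt(dot(b, b)) for b in basis]
    same_norm = all(abs(n - norms[0]) <= tol for n in norms)
    orthogonal = all(
        abs(dot(basis[i], basis[j])) <= tol
        for i in range(len(basis))
        for j in range(i + 1, len(basis))
    )
    return same_norm and orthogonal

print(is_cartesian([(1.0, 0.0), (0.0, 1.0)]))   # orthogonal, equal norms
print(is_cartesian([(2.0, 0.0), (0.0, 1.0)]))   # orthogonal, unequal norms
print(is_cartesian([(1.0, 0.0), (1.0, 1.0)]))   # not orthogonal
```

Only the first of these three frames satisfies both conditions.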
Inertial frames
Now let us examine another conventional split of dimensions, not
between individual dimensions, but between "time" (1-dimensional)
and "space" (3-dimensional, Euclidean), in a sort of compromise
between an observer's split between "time" and "space" perceptions,
and the mathematically convenient properties of affine or Cartesian
coordinate systems.
What an observer with his perceptions
may have in common with a coordinate system, is that if his
world line is straight (he is non-accelerated), then it may be taken as "time axis" of
a coordinate system. This axis has no preferred origin, but
has a well-defined "time vector" with orientation, and length
reflecting the conventional choice of a time unit. This length
measures, as time quantities, any space-time segment
in a time direction (parallel to the world line of the observer...or
not), by clocks whose world lines contain such a segment.
Time quantities will be treated as a distinct quantity type
from space lengths, but are clearly measurable from physical
phenomena which equally work for any observer.
To split space-time into "space and time", the data of the time axis
must be completed by a choice of 3D subspace of the space-time
vectors (playing a role similar to the choice of other basis vectors
of a frame), to be called "space vectors" and used for defining a
conventional relation of "simultaneity". The natural choice of such a space,
is the space of vectors orthogonal to the given time axis.
Let us call inertial frame the split of space-time into space
and time so defined (it will not coincide
with the direct perceptions of space and time by this observer,
but only approach it in the Galilean approximation). Its inverse
correspondence is the pair of projections:
- The "time" coordinate, defined as the projection on
the time axis, parallel to the space of "space vectors" (thus an orthogonal
projection if the space vectors are taken as orthogonal to the time axis):
this combines a time measurement with a simultaneity relation.
- The "position" of events, which is the quotient of space-time by
the direction of the time axis. It can be seen as projection to the
space of space vectors, parallel to the time axis; but the definitions
of the Euclidean structure on this vector space when seen as
subspace or as quotient, only coincide when this subspace is
the orthogonal to the time axis.
Relative speed
An object B is said to be at rest relatively to
(= in the frame of) another non-accelerated object A,
if B is also non-accelerated, and its (straight)
world line is parallel to that of A; equivalently,
the position of B in A's frame is constant.
Otherwise B is moving relatively to A.
The traditional way in physics to quantify the movement of an
object B relatively to an object A (the "difference
of direction" between their world lines), is its speed,
defined in A's inertial frame with components
(t,x) (where t is the time and x
is the space position), as the derivative (vector) of the position
x of B seen as a function of t.
Then B is non-accelerated when it is "seen" by A's
frame as a point "going at a constant speed" (in the sense of a
speed vector: constant amplitude and constant direction).
In A's frame, the movement of a non-accelerated object
B with constant speed v (space vector of
that frame), is written as (the equation of its world line)
x = a + tv
for some fixed space position a; the speed is zero when
objects are at rest relatively to each other.
In geometry, the "difference of direction" between lines is more
usually defined by their angle. The relative speed v
between objects is related to the angle α between the directions
of their world lines, by
|v| = c' tan α.
Thus, the speed of B relatively to A and vice-versa, have the same norm.
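The relation between relative speed and the angle of world lines can be sketched as follows, in the Euclidean model and with the time axis measured in length units (c' t). The function names and sample values are assumptions made for illustration.

```python
import math

def speed_from_angle(alpha, c_prime):
    """Relative speed |v| = c' * tan(alpha), for an angle alpha in radians."""
    return c_prime * math.tan(alpha)

def angle_from_speed(v, c_prime):
    """Angle between the time axis of A and the world line x = a + t*v of B,
    the time axis being measured in length units (c' * t)."""
    return math.atan2(abs(v), c_prime)

c_prime = 1.0
alpha = angle_from_speed(0.3, c_prime)
print(alpha, speed_from_angle(alpha, c_prime))   # round trip gives back 0.3
```

Since the angle between two lines is symmetric, both objects obtain the same value of |v| for each other, as stated above.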
The above definition of the acceleration of an object at a given event by
the curvature of its world line, coincides with the one as derivative of the
speed defined in the inertial frame whose time axis is the tangent straight
line to this world line at that event.
The Relativity principle
The Relativity principle can be
expressed in 2 ways, depending on the language used to describe
space-time:
- In the language of space and time, the laws of physics appear
the same for all non-accelerated observers; speed only qualifies an
object relatively to another object.
- In the language of Minkowski or Euclidean geometry,
all time directions (possible directions of world lines)
are similar, i.e. any one can be moved to any other one by some "rotation"
which is an automorphism of space-time geometry.
Moreover, for any inertial frame whose simultaneity is defined by
orthogonality to the time axis, "space" simply appears as a
3-dimensional Euclidean space, with no privileged direction
(its rotations form a subgroup of the group of automorphisms of
space-time that preserve the direction of the time axis).
3. Relativistic effects vs the Galilean approximation
The goal of this section will be to deduce from the above
construction with the Euclidean model of space-time, that
- When the relative speeds of objects are small compared
to c', properties of "space" and "time" approach a limit behavior,
that of a space-time system (geometry) called a classical or
Galilean space-time (which we shall not formalize here)
- Away from this limit, some differences in behaviors appear,
called "relativistic effects" or "paradoxes". They are
mere familiar features of geometry, but become strange-looking when
applied to space-time and re-expressed in terms of space and time
perceptions. For relative speeds v << c',
the amplitude of these effects is roughly proportional to a
factor of (v/c')².
Rapidity
The advantage of expressing movement by angles instead of any
other function such as speed, is that angles are additive (which is what
the concept of angles is based on): for any 3 directions A,B,C in
the same plane, the angle from A to C is the sum of angles from A to B
and from B to C. But we still have to choose a unit to express them
numerically.
Let us call rapidity the measure of angles in the unit such that it
approximately coincides with speeds when they are small, i.e. in the
approximation of the Galilean space-time: in the framework of a
Euclidean space-time, this unit is (1/c') radians. Namely:
- An accelerated body towards a fixed space direction, gains a rapidity
equal to the integral of its acceleration over its subjective (clock) time;
- Rapidity φ and speed v are related by
v = c' tan(φ/c') ≈ φ (1 + φ²/(3c'²)).
(This definition of "rapidity" differs from its use by others which, according
to Wikipedia, gives it a unit similar to radians for Minkowski geometry.)
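The relation between rapidity and speed, and the additivity of rapidities, can be explored numerically. The sketch below stays within the Euclidean model of space-time used in this section; all function names are hypothetical. In this model, adding rapidities corresponds to the tangent addition formula, so two speeds v₁ and v₂ along the same direction compose into (v₁ + v₂)/(1 − v₁v₂/c'²).

```python
import math

def speed_from_rapidity(phi, c_prime):
    """v = c' * tan(phi / c') in the Euclidean model."""
    return c_prime * math.tan(phi / c_prime)

def rapidity_from_speed(v, c_prime):
    """Inverse relation: phi = c' * atan(v / c')."""
    return c_prime * math.atan(v / c_prime)

def series_approx(phi, c_prime):
    """Small-rapidity approximation phi * (1 + phi^2 / (3 c'^2))."""
    return phi * (1.0 + phi**2 / (3.0 * c_prime**2))

def compose_speeds(v1, v2, c_prime):
    """Speed obtained by adding the two rapidities (Euclidean model only)."""
    total = rapidity_from_speed(v1, c_prime) + rapidity_from_speed(v2, c_prime)
    return speed_from_rapidity(total, c_prime)

c_prime = 1.0
print(speed_from_rapidity(0.1, c_prime), series_approx(0.1, c_prime))
print(compose_speeds(0.3, 0.4, c_prime), (0.3 + 0.4) / (1.0 - 0.3 * 0.4))
```

For small rapidities the exact value and the approximation nearly coincide, and the composed speed is close to the plain sum of speeds, which is the Galilean behavior.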
The twin paradox
Two twins are initially together with the same age (or we can
equivalently consider 2 synchronized clocks), then separate to follow
different trips and then meet again: they may no longer have the same age.
They may have spent different amounts of (subjective)
time to cross space-time from one point (event) to another, following
their different paths in space-time between the events of separation and
reunion, while always following them at the same speed c'.
This is because in geometry, between 2 given points, different lines
have different lengths. Between 2 successive events, the measure of time
length by the object with straight world line (segment), equals the time
distance between them. In a Euclidean space-time, this length would
be the shortest, while other curves can be arbitrarily longer.
A generally simple way to calculate how the time length of a curved
world line between 2 events differs from that of the straight line,
consists in analyzing it in the inertial frame of that straight line: the
time coordinate t of that frame is related to the subjective time
T of the (accelerated) traveler and his rapidity φ with respect
to that frame at that time, by the differential equation
dt/dT = cos(φ/c') ≈ 1 − φ²/(2c'²).
So, this "effect of movement" on time remains small when rapidities are small
compared to c'.
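The differential equation above can be integrated numerically for a given trip. This is a minimal sketch in the Euclidean model: the traveler is assumed to keep a rapidity +φ during the first half of his subjective time and −φ during the second half (an idealized instantaneous turnaround), and the function names and sample values are illustrative.

```python
import math

def frame_time_elapsed(rapidity_profile, total_T, c_prime, steps=10_000):
    """Integrate dt/dT = cos(phi(T) / c') over [0, total_T], midpoint rule.

    rapidity_profile maps the traveler's subjective time T to his rapidity
    with respect to the inertial frame of the straight world line."""
    dT = total_T / steps
    return sum(
        math.cos(rapidity_profile((k + 0.5) * dT) / c_prime) * dT
        for k in range(steps)
    )

c_prime, phi, total_T = 1.0, 0.5, 10.0

def there_and_back(T):
    """Rapidity +phi on the way out, -phi on the way back."""
    return phi if T < total_T / 2 else -phi

t = frame_time_elapsed(there_and_back, total_T, c_prime)
print(t, total_T)   # in the Euclidean model, t is smaller than total_T
```

Here the stay-at-home twin's time t is shorter than the traveler's time T, as expected when the straight line is the shortest path.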
As an "effect" of Euclidean geometry in pure space experience,
when a long vehicle is observed progressing laterally (for example
from left to right), it seems to become shorter when turning to
the depth dimension (towards or away from the observer).
Classical time vs. relativity of simultaneity
The classical concept of time is that all inertial frames have the same time
coordinate. Between two inertial frames with rapidity φ between their time
axes, let us express the time coordinate t' of one in terms of the
coordinates (t,x) of the other, where x, orthogonal to
t, is the space coordinate in the direction of movement (so that they
suffice to express t' while the remaining dimensions stay unaffected):
t' = cos(φ/c') t + (sin(φ/c')/c') x
where the coefficients cos(φ/c') and sin(φ/c')/c' are
respectively the t' measure of the
time and space basis vectors of the (t,x) frame.
Time coordinates of both frames are close to each other when
φ << c', as the first coefficient is then close to 1 and the second
coefficient is small: sin(φ/c')/c' ≈ φ/c'². Away from this limit, they differ: while the
effect on the first coefficient represents the previously mentioned time distortion,
the second represents the effect of relativity of simultaneity.
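The formula for t' can be tried on sample events to see both effects separately. This sketch again stays within the Euclidean model; the function name and the numerical values are assumptions chosen for illustration.

```python
import math

def other_frame_time(t, x, phi, c_prime):
    """Time coordinate t' of the event (t, x), for a frame at rapidity phi:
    t' = cos(phi/c') * t + (sin(phi/c') / c') * x."""
    a = phi / c_prime
    return math.cos(a) * t + (math.sin(a) / c_prime) * x

c_prime, phi = 1.0, 0.2

# Two events that are simultaneous (t = 0) in the first frame ...
left = other_frame_time(0.0, -1.0, phi, c_prime)
right = other_frame_time(0.0, 1.0, phi, c_prime)
print(left, right)      # ... get different times t' in the second frame

# An event on the time axis of the second frame: x = c' * tan(phi/c') * t.
t = 3.0
on_axis = other_frame_time(t, c_prime * math.tan(phi / c_prime) * t, phi, c_prime)
print(on_axis, t / math.cos(phi / c_prime))   # agrees with dt/dT = cos(phi/c')
```

The first pair of values shows the relativity of simultaneity; the last line shows that the same formula reproduces the time distortion along the moving world line.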
Correspondingly in the above pure space experience, both wheels of the
vehicle which were hiding each other (such as both front wheels,
or both back wheels), then appear offset from each other.
Length contraction
As an "effect" of Euclidean geometry, defining the width of a sausage by the size of its
sections, the sausage becomes wider when cut along a plane that is not orthogonal to its axis.
The corresponding (opposite) relativistic effect is the famous effect of "length contraction"
of moving objects. However this is only what comes from some abstract calculations,
with hardly any practical case of experiment where this plays a role, since the act of
simultaneously recording the positions of both sides of a moving object in a given frame,
is usually quite unnatural. In fact, in the most natural way we might consider doing so,
that is "taking a picture" of a moving object as seen from the direction perpendicular to its
trajectory, this object does not appear contracted but turned, in such a way that the
considered "length contraction" looks like the perspective effect of looking at the object's
length from an angle. This will be explained with
the actual study of the geometric properties of light and visual appearances.
4. Minkowski geometry
The study of space-time with its geometry of Minkowski may be alternatively (and
still equivalently) expressed in 3 ways:
- A rigorous approach as a mathematical theory (modified from a
formalization of Euclidean geometry).
- A geometric approach based on false figures
- An analytic approach based on converting formulas from the Euclidean case
The geometric approach
A famous quote is "Geometry is the science of correct
reasoning on incorrect figures". This interestingly applies
to the understanding of the Minkowski geometry. Indeed, the
similarities between geometries of Euclid and Minkowski and the
fact that affine geometry is the same underlying both, means that
the representations of (intended figures in) space-time by our
ordinary intuition of space (drawn figures on paper), just behave
like studies of Euclidean geometry on figures that are distorted
as resulting from an affine transformation (between the intended
space to study and the space of representation). Namely, the
representation (drawing) is faithful concerning all affine structures
(properties of figures expressed in the language of affine
geometry), but wrong as concerns Euclidean structures
(circles, angles etc).
- Understand affine geometry, which underlies both geometries of
Euclid and Minkowski, and how it differs from Euclidean geometry. You can
see it in (the green
parts of) this introduction
to the foundations of geometry.
- Study how the remaining structures are related together (definable
from each other) by the same or similar statements in both geometries
of Euclid and Minkowski, with just some differences in the resulting properties.
So, to see how it works, we can first consider the exercise of
studying Euclidean geometry as represented by such distorted
figures, that correspond to the intended ones through an affine
transformation.
Then, let us redefine the distorted structures (but without the
trick of drawing another figure that is directly faithful),
starting with the notion of circle (for the 2-dimensional case).
Namely, imagine that an ellipse is given, with the information
"This is a circle" (For 3-dimensional geometry we need to take an
ellipsoid to declare it as a "sphere"). This suffices to know what
are all the other circles : they are those obtained from this one
by translations and dilations.
Other structures are defined from it, by the formulas that relate
the different Euclidean structures together.
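As an illustration of how one declared "circle" fixes the other Euclidean structures, here is a minimal Python sketch. It assumes the drawn ellipse is x^2/A^2 + y^2/B^2 = 1 in the coordinates of the paper; the semi-axes A, B and the function names are illustrative choices, not part of the text above.

```python
import math

# Assumption: the ellipse x^2/A^2 + y^2/B^2 = 1 drawn on paper is the one
# declared to be "the unit circle" of the intended Euclidean plane.
A, B = 3.0, 1.0  # illustrative semi-axes of the drawn ellipse


def inner(u, v):
    """Inner product for which the declared ellipse is the unit circle."""
    return u[0] * v[0] / A**2 + u[1] * v[1] / B**2


def length(u):
    """Declared length of a vector (not its length as measured on paper)."""
    return math.sqrt(inner(u, u))


def on_declared_circle(p, center, radius, tol=1e-9):
    """True if p lies on the declared circle with this center and radius."""
    d = (p[0] - center[0], p[1] - center[1])
    return abs(length(d) - radius) < tol


# A point of the drawn ellipse has declared length 1.
p = (A * math.cos(0.7), B * math.sin(0.7))

# Translating and dilating the ellipse gives another declared circle.
center, k = (5.0, -2.0), 2.5
q = (center[0] + k * p[0], center[1] + k * p[1])

# Conjugate directions of the ellipse are the declared right angles.
u = (A * math.cos(0.7), B * math.sin(0.7))
w = (-A * math.sin(0.7), B * math.cos(0.7))

print(length(p), on_declared_circle(q, center, k), inner(u, w))
```

The directions u and w do not look perpendicular on paper, yet their declared inner product vanishes: the drawing is faithful for affine structures only.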
Now, the geometry of Minkowski can be obtained by the same
method, with almost the same list of structures (language) as
Euclidean ones, related together (definable from each other) by
the same rules, with a difference expressible in this way: there,
circles are no longer particular ellipses (for the affine notion of
ellipse), but particular hyperbolas. So, instead of giving an
equivalence class of ellipses (or ellipsoids for dimensions beyond 2) for the
equivalence relation of correspondence by translations and dilations,
we must give a class of hyperbolas (hyperboloids) to be
declared the "circles" (spheres). Not all things work the same
(for example, circles no longer have a finite area nor a finite
length), but almost: we just need to adapt things a little bit.
The analytic approach
Let us present a
simple and efficient method to convert the predictions and
"effects" from a geometry to the other.
Switching between
geometries of Euclid and Minkowski for space-time keeps the same
list of resulting "relativistic effects" (deviations from the
Galilean space-time), whose order of magnitude is that of
c'^-2 (where c' is the
ratio between space and time units), but reverses their signs.
For example, in the twin paradox, while the traveler would become
older than his brother in a Euclidean space-time, he becomes younger
in a Minkowski space-time. Indeed, in the Minkowski geometry,
the straight world line (of the non-accelerated twin, staying
on Earth) is the longest possible world line (time-like curve, measured
in time length) between 2 successive events, with time length equal to their distance.
In fact, not only
the first approximation of effects has the opposite amplitude
(−c^-2 instead of c'^-2), but
even the exact formulas of "space and time" behavior can be converted
between the cases of geometries of Euclid and Minkowski for
space-time, by the formal substitution c'^2 = −c^2.
Applying this to trigonometric functions
of the rapidities in the sense of their analytic
continuation (as functions of c'^-2) changes
them into hyperbolic functions:
cos(φ/c') = ch(φ/c)
c' sin(φ/c') = c sh(φ/c)
c' tan(φ/c') = c th(φ/c)
In particular, as ch(φ/c)>1 the effect on time of a moving
object is time dilation.
The mathematical definitions of the hyperbolic functions will come in the detailed study
of the Minkowski plane, with tools interestingly similar with the way complex
numbers can be used to study Euclidean plane geometry and define
trigonometric functions.
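The three conversion formulas above can be checked numerically. The following is only a sketch: it picks the root c' = i c of c'^2 = −c^2, works by default in units where c = 1, and uses the name substitution_gaps, all of which are assumptions made for the illustration. Python's cosh, sinh and tanh are the functions written ch, sh and th here.

```python
import cmath
import math


def substitution_gaps(phi, c=1.0):
    """Gaps between both sides of the three formulas, taking c' = i*c."""
    c_prime = 1j * c  # one root of c'^2 = -c^2 (assumption of this sketch)
    a = phi / c_prime  # the complex argument phi/c'
    gap_cos = abs(cmath.cos(a) - math.cosh(phi / c))
    gap_sin = abs(c_prime * cmath.sin(a) - c * math.sinh(phi / c))
    gap_tan = abs(c_prime * cmath.tan(a) - c * math.tanh(phi / c))
    return gap_cos, gap_sin, gap_tan


# All three gaps should vanish up to rounding errors.
print(substitution_gaps(0.8))
```

Choosing the other root c' = −i c gives the same results, since each of the three expressions depends on c' only through c'^2.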
The rigorous approach
For people already familiar with linear
algebra, the definition goes as follows: it is a 4-dimensional affine
space whose space of vectors is endowed with a symmetric bilinear
form with signature (3,1). For others, I started to offer
a long path of preliminaries through different
aspects of the foundations of mathematics, the
foundations of geometry,
detailed study of
affine geometry, and vector spaces
(sorry, I have not yet completed the long path to there that I wish to
write).
On the basis of affine geometry, the remaining structures of either geometry
of Euclid or Minkowski come as defined
from the one which is the most fundamental, as it is the one actually involved in
most of the fundamental formulas of higher level theoretical physics:
the inner product. (A few other physics formulas use another structure
which is somehow even more fundamental than the inner product,
but is not familiar : the spinor spaces).
Its study takes the following steps:
- Introduce the notion of inner product on a vector space, that
is a non-degenerate, symmetric bilinear form. A quadratic
space is a vector space with a structure of inner product.
- Inner products in an n-dimensional vector space are
classified by their signature,
that is an ordered pair
of integers (p,q) with p+q=n.
This also classifies quadratic spaces, except that those with
signatures (p,q) and (q,p) are
essentially equivalent.
- An n-dimensional affine space (n>1) whose
space of vectors is quadratic is a Euclidean space if the
signature is (n,0); it is a Minkowski space if the
signature is (n-1,1) (or (1,n-1)), to mean that
there are "(n-1) space dimensions and 1 time dimension".
We can express Euclidean geometry
as an axiomatic theory, then modify there the axiom specifying
the signature, to formally obtain the geometry of Minkowski.
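To make the role of the signature concrete, here is a small Python sketch of an inner product in diagonal form. The conventions are assumptions of the sketch, not fixed by the text: the first p coordinates count positively and the last q negatively, so that with signature (3,1) the last coordinate is time, and units are such that c = 1. The names inner and kind are hypothetical.

```python
def inner(u, v, signature):
    """Diagonal inner product: + on the first p coordinates, - on the last q."""
    p, q = signature
    assert len(u) == len(v) == p + q
    plus = sum(a * b for a, b in zip(u[:p], v[:p]))
    minus = sum(a * b for a, b in zip(u[p:], v[p:]))
    return plus - minus


def kind(u, signature):
    """Classify a vector by the sign of its inner square."""
    s = inner(u, u, signature)
    if s > 0:
        return "space-like"
    if s < 0:
        return "time-like"
    return "light-like"


minkowski = (3, 1)  # 3 space dimensions and 1 time dimension
euclidean = (4, 0)

print(kind((1, 0, 0, 2), minkowski))   # time-like
print(kind((1, 0, 0, 0), minkowski))   # space-like
print(kind((3, 4, 0, 5), minkowski))   # light-like
print(inner((3, 4, 0, 5), (3, 4, 0, 5), euclidean))  # 50
```

Only the signature changes between the two calls on the vector (3, 4, 0, 5): its inner square is positive in the Euclidean case and zero in the Minkowski case.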
Infinitesimal rotations
An important part I still have to write...
Main ideas:
As the metric is a symmetric bilinear form on either the space of vectors or inversely on
its dual (which become non-equivalent in the degenerate case), infinitesimal rotations can be expressed
as derived from antisymmetric bilinear forms (if the metric was antisymmetric, i.e. in symplectic geometry,
infinitesimal rotations would be derived from symmetric bilinear forms).
In the vectorial 2-dimensional case there is only 1 dimension for antisymmetric forms, thus for rotations.
This transforms the metric into the vector field of rotational speeds.
Thus, in the Euclidean vector plane: the number i with square -1, the expression of rotations by a small angle a
by exp(ia), and the differential equations relating cos and sin.
What this becomes with a different metric: geometric presentation, and differential equations relating ch and sh.
But this hyperbolic case has something more: it can also be viewed as split into 2 separate operations on reals, by taking the light coordinate system.
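The last remark can be illustrated by a short Python sketch, under assumptions that are mine: units where c = 1, a rotation of the Minkowski plane written with ch and sh, and light coordinates taken as u = t + x and w = t − x. In these coordinates the rotation by a rapidity phi acts as two separate multiplications on reals, by exp(phi) and exp(−phi).

```python
import math


def boost(t, x, phi):
    """Rotation of the Minkowski plane by rapidity phi (units with c = 1)."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    return t * ch + x * sh, t * sh + x * ch


def to_light(t, x):
    """Light coordinates (u, w) = (t + x, t - x)."""
    return t + x, t - x


t, x, phi = 2.0, 0.5, 0.3
u, w = to_light(t, x)
u2, w2 = to_light(*boost(t, x, phi))

# Each light coordinate is simply rescaled, and the product u*w is kept.
print(u2 / u, math.exp(phi))
print(w2 / w, math.exp(-phi))
print(u2 * w2, u * w)
```

The product u*w = t^2 − x^2 is unchanged, which is the invariance of the metric under the rotation.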
Example of relativity of simultaneity
We cannot consistently synchronize clocks all along the equator of the Earth.
Starting from one point and synchronizing each clock with its neighbor all the
way around, where both ends of the line meet, the Western clock of the line
(synchronized with its East side) is ahead of the one synchronized with
its West side, with offset vL/c^2, where v = 465.1 m/s is the equatorial rotation
velocity, and L = 40,075 km is the length of
the equator.
Result: (465.1 m/s) × 40,075 km / c^2 = 2.074×10^-7 s.
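The arithmetic of this result can be reproduced with a few lines of Python. The values of v and L are those given above; the value of c is the exact SI one, and the function name sync_offset is just an illustrative choice.

```python
C = 299_792_458.0      # speed of light in m/s (exact SI value)
V_EQUATOR = 465.1      # equatorial rotation velocity in m/s (from the text)
L_EQUATOR = 40_075e3   # length of the equator in m (from the text)


def sync_offset(v, length, c=C):
    """Synchronization offset v*L/c^2, in seconds."""
    return v * length / c**2


offset = sync_offset(V_EQUATOR, L_EQUATOR)
print(f"{offset:.3e} s")
```

This agrees with the 2.074×10^-7 s quoted above, that is about 0.2 microseconds.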
5. The specific properties of space and time perceptions in
Minkowski geometry
Describe effects directly involving light or light speed, with
amplitudes in the order of magnitude of v/c (instead
of (v/c)^2), with no Euclidean analogue.
The distinction between space-like and time-like directions, added to the affine
geometry, is an equivalent presentation of the structure of the
geometry of Minkowski itself.
- Study in detail the 2-dimensional Minkowski geometry
(one space dimension with time): describe its rotations and the
amplitudes of the Doppler effects; introduce the
hyperbolic functions that play the same role there as the
trigonometric functions play in Euclidean geometry.
- All relative speeds of objects are lower than c,
while this "speed" of light in the vacuum is not relative but the
same for all observers (fixed by the laws of physics).
- Even information cannot travel "faster than" c.
Concretely, from a given source (event) to a given destination,
light signals going straight through the vacuum (if possible) always arrive
first (together with all electromagnetic and
gravitational waves).
- Between observers meeting on an event with relative speed,
visually perceiving space by light, the correspondence between
their spheres of vision is an isomorphism of inversive geometry,
while the behavior of depth (distance) is linked to the Doppler
effect.
Rapidity and speed
Attributing to a given physical movement some numerical value of
its "speed" requires specifying how quantities are physically measured
and how the value of the speed is computed from these measurements.
The speed of light c, like any speed, can be physically defined
as a ratio of a distance to a time (the values of the space and time
coordinates in a given frame), by the following procedure:
When a beam of light makes a round trip between two objects A
and B at relative rest and with distance d from each other
(from A to B and then back from B to A, for example
the Earth and the Moon), a clock that remained on A measures the time
interval 2d/c between sending the beam and getting it back.
In the geometry of Minkowski, the speed v is related to
the rapidity φ by v = c th(φ/c).
So, while values of rapidity can grow without limit (for example
in the case of a constant acceleration, up to values much higher than c),
v only approaches c (exponentially): the "speed of light"
somehow works as a horizon (like an infinity) that can be approached but
not reached.
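A short numerical sketch of this relation, in Python. The function names are hypothetical, and the composition law used for comparison is the standard formula for collinear speeds, which is not stated in the text above. The sketch shows that rapidities simply add while the resulting speed stays below c.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def speed_from_rapidity(phi, c=C):
    """v = c th(phi/c)."""
    return c * math.tanh(phi / c)


def rapidity_from_speed(v, c=C):
    """Inverse relation, defined for |v| < c."""
    return c * math.atanh(v / c)


def compose_speeds(v1, v2, c=C):
    """Standard composition of two collinear speeds (not from the text)."""
    return (v1 + v2) / (1 + v1 * v2 / c**2)


# The rapidity may exceed c many times over; the speed still stays below c.
for n in (0.5, 1, 2, 5, 10):
    print(n, speed_from_rapidity(n * C) / C)

# Adding rapidities matches composing speeds.
v1, v2 = 0.6 * C, 0.7 * C
phi_sum = rapidity_from_speed(v1) + rapidity_from_speed(v2)
print(compose_speeds(v1, v2) / C, speed_from_rapidity(phi_sum) / C)
```

For large rapidities the gap c − v behaves like 2c exp(−2φ/c), which is the exponential approach mentioned above.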
If we were to discuss the relative
movement between 2 galaxies in the universal
expansion taking them at the same age (instead of how the
one is actually seen by the other), it would be better expressed by
their relative rapidity, also abstractly defined as approached by the sum of
relative speeds between neighbors in a chain of aligned galaxies, all
taken at the same age. This can take any value without limit as long as the cosmological
principle holds. In this sense, we might say that "the visible
Universe expands much faster than the speed of light".
Twin paradox
Not all curves of space-time can be world lines of massive particles, but only those
whose tangent direction at each point is time-like; moreover, only these have a well-defined
length, which is positive. Such world lines connecting 2 given successive events can
be arbitrarily shorter than the time distance between ends, with possible time lengths down to zero.
The zero length is approached by world lines of massive particles with high rapidities,
i.e. with speeds approaching c; it can only be reached by the world lines of
particles "going at the speed of light" away and then
back. The directions with speed c (also with a time orientation, from past to
future) are called the light directions
in space-time, a concept which is independent of the inertial frame in which
speeds are measured. Such particles are the massless particles.
The most common of these are the photons (particles of light).
As photons have no electric charge, they
have speed c only when they travel in the vacuum (possibly through an electromagnetic
field, but with no electric charge around), in which case they have no interactions (except being
emitted, reflected and absorbed), so that their world line is straight there (or a succession
of segments separated by reflection events).
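The comparison of time lengths in the twin paradox can be sketched numerically in Python. The setting is an illustration under stated assumptions: one space dimension, units where c = 1 (and c' = 1 for the Euclidean analogue), a traveler going away at speed 0.8 and back, and hypothetical function names.

```python
import math


def minkowski_time_length(dt, dx, c=1.0):
    """Time length of a straight time-like segment."""
    s2 = dt**2 - (dx / c) ** 2
    if s2 < 0:
        raise ValueError("segment is not time-like")
    return math.sqrt(s2)


def euclidean_length(dt, dx, c_prime=1.0):
    """Length of the same segment if space-time were Euclidean."""
    return math.sqrt(dt**2 + (dx / c_prime) ** 2)


def path_length(events, segment_length):
    """Sum of the lengths of the segments joining successive events (t, x)."""
    return sum(
        segment_length(t1 - t0, x1 - x0)
        for (t0, x0), (t1, x1) in zip(events, events[1:])
    )


stay_home = [(0.0, 0.0), (10.0, 0.0)]
traveler = [(0.0, 0.0), (5.0, 4.0), (10.0, 0.0)]  # speed 0.8 away, then back

print(path_length(stay_home, minkowski_time_length))  # 10.0
print(path_length(traveler, minkowski_time_length))   # 6.0
print(path_length(traveler, euclidean_length))        # longer than 10
```

In the Minkowski case the broken line is shorter than the straight one (the traveler is younger), while the Euclidean analogue makes it longer. Letting the speed approach 1 brings the traveler's time length down towards zero.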
Visual appearances
Actually, direct perceptions do not split space-time into 2
sensations (nor 4 coordinates), but 3 components:
- Time (1-dimensional)
- Vision (2-dimensional)
- Depth (1-dimensional)
The most directly available way for an observer to define the "time"
of an event, is given by the time on his own clock when he visually
perceives it. But then, observers "at different places" may perceive
the same events in different orders, since, roughly speaking, "an
event is perceived earlier by the nearest observer", due to the
finite speed c of light propagation.
The concept of inertial frames we first introduced (where
different observers agree on the time of events = their clocks are properly synchronized, if
they are at rest relative to each other)
amounts to first taking this visual perception to label events by
space and time coordinates (labeling events by the time of
perception by an observer), then correcting it by taking account of
the delay of perception due to the distance to the observer.
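This correction can be sketched in Python for two observers at rest relative to each other. The setting is an assumption made for illustration: one space dimension, units where c = 1, two events and two observer positions chosen arbitrarily, and hypothetical function names.

```python
def perceived_time(event, observer_x, c=1.0):
    """Time at which an observer at rest at observer_x sees the event (t, x)."""
    t, x = event
    return t + abs(x - observer_x) / c


def corrected_time(seen_at, distance, c=1.0):
    """Time attributed to the event after removing the light delay."""
    return seen_at - distance / c


e1 = (0.0, 0.0)    # event 1: t = 0 at x = 0
e2 = (1.0, 10.0)   # event 2: t = 1 at x = 10
observers = {"A": 0.0, "B": 10.0}

seen = {
    name: (perceived_time(e1, x), perceived_time(e2, x))
    for name, x in observers.items()
}
corrected = {
    name: (
        corrected_time(seen[name][0], abs(e1[1] - x)),
        corrected_time(seen[name][1], abs(e2[1] - x)),
    )
    for name, x in observers.items()
}

print(seen)       # A sees event 1 first, B sees event 2 first
print(corrected)  # after correction both attribute times 0 and 1
```

The two observers perceive the events in opposite orders, but agree on their times once the delay due to distance is subtracted.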
6. Relativistic mechanics
Finally, Special Relativity is also the best framework to
introduce the Least
Action Principle, that is the foundation of relativistic
mechanics, which then reduces to the laws of classical
mechanics by approximation to the Galilean space-time. Here we
get the famous E=mc^2.
The role of straight lines in relativistic mechanics
The Minkowski geometry of space-time is involved in all laws of physics
(expressed as mathematical
theories) that accept it as framework, that is, when
gravitation is absent (neglected): in relativistic mechanics and
QFT.
But there is an aspect of relativistic
mechanics, that can be expressed only using the affine
geometry of space-time.
There, the notion of straight line, that is the basic structure of
affine geometry, appears in this way :
Any world line of an isolated particle (i.e. not subject to any
external force) is a straight line of space-time, i.e. is
non-accelerated.
Through the axioms of affine geometry, this determines the
larger set of all straight lines (there are other straight lines
that cannot be the world lines of any particles because they are
"faster than light", as will be explained later).
The old .pdf text to download
Sorry, this is just a draft written long ago, an unfinished translation
from the French version, and not recently updated. It focused too
much on philosophical preliminaries before entering the theory
(spending pages to say obvious useless things, so I myself practiced
the bad habits of philosophers, and don't like to remember
this... even if some of these remarks may also have a legitimate
place, but only elsewhere),
as compared to the way I would present things now. Here it is (.pdf,
210 kb, 18 pages).
(It was first written in French in
2002 with further parts written in 2003, then this translation into
English of some beginning of the text was done in August 2004).
1. Introduction
2. The strangeness of Relativity theory
3. New presentation of special relativity theory
3.1. The core logic of the theory
3.2. Link with the experience
3.3. Deformed study of the Euclidean plane geometry
Further parts in the French version, not translated :
3.4. Visions of the relativity of simultaneity with one
space dimension
3.5. Relativistic transformations of images
4. The mechanics of equilibrium, classical and relativistic
mechanics (including E=mc^2), phase space
5. Foundations of statistical mechanics
6. Introduction to quantum physics (including the EPR paradox)
Contact : trustforum at gmail.com