Tensor products

This page, still under construction, provides some technical details to justify more completely the formalism of tensors previously introduced.

There are several constructions of new dual pairs from given ones. In particular, like all vector spaces involved, any tensor space such as E⊠F needs to be seen as a component of a dual pair, namely (E⊠F,E'⊗F'). This and other constructions follow the same procedure, described as follows.

We start with a vector space H of ℝ-valued operations, thus a subspace of ℝ^X for some set X.
Here, X = E'×F' and H = E⊠F.
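Now let ϕ : X → H* be defined by ∀x∈X,∀y∈H, ϕ(x)(y) = y(x).
Finally, let H' be the subspace of H* generated by Im ϕ. This way, we have an operation ϕ : X → H'.
The fact that H' is indeed a possible dual of H is easily deduced from Im ϕ ⊂ H': any nonzero y∈H takes a nonzero value y(x) = ϕ(x)(y) at some x∈X, so that the elements of H' separate those of H.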
Thus in finite dimensional cases, H' = H*.
One reason to use this dual pair (H,H') instead of (H,H*) is that, when they differ in infinite dimensional cases, H' is anyway what we need for physics, while other possible elements of H* are often just "mathematical monsters" whose existence depends on the axiom of choice.
But the standard reason is that this is required for the tensor product to satisfy the universal property stated below.
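
As a small finite-dimensional illustration of this construction (a sketch under assumptions, not part of the original argument), take the hypothetical set X = {0, 1, 2} and let H be the subspace of ℝ^X spanned by two chosen functions. In the coordinates dual to that basis of H, each ϕ(x) is the list of values of the basis functions at x. The helper `rank` and the sample data are made up for the illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Hypothetical example: X = {0, 1, 2}, H spanned by two functions on X.
basis_of_H = [[1, 2, 3],   # values of the first basis function at x = 0, 1, 2
              [0, 1, 1]]   # values of the second basis function

def phi(x):
    """Evaluation form phi(x): y -> y(x), in coordinates dual to basis_of_H."""
    return [b[x] for b in basis_of_H]

dim_H = rank(basis_of_H)
dim_span_im_phi = rank([phi(x) for x in range(3)])
```

Here `dim_span_im_phi` equals `dim_H`, so in this example Im ϕ generates the whole of H*, in agreement with H' = H* in finite dimensional cases.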

Here, H' = E'⊗F' and ϕ is the tensor product ⊗ : E'×F' → E'⊗F'. It has the following two properties:

It is bilinear : ∀x∈E', ∀T∈ E⊠F, T_ij x_i ∈F and the same on the other side.
It is universal for this bilinearity : for any dual pair (G,G') and any bilinear operation g : E'×F' → G' there is a unique linear map h: E'⊗F'→ G' such that g = h⚬⊗. Indeed defining h by restriction of ^t(G ∋ z ↦ z⚬g) : (E⊠F)*→ G*, we get g = h⚬⊗ (why ?); then Im(h⚬⊗)⊂ G' implies Im h ⊂ G' because Im ⊗ generates E'⊗F' = Dom h, which also ensures the uniqueness of h.
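
The universality can be checked by hand in a finite-dimensional case. The following sketch assumes E' = ℝ², F' = ℝ³ and G' = ℝ², with a bilinear map g given by arbitrarily chosen coefficients c[k][i][j]; all names and numbers are hypothetical. An element of E'⊗F' is stored as a 2×3 array, and h is the linear map on such arrays with the same coefficients.

```python
# Coefficients c[k][i][j] of a hypothetical bilinear map g : R^2 x R^3 -> R^2.
c = [[[1, 0, 2], [0, 1, 0]],
     [[3, 1, 0], [0, 0, 1]]]

def g(x, y):
    """Bilinear map: g(x, y)[k] = sum over i, j of c[k][i][j] * x[i] * y[j]."""
    return [sum(ck[i][j] * x[i] * y[j] for i in range(2) for j in range(3))
            for ck in c]

def tensor(x, y):
    """The tensor product of x and y, stored as the array of products x[i] * y[j]."""
    return [[xi * yj for yj in y] for xi in x]

def h(t):
    """Linear map on 2x3 arrays: h(t)[k] = sum over i, j of c[k][i][j] * t[i][j]."""
    return [sum(ck[i][j] * t[i][j] for i in range(2) for j in range(3))
            for ck in c]

x, y = [1, 2], [3, 4, 5]
value_direct = g(x, y)               # g applied to the pair (x, y)
value_factored = h(tensor(x, y))     # h applied to the tensor product of x and y
```

Both computations give the same list, which is g = h⚬⊗ on this pair. The uniqueness of h shows up here in that its coefficients are forced by the values of g on pairs of basis vectors.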

The tensor product is more generally defined on any list of dual pairs: for example the tensor product of 3 dual pairs (E,E'),(F,F'),(G,G') is a dual pair (E⊗F⊗G, E'⊠F'⊠G') where essentially E⊗F⊗G=(E⊗F)⊗G=E⊗(F⊗G) is generated by E×F×G, and E'⊠F'⊠G' = (E'⊠F')⊠G'= E'⊠(F'⊠G') is a tensor space with order 3.
To justify these natural isomorphisms, note that our construction of a space of the form E'⊠F' is essentially unaffected when replacing E by one of its generating subsets, and the same for F. In particular, for any basis B⊂E, the space E' of all linear forms on E is identifiable as the space ℝ^B of all functions on B (without any more linearity condition), and the space E'⊠F' is essentially unchanged when replacing E by B.

There remains the issue of the difference between ⊗ and ⊠, which disappears for finite dimensional spaces.

[Figure: tensor diagram with labels g⊗f, 〈x, y〉 and 〈x,z〉]

Meaning of a tree-like tensor expression

A monomial tensor expression is a tree if between any 2 vertices there exists one and only one path that does not repeat any vertex (so as to forget paths that can be obviously shortened by cutting some edges). In other words it is a graph that :

is connected : it cannot be split without cutting some edge (existence of a path between any two vertices)
does not contain any loop (which would give different paths between some vertices). In particular it does not contain any edge with its two ends at the same vertex (which is otherwise admissible in other tensor expressions).

Such an expression can be interpreted as an algebraic expression in as many ways as there are vertices, edges and free ends in the graph; and all these ways give the same result in the expected space.
Indeed, every vertex, or edge, or free end, can be chosen to be seen as the main symbol of the expression (if a vertex it is an operation with values in ℝ; if an edge it is a scalar product; if a free end it is an operation symbol with vectorial value); every edge is marked by the orientation of the unique path to this "main symbol", and every vertex is interpreted as the operation symbol that follows these orientations.
All these interpretations give the same result because between any two choices of main symbol there is a path, and the result is preserved at every step of this path (from each edge to each next edge), thanks to the identities between the different algebraic interpretations of each symbol (vertex).
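
To see this agreement on the smallest interesting tree, consider the expression with three vertices x', T, y' and two edges, where T ∈ E⊠F, x' ∈ E' and y' ∈ F'. The sketch below assumes dim E = 2 and dim F = 3 and uses made-up components; it computes the result with three different choices of main symbol.

```python
def dot(u, v):
    """Scalar product of two lists of components."""
    return sum(a * b for a, b in zip(u, v))

T = [[1, 2, 0],
     [0, 1, 3]]          # hypothetical components T[i][j] of an element of E⊠F
xp = [2, 1]              # components of x' in E'
yp = [1, 0, 2]           # components of y' in F'

# Main symbol = the vertex T, read as a bilinear form on E' x F'.
as_form = sum(T[i][j] * xp[i] * yp[j] for i in range(2) for j in range(3))

# Main symbol = the edge between x' and T: scalar product of x' with T y' in E.
T_yp = [dot(row, yp) for row in T]
as_edge_E = dot(xp, T_yp)

# Main symbol = the edge between T and y': scalar product of x' T in F with y'.
xp_T = [sum(xp[i] * T[i][j] for i in range(2)) for j in range(3)]
as_edge_F = dot(xp_T, yp)
```

The three values coincide, each step from one reading to the next using one of the identities between the algebraic interpretations of T.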

Meaning of tensor expressions without loop

Any graph that is not connected is divided into several separate connected components in a unique way.

The result of such an expression is given by separately making the computation in each tree in the above way, then multiplying the results. This can either be seen as a multiplication between numbers, or between numbers and one vector, depending on the interpretation of the expression.
This possibility to interpret disconnected graphs can also be expressed as follows:

The injection from E⊗F to E⊠F

There is a natural map from E⊗F to E⊠F defined by mapping each (x,y) in E×F to the map sending each (x',y') in E'×F' to the real number obtained by multiplying in ℝ the scalar products : (x.x')(y.y'). Indeed for every (x,y) ∈ E×F, this map from E'×F' to ℝ has its curried forms defining x'↦(x.x')y from E' to F, and y'↦(y.y')x from F' to E, thus indeed belongs to E⊠F.
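
This map and its two curried forms can be written out as follows, in a sketch assuming E = ℝ² and F = ℝ³ with arbitrarily chosen vectors (all names are hypothetical).

```python
def dot(u, v):
    """Scalar product of two lists of components."""
    return sum(a * b for a, b in zip(u, v))

x, y = [1, 2], [3, 0, 1]     # hypothetical x in E = R^2 and y in F = R^3

def t(xp, yp):
    """Image of the pair (x, y) in E⊠F: the map (x', y') -> (x.x')(y.y')."""
    return dot(x, xp) * dot(y, yp)

def curried_left(xp):
    """Curried form x' -> (x.x') y, from E' to F."""
    return [dot(x, xp) * yj for yj in y]

def curried_right(yp):
    """Curried form y' -> (y.y') x, from F' to E."""
    return [dot(y, yp) * xi for xi in x]

xp, yp = [2, 1], [1, 1, 1]
value = t(xp, yp)
via_F = dot(curried_left(xp), yp)    # pass through F first, then pair with y'
via_E = dot(xp, curried_right(yp))   # pass through E first, then pair with x'
```

The three values agree, and both curried forms take their values in F and in E respectively (not in larger spaces), which is the condition for belonging to E⊠F.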

The rank

Every element t of a tensor product E⊗F has finite rank (in the sense of the rank of a matrix, not to be confused with the traditional use of this word for tensors, which we call here its arity or degree, here equal to 2), defined as the minimum number of elements of the form x⊗y (the rank 1 elements) whose sum gives t.
This rank is also equal to the dimension of the image of each of the two maps that t defines as an element of E⊠F, from E' to F and from F' to E.

Proof: The image in F of the map defined by x₁⊗y₁+...+x_n⊗y_n from E' to F, is contained in the subspace of F generated by y₁,...,y_n, thus its dimension is no larger than n.
If the y₁,...,y_n were not linearly independent, they could be decomposed in another basis with a smaller number of elements (made of some of them), providing another expression of that element of E⊗F as a sum of a smaller number of rank 1 elements. The same for the x₁,...,x_n. The linear independence of the x₁,...,x_n ensures that the image is equal to the subspace of F generated by y₁,...,y_n, that has dimension n.
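
The two cases of this proof can be observed numerically. In the sketch below (made-up vectors, E = ℝ³ and F = ℝ², elements of E⊗F stored as arrays of components), the helper `rank` computes a matrix rank by exact elimination; it is only an illustration tool, not part of the argument.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def outer(x, y):
    """Components of the rank 1 element formed from x and y."""
    return [[a * b for b in y] for a in x]

def add(s, t):
    """Sum of two arrays of the same shape."""
    return [[a + b for a, b in zip(rs, rt)] for rs, rt in zip(s, t)]

x1, x2 = [1, 0, 0], [0, 1, 0]

# Both families linearly independent: the sum of 2 terms has rank 2.
t_indep = add(outer(x1, [1, 1]), outer(x2, [1, -1]))

# Here y2 = 2 * y1: the same sum can be rewritten with a single term.
t_dep = add(outer(x1, [1, 1]), outer(x2, [2, 2]))
shorter = outer([1, 2, 0], [1, 1])   # (x1 + 2 * x2) with y1
```

In the dependent case, `t_dep` and `shorter` have the same components, so that element is a sum of fewer rank 1 elements than first written.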

Note that the proof of this equality between expressions in E⊗F is carried out in the classical concept of E⊗F (universal algebraic : as the quotient of the set of formal combinations of elements of E×F by the relations in each of E and F) rather than as a dual to E'⊠F'. From this we can deduce that the map from the classical E⊗F to E⊠F is injective, and therefore both definitions of E⊗F coincide. We shall identify E⊗F with its image in E⊠F.

Now let us directly define the rank of an element of E⊠F as the common dimension of its images in E and in F (which thus cannot exceed the smaller of the dimensions of E and F); it is also the dimension of the dual vector spaces it defines in the role of a scalar product between (the quotiented) E' and F'.
An element t of E⊠F belongs to E⊗F if and only if its rank is finite. In this case, its two images A in E and B in F are in duality to each other by defining for every x ∈A and y∈B, their scalar product by x.y'=y.x' (= x'ty') for any elements y'∈F' and x'∈E' such that ty'=y and x't=x.
The choice of a basis x₁,...,x_n of A and its dual basis y₁,...,y_n of B gives a decomposition of t as t = x₁⊗y₁+...+x_n⊗y_n.

Thus E⊗F = E⊠F when E or F is finite dimensional, but generally not otherwise. Anyway both E⊗F and E⊠F are duals to E'⊗F' .

The identity element, the trace

Consider the case of F=E'. Then we have the natural element I in E⊠E' (also usually denoted δ (small delta) and named "Kronecker symbol") that gives the scalar product itself (as a map from E'×E to ℝ) and behaves as the identity from E to itself, and the identity from E' to itself. Seen as a map from E'⊗E to ℝ, this element is also called the trace. Its rank is equal to the common dimension of E and E', thus it belongs to E⊗E' only if this dimension is finite.

Now if E has finite dimension then we have a basis of n vectors e₁ ... e_n, and its dual basis e'₁...e'_n, by which the element I can be written as the image in E⊠E' of the element e₁⊗e'₁+...+e_n⊗e'_n of E⊗E'.
Let us apply the trace function to this element I of E⊗E' itself. This transforms each tensor product into a scalar product, thus giving (e₁.e'₁)+...+(e_n.e'_n)= 1+...+1 = n. This cannot be done in an infinite dimensional space, as the result would be infinite. So we have this rule : for every vector space, its dimension equals the trace of the identity in this space.
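
As a check of this rule in a small case, the sketch below assumes E = ℝ² with a hypothetical non-orthogonal basis and its dual basis, builds I as the sum of the e_k⊗e'_k, then applies the trace by turning each tensor product into a scalar product.

```python
def dot(u, v):
    """Scalar product of two lists of components."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical basis e_1, e_2 of E = R^2, and its dual basis e'_1, e'_2,
# chosen so that e'_k . e_l = 1 if k == l, else 0.
basis = [[1, 0], [1, 1]]
dual_basis = [[1, -1], [0, 1]]
n = len(basis)

# Components of I = sum over k of the tensor product of e_k and e'_k.
I = [[sum(basis[k][i] * dual_basis[k][j] for k in range(n)) for j in range(n)]
     for i in range(n)]

# The trace turns each tensor product into a scalar product.
trace_I = sum(dot(basis[k], dual_basis[k]) for k in range(n))
```

The array `I` comes out as the identity matrix whatever basis is chosen, and `trace_I` is the dimension n = 2.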

Meaning of any tensor expression without infinite-dimensional loop

Tensor expressions have the following properties, that we can verify in previous cases (without loop), and that will be postulated as general rules:

They are multilinear with respect to each symbol (distributive over addition, and scalar factors can be put outside):

If a tensor symbol x satisfies x=y+z then for each occurrence of x in a tensor expression the result is the sum of those obtained by replacing this occurrence of x with those of y and z.
For any scalar a, replacing an occurrence of x by ax multiplies the result by a; isolated components of the graph mean such scalar multiplication.

Any subgraph of a graph distinguished by taking a subset of the set of vertices, can be replaced by a single symbol equal to the monomial expression defined by that subgraph. Thus when a symbol equals a linear combination of graphs, the whole equals the same linear combination where one occurrence of this symbol is replaced by each graph in the combination.
Any edge can be replaced by a "long edge" through the identity symbol.

We saw above (with the trace of the identity) that it is not generally possible to make sense of a tensor expression containing an infinite-dimensional loop, that is a loop (path in the graph that comes back to itself) where all edges are labelled with infinite-dimensional spaces, and vertices have infinite rank. (At least in an algebraic manner ; let us not mention constructions where they could be defined by infinite sums that converge to a limit, that could be used to generalize the concept).
But let us show that it makes sense in all other cases, that is

Any tensor expression using only finite-dimensional spaces
More generally, expressions that may contain infinite-dimensional spaces but that do not form any infinite-dimensional loop (every loop in the graph goes through some finite-dimensional space or some tensor of finite rank).

For this, choose a way to "cut the graph" at least once at some finite-dimensional part of every loop : either

at a vertex representing a tensor of finite rank, or
at an edge labelled by a finite-dimensional space, replaced by a "long edge" through the identity element that has finite rank.

As each of these vertices is replaced by a linear combination of disconnected graphs, this produces a big linear combination of graphs where all loops are cut (this combination does not depend on the order between vertices to which the decomposition is applied, thanks to the commutativity and associativity of addition). This gives a well-defined result according to the above constructions.

This result does not depend on the choice of decomposition. Indeed if you have 2 decompositions applied to edges (or at least that does not apply 2 different decompositions to the same vertex), then let us consider making both decompositions together (if an edge is decomposed two ways, let us see it as a long edge through 2 copies of I, and apply the decomposition to a different copy of I in both cases). Then we can verify that the result of the double decomposition equals that of each of both decompositions.

Another way of seeing it, is to consider that an element of E⊠F with finite rank is identified to an element of E⊗F, which is in duality with E'⊠F'. The use of this duality gives meaning to the expression.

This formalism provides the computation : dim(E⊗F)= (dim E) (dim F)
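
One way to read this computation (an interpretation, with hypothetical dimensions 2 and 3): the identity of E⊗F has components δ_ik δ_jl, and its trace is a graph made of two disconnected loops, one labelled by E and one by F, whose results are multiplied.

```python
dim_E, dim_F = 2, 3   # hypothetical finite dimensions

def delta(a, b):
    """Kronecker symbol: components of the identity element."""
    return 1 if a == b else 0

def identity_EF(i, j, k, l):
    """Components of the identity of E⊗F, with (i, j) and (k, l) as index pairs."""
    return delta(i, k) * delta(j, l)

# Trace of the identity of E⊗F: close both edges of the 4-index tensor.
trace_EF = sum(identity_EF(i, j, i, j)
               for i in range(dim_E) for j in range(dim_F))

# The same graph seen as two separate loops, computed independently.
trace_E = sum(delta(i, i) for i in range(dim_E))
trace_F = sum(delta(j, j) for j in range(dim_F))
```

Here `trace_EF` equals `trace_E * trace_F`, that is dim(E⊗F) = (dim E)(dim F) in this example.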

How transformation formulas are obtained

The expression of how components of tensors are transformed during changes of coordinates (or transformations of the composing spaces), can be naturally obtained by the following tools:
Any family of n vectors in a space E can be formalized as an element of ℝⁿ⊗E. It is a basis when it is invertible. In this case, its inverse, which belongs to ℝⁿ⊗E', defines the dual basis.
In many cases, the trick to avoid any risk of mistake is to introduce vector spaces with special names to label edges in the tensor expressions, such as "ℝⁿ as a representation of E in this basis", thus distinct from its dual "ℝⁿ as a representation of E' in the dual basis".
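
As a minimal sketch of these tools, assume E = ℝ² with a hypothetical basis stored as the columns of a matrix B (the element of ℝⁿ⊗E). Its inverse gives the dual basis as rows. Components of a vector are obtained through the inverse, components of a linear form through B itself, and their scalar product is unchanged. The helper `inverse_2x2` is specific to this 2-dimensional illustration.

```python
from fractions import Fraction

# Hypothetical basis of E = R^2: the columns of B are the vectors e_1, e_2.
B = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(1)]]

def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix by the determinant formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

B_inv = inverse_2x2(B)   # the rows of B_inv are the dual basis forms e'_1, e'_2

v = [Fraction(3), Fraction(2)]   # a vector of E
w = [Fraction(1), Fraction(4)]   # a linear form on E

# Components of v in the basis: apply the dual basis forms to v.
v_components = [sum(B_inv[k][i] * v[i] for i in range(2)) for k in range(2)]
# Components of w in the dual basis: apply w to the basis vectors.
w_components = [sum(w[i] * B[i][k] for i in range(2)) for k in range(2)]

pairing_original = sum(w[i] * v[i] for i in range(2))
pairing_components = sum(w_components[k] * v_components[k] for k in range(2))
```

The vector components transform with the inverse of B while the form components transform with B, which is why the two pairings agree.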

(to be continued)

Interesting papers

Kindergarten Quantum Mechanics is an application of the Penrose diagrams notation for tensors, to the case of quantum mechanics.