What are tensors?

Why index notation?

In the previous chapter, we wrote that vectors are basically first-order tensors. But vectors are also a notion used in linear algebra, where we may have an expression such as

$$
\underline{\boldsymbol{v}}^{\mathrm{T}} \underline{\boldsymbol{A}}\, \underline{\boldsymbol{v}}
$$

where $\underline{\boldsymbol{v}}$ is a (column) vector and $\underline{\boldsymbol{A}}$ is a matrix. But why do we need to transpose the first $\underline{\boldsymbol{v}}$? To make the dimensions match. But $\underline{\boldsymbol{v}}$ only has one dimension, so why is this required?
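To make this concrete, here is a minimal numerical sketch (using numpy, which is an assumption for illustration and not part of the text): a 1-D array has no notion of row versus column, so the quadratic form needs no transpose at all, while the column-matrix formulation forces us to add one purely to make the shapes match.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
v = np.array([1.0, 2.0, 3.0])

# With a 1-D array there is no row/column distinction: no transpose needed.
q1 = v @ A @ v

# If we insist on a column matrix of shape (3, 1), the transpose reappears,
# purely to make the matrix dimensions match.
v_col = v.reshape(3, 1)
q2 = (v_col.T @ A @ v_col)[0, 0]

print(q1, q2)  # the same scalar both ways
```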

Let's look at another example involving differentiation: how do we evaluate

$$
\nabla(\underline{\boldsymbol{u}}\cdot\underline{\boldsymbol{v}})
$$

by using the product rule, where $\nabla$ is the gradient operator? This operation is well known for scalar problems, i.e.,

$$
\nabla f(x_1,x_2,x_3) = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\right]^{\mathrm{T}}
$$

But if we want to expand the expression, we need to write something like

$$
\nabla(\underline{\boldsymbol{u}}\cdot\underline{\boldsymbol{v}}) = \underline{\boldsymbol{v}}^{\mathrm{T}} \nabla\underline{\boldsymbol{u}} + \underline{\boldsymbol{u}}^{\mathrm{T}} \nabla\underline{\boldsymbol{v}}
$$

But what would taking the gradient of a vector mean? In vector calculus, we would need to define this; it is sometimes called the Jacobian matrix of the vector field, defined as

$$
\nabla\underline{\boldsymbol{u}} = \begin{bmatrix} \frac{\partial u_1}{\partial x_1} & \frac{\partial u_1}{\partial x_2} & \frac{\partial u_1}{\partial x_3} \\ \frac{\partial u_2}{\partial x_1} & \frac{\partial u_2}{\partial x_2} & \frac{\partial u_2}{\partial x_3} \\ \frac{\partial u_3}{\partial x_1} & \frac{\partial u_3}{\partial x_2} & \frac{\partial u_3}{\partial x_3} \end{bmatrix}
$$
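As a sanity check of this matrix form of the identity, here is a small numerical sketch (numpy, the specific fields $\underline{\boldsymbol{u}}(\underline{\boldsymbol{x}})$ and $\underline{\boldsymbol{v}}(\underline{\boldsymbol{x}})$, and the finite-difference helpers are all assumptions for illustration); the right-hand side is written with transposed Jacobians so that both sides come out as plain 1-D arrays.

```python
import numpy as np

# Two example vector fields u(x) and v(x) in 3-D, chosen arbitrarily.
def u(x):
    return np.array([x[0] * x[1], np.sin(x[2]), x[0] + x[2] ** 2])

def v(x):
    return np.array([x[2], x[0] * x[0], np.cos(x[1])])

def jacobian(f, x, h=1e-6):
    """Central-difference approximation of the Jacobian, J[i, j] = df_i/dx_j."""
    J = np.zeros((3, 3))
    for j in range(3):
        dx = np.zeros(3)
        dx[j] = h
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * h)
    return J

def grad_scalar(f, x, h=1e-6):
    """Central-difference gradient of a scalar field."""
    g = np.zeros(3)
    for j in range(3):
        dx = np.zeros(3)
        dx[j] = h
        g[j] = (f(x + dx) - f(x - dx)) / (2 * h)
    return g

x = np.array([0.3, -1.2, 0.7])

lhs = grad_scalar(lambda y: u(y) @ v(y), x)                     # grad of the dot product
rhs = jacobian(u, x).T @ v(x) + jacobian(v, x).T @ u(x)         # product-rule expansion

print(np.allclose(lhs, rhs, atol=1e-5))  # True, up to finite-difference error
```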

So when doing this with vectors and matrices, two important things happened:

  1. The order changed to perform the correct contractions

  2. We transposed vectors to make dimensions match

If we instead do this in index notation, where we, "once and for all", use the definition $\nabla_i = \frac{\partial}{\partial x_i}$, we have that

$$
\begin{aligned} \nabla_i (u_j v_j) &= (\nabla_i u_j)\, v_j + u_j\, (\nabla_i v_j)\\ \frac{\partial (u_j v_j)}{\partial x_i} &= \frac{\partial u_j}{\partial x_i} v_j + u_j \frac{\partial v_j}{\partial x_i} \end{aligned}
$$

Here we see that we do not need to care about the order of the terms. We still have to ensure that the differentiation operates on the correct symbol, but that is required when using the product rule on scalars as well. Secondly, we do not need any transposition.

Similarly, consider the first example: if we interpret the matrix $\underline{\boldsymbol{A}}$ as a second-order tensor $\boldsymbol{A}$ with components $A_{ij}$, we can write $v_i A_{ij} v_j = v_i v_j A_{ij} = A_{ij} v_i v_j$. So we do not need to care about transposition, and we can move the terms around as we like. This freedom provides a major simplification when working with larger expressions, especially for the differentiation we just discussed.
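This freedom to reorder factors is easy to verify numerically. A minimal sketch using numpy's einsum (numpy is an assumption here, not something the text relies on):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
v = rng.normal(size=3)

# v_i A_ij v_j, written with the factors in three different orders.
s1 = np.einsum('i,ij,j->', v, A, v)
s2 = np.einsum('i,j,ij->', v, v, A)
s3 = np.einsum('ij,i,j->', A, v, v)

# Matrix/vector version for comparison: here order and transposition matter.
s4 = v @ A @ v

print(np.isclose(s1, s2), np.isclose(s2, s3), np.isclose(s3, s4))  # True True True
```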

Tensors

We have introduced index notation and vector algebra, and seen why index notation is convenient for vector and matrix calculations. So why do we actually need tensors, and how do they differ from regular vectors and matrices combined with index notation?

For vectors, the answer is that nothing is different. Vectors are first-order tensors, and the index notation is just a way of representing them in a given coordinate system. We previously showed a vector $\underline{\boldsymbol{v}}$ in an orthonormal basis system $\underline{\boldsymbol{e}}_i$, described by its components $v_i$, such that $\underline{\boldsymbol{v}}=v_i\underline{\boldsymbol{e}}_i$. In index notation, we saw that we could have objects of higher order, such as $a_{ij}$. A straightforward extension to second-order tensors would then be $\boldsymbol{a}=a_{ij}\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{e}}_j$. And this is exactly how a second-order tensor is defined! But OK, perhaps not that straightforward: what does $\otimes$ imply here?

We call $\otimes$ the open product. When we do a contraction, such as the dot (or scalar) product $\cdot$, the order of the tensor reduces: the dot product between two vectors is a scalar (a tensor of order 0). The open product between two vectors is a second-order tensor, so the open product increases the order. We could, for example, take the open product between two vectors in different basis systems: $\underline{\boldsymbol{u}}=u_i\underline{\boldsymbol{e}}_i$ and $\underline{\boldsymbol{v}}=v_i\underline{\boldsymbol{E}}_i$. This becomes $\underline{\boldsymbol{u}}\otimes\underline{\boldsymbol{v}}=u_i v_j\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{E}}_j$. The order in this product matters, e.g. $u_i v_j\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{E}}_j=\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{E}}_j\, u_i v_j = v_j u_i\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{E}}_j \neq v_j u_i\,\underline{\boldsymbol{E}}_j\otimes\underline{\boldsymbol{e}}_i$. That is, we can move the coefficients with indices around, but we cannot change the order of the base vectors. In fact, the latter corresponds to taking the transpose of the second-order tensor. Even if $\underline{\boldsymbol{e}}_i=\underline{\boldsymbol{E}}_i$, the order in the open product must remain. So we see that in our operations, the coefficients (e.g. $u_i$ and $v_j$) can be separated out from the open product, but the basis vectors must remain in the correct order.
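These statements about the open product can be checked componentwise. A minimal numpy sketch (an assumption for illustration): the coefficients commute freely, but swapping the two "legs" of the open product transposes the resulting component array.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

uv = np.einsum('i,j->ij', u, v)  # (u (x) v)_ij = u_i v_j
vu = np.einsum('i,j->ij', v, u)  # (v (x) u)_ij = v_i u_j

print(np.allclose(uv, vu))    # False: the order of the open product matters
print(np.allclose(uv, vu.T))  # True: swapping the order is a transpose
```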

If we take the dot product between a second-order tensor $\boldsymbol{a}=a_{ij}\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{e}}_j$ and a vector $\underline{\boldsymbol{v}} = v_k\,\underline{\boldsymbol{e}}_k$, we get

$$
\boldsymbol{a}\cdot\underline{\boldsymbol{v}} = \left[a_{ij}\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{e}}_j\right] \cdot \left[v_k\,\underline{\boldsymbol{e}}_k \right] = a_{ij} v_k\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{e}}_j \cdot \underline{\boldsymbol{e}}_k = a_{ij} v_k\,\underline{\boldsymbol{e}}_i\, \delta_{jk} = a_{ij} v_j\,\underline{\boldsymbol{e}}_i
$$

So we need to maintain the order of operations, but since $\underline{\boldsymbol{e}}_j \cdot \underline{\boldsymbol{e}}_k = \delta_{jk}$, this simplifies the expression (because we work in an orthonormal coordinate system).
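In components, this contraction is just an ordinary matrix-vector product. A small numpy check (assumed for illustration, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3))
v = rng.normal(size=3)

w1 = np.einsum('ij,j->i', a, v)  # a_ij v_j, written in index notation
w2 = a @ v                       # the same contraction as a matrix-vector product

print(np.allclose(w1, w2))  # True
```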

The expression $a_{ij} v_j$ is just a simple summation in index notation. As long as we are aware of the basis behind the scenes, we often skip writing it out, because we assume that we work in the same orthonormal basis system. With the open product defined, we can define tensors of arbitrary order, such as a fourth-order tensor $\textbf{\textsf{B}}=\textsf{B}_{ijkl}\,\underline{\boldsymbol{e}}_i\otimes\underline{\boldsymbol{e}}_j\otimes\underline{\boldsymbol{e}}_k\otimes\underline{\boldsymbol{e}}_l$.
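As a sketch of what "skipping the basis" means in practice (numpy assumed, component values chosen arbitrarily), the components of a second- or fourth-order tensor in the standard orthonormal basis can be stored as arrays with two or four indices, and rebuilding the tensor from open products of the basis vectors just reproduces those arrays:

```python
import numpy as np

e = np.eye(3)  # orthonormal basis: e[i] is the basis vector e_i

# Arbitrary components a_ij and B_ijkl, chosen only for illustration.
a_ij = np.arange(9.0).reshape(3, 3)
B_ijkl = np.arange(81.0).reshape(3, 3, 3, 3)

# a = a_ij e_i (x) e_j, with the open product realized as an outer product.
a = sum(a_ij[i, j] * np.outer(e[i], e[j]) for i in range(3) for j in range(3))

# B = B_ijkl e_i (x) e_j (x) e_k (x) e_l, written with einsum.
B = np.einsum('ijkl,ia,jb,kc,ld->abcd', B_ijkl, e, e, e, e)

print(np.allclose(a, a_ij), np.allclose(B, B_ijkl))  # True True
```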

This was just a brief motivation for why tensors are easier to work with than vectors/matrices in the linear-algebra sense: tensors are connected to a basis system in a proper way (matrices are not). Still, we can represent the components of first- and second-order tensors by regular vectors and matrices. This distinction can be hard to grasp in the beginning, but it will become clearer with time.