5  Index notation

5.1 Scalar products

For simplifying vector equations, there is a powerful method called index (or suffix) notation.

A vector equation such as \[ \cb = \ab + \bb \] is written in index notation as \[ c_i = a_i + b_i, \] where it is understood that the equation holds for \(i=1,\ldots,n\), meaning an arbitrary component in the standard (Cartesian) basis \(\{\eb_1,\ldots,\eb_n\}\).

In future courses you may encounter equations for tensors with more than one index, such as Einstein’s field equation from General Relativity, \[ G_{ij} + \Lambda g_{ij} = \kappa T_{ij}. \] [The expression on the left represents the curvature of spacetime, and that on the right the stress-energy-momentum content of spacetime.]

An index that appears once in each term is called a free index. The choice of letter is arbitrary, but all terms in an equation must have the same free index (or indices).

In index notation, we write the scalar product as \[ \ab\cdot\bb = a_jb_j, \] where the repeated index indicates that the term should be summed from \(j=1\) to \(n\). This implied summation is called the (Einstein) summation convention.

An index that appears exactly twice in a term is called a dummy index, and always implies a summation over that index. Again, the choice of letter is arbitrary.

Note that \(\ab\cdot\bb\) is a scalar, so there is no free index when we write it in index notation.

We write \(\ab\cdot\bb = a_jb_j\) but we have to choose a different letter for the index in \(\cb\cdot\db\), for example \(\cb\cdot\db=c_kd_k\). So \[ (\ab\cdot\bb)(\cb\cdot\db) = a_jb_jc_kd_k. \] Writing it out in full, this means \[ a_jb_jc_kd_k = \left(\sum_{j=1}^na_jb_j\right)\left(\sum_{k=1}^nc_kd_k\right) = (a_1b_1 + a_2b_2 + \ldots + a_nb_n)(c_1d_1 + c_2d_2 + \ldots + c_nd_n). \] We cannot write \(a_jb_jc_jd_j\) as this breaks the rules of index notation: you can’t use an index more than twice in a single term.
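The role of the two dummy indices can be sketched in plain Python, writing each implied summation as an explicit loop. The vectors and the helper name `dot` are illustrative assumptions, and the indices run from 0 rather than 1, as is usual in Python.

```python
# Illustrative vectors in R^3 (arbitrary values chosen for this sketch).
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [7.0, 8.0, 9.0]
d = [1.0, 0.0, -1.0]
n = len(a)

def dot(u, v):
    """Scalar product u_j v_j: sum over the repeated (dummy) index j."""
    return sum(u[j] * v[j] for j in range(len(u)))

# (a.b)(c.d) = a_j b_j c_k d_k: two independent dummy indices j and k.
double_sum = sum(a[j] * b[j] * c[k] * d[k] for j in range(n) for k in range(n))

# The illegal expression a_j b_j c_j d_j, read as one sum over j,
# is a different quantity altogether.
single_sum = sum(a[j] * b[j] * c[j] * d[j] for j in range(n))

print(double_sum, dot(a, b) * dot(c, d), single_sum)
```

The double loop reproduces the product of the two scalar products, whereas the single loop does not, which is why each scalar product needs its own dummy letter.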

The index \(j\) is repeated, so we know that it is a dummy index (summed over). So \[ a_jb_ic_j = \left(\sum_{j=1}^na_jc_j\right)b_i = (\ab\cdot\cb)b_i. \] The index \(i\) is a free index, so we see that \(a_jb_ic_j\) is the \(i\) component of the vector \((\ab\cdot\cb)\bb\). We could write this as \[ [(\ab\cdot\cb)\bb]_i = a_jb_ic_j. \]

In Tip 5.2, we could not write \(a_jb_ic_j = (\ab\cdot\cb)\bb\), because this would equate a single component (a scalar) to a vector, which is meaningless.
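As a small numerical sketch of the difference between free and dummy indices, the following Python builds one number for each value of the free index \(i\), summing over the dummy index \(j\) inside. The vectors are arbitrary illustrative values, and indices start at 0.

```python
# Illustrative vectors in R^3 (arbitrary values chosen for this sketch).
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [7.0, 8.0, 9.0]
n = 3

# a_j b_i c_j: the free index i labels the component,
# the dummy index j is summed over.
lhs = [sum(a[j] * b[i] * c[j] for j in range(n)) for i in range(n)]

# (a.c) b_i: compute the scalar product first, then scale each component of b.
a_dot_c = sum(a[j] * c[j] for j in range(n))
rhs = [a_dot_c * b[i] for i in range(n)]

print(lhs, rhs)
```

The result is a list with one entry per value of \(i\), which reflects that the expression is a component of a vector rather than a scalar.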

Each side is a vector, so in index notation we should have a free index in every term, say \(i\), giving \[ u_i + (\ab\cdot\bb)v_i = |\ab|^2(\bb\cdot\vb)a_i. \] Now we introduce dummy indices for each scalar product, noting that \(|\ab|^2 = \ab\cdot\ab\), so \[ u_i + a_jb_jv_i = a_ja_jb_kv_ka_i. \] In particular, we needed two dummy indices on the right-hand side to avoid ambiguity.

Another way to think of the scalar product \(\ab\cdot\bb\) is as a bilinear form \[ \ab\cdot\bb = (a_1,a_2,a_3)\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix}b_1\\b_2\\b_3\end{pmatrix} = a_1b_1 + a_2b_2 + a_3b_3. \] In index notation, this may be written \[ \ab\cdot\bb = \delta_{ij}a_ib_j, \] where the components of the identity matrix are given by the Kronecker delta, defined as \[ \delta_{ij} = \begin{cases} 1 & \textrm{if $i=j$},\\ 0 & \textrm{if $i\neq j$}. \end{cases} \] This will often be useful when manipulating expressions in index notation, thanks to the following property.

Proposition 5.1 Multiplying an expression with free index \(i\) by \(\delta_{ij}\) is equivalent to replacing \(i\to j\).

Proof. Consider some expression \(a_i\) with free index \(i\) (which could contain additional dummy or even free indices). Then if we multiply by \(\delta_{ij}\), our index \(i\) becomes a dummy index, and \[ a_i\delta_{ij} = a_1\delta_{1j} + a_2\delta_{2j} + \ldots + a_n\delta_{nj}. \] Now \(j\) is the free index. If \(j=1\), then \[ a_i\delta_{i1} = a_1\delta_{11} + a_2\delta_{21} + \ldots + a_n\delta_{n1} = a_1(1) + a_2(0) + \ldots + a_n(0) = a_1. \] Similarly for any \(j\), we have \(a_i\delta_{ij} = a_j\).


Here \(\delta_{ij}\) has free indices \(i\), \(j\). So multiplying by \(\delta_{jk}\) is equivalent to replacing \(j\to k\) in \(\delta_{ij}\), giving \[ \delta_{ij}\delta_{jk} = \delta_{ik}. \] Note that both sides have the same free indices \(i\) and \(k\).

Again \(\delta_{ij}\) has free indices \(i\), \(j\). Since they both appear in \(\delta_{ji}\), multiplying by this can either replace \(j\to i\) or \(i\to j\) in \(\delta_{ij}\). The first gives \[ \delta_{ij}\delta_{ji} = \delta_{ii} = \delta_{11} + \delta_{22} + \ldots + \delta_{nn} = n, \] while the second gives the same answer, \[ \delta_{ij}\delta_{ji} = \delta_{jj} = n. \]
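The substitution property and the two contractions above can be checked by brute force. This is only a sketch for \(n=3\) with an arbitrary illustrative vector; the function name `delta` is an assumption and the indices start at 0.

```python
n = 3

def delta(i, j):
    """Kronecker delta: 1 if the indices agree, 0 otherwise."""
    return 1 if i == j else 0

a = [2.0, -1.0, 5.0]  # arbitrary illustrative components

# Proposition 5.1: a_i delta_ij = a_j (i is summed, j is free).
substituted = [sum(a[i] * delta(i, j) for i in range(n)) for j in range(n)]

# delta_ij delta_jk = delta_ik for every choice of the free indices i, k.
contraction_ok = all(
    sum(delta(i, j) * delta(j, k) for j in range(n)) == delta(i, k)
    for i in range(n)
    for k in range(n)
)

# delta_ij delta_ji = delta_ii = n (both indices are summed).
trace = sum(delta(i, j) * delta(j, i) for i in range(n) for j in range(n))

print(substituted, contraction_ok, trace)
```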

5.2 Vector products

In \(\Real^3\), we can write vector products in index notation by introducing the alternating tensor (or Levi-Civita symbol), \[ \epsilon_{ijk} = \begin{cases} 0 & \textrm{if any of $i,j,k$ are equal},\\ +1 & \textrm{if $(i,j,k)=(1,2,3), (2,3,1)$ or $(3,1,2)$},\\ -1 & \textrm{if $(i,j,k)=(1,3,2), (2,1,3)$ or $(3,2,1)$}. \end{cases} \] This object has 27 components, but only 6 of them are non-zero. You can check that \[ \epsilon_{ijk} = \epsilon_{jki} = \epsilon_{kij}, \] and that \[ \epsilon_{ijk} = -\epsilon_{jik}. \] This allows us to express the vector product in index notation as \[ [\ab\times\bb]_i = \epsilon_{ijk}a_jb_k. \] For example, the first component of the right-hand side is \[\begin{align*} \epsilon_{1jk}a_jb_k &= \cancel{\epsilon_{11k}}a_1b_k + \epsilon_{12k}a_2b_k + \epsilon_{13k}a_3b_k\\ &= \cancel{\epsilon_{121}}a_2b_1 + \cancel{\epsilon_{122}}a_2b_2 + \epsilon_{123}a_2b_3 + \cancel{\epsilon_{131}}a_3b_1 + \epsilon_{132}a_3b_2 + \cancel{\epsilon_{133}}a_3b_3\\ &= \epsilon_{123}a_2b_3 + \epsilon_{132}a_3b_2\\ &= a_2b_3 - a_3b_2. \end{align*}\] This agrees with the \(\eb_1\) component of \(\ab\times\bb\).
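A minimal sketch of the alternating tensor and the index formula for the vector product is given below. The closed-form expression used for `epsilon` is one convenient choice (an assumption of this sketch, not taken from the notes); it depends only on differences of the indices, so it works equally for indices 0, 1, 2 (used here) or 1, 2, 3.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices drawn from three consecutive integers.

    The product (i-j)(j-k)(k-i) is 0 if any two indices coincide,
    +2 for cyclic orderings and -2 for anticyclic orderings.
    """
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """[a x b]_i = epsilon_ijk a_j b_k, summing over the dummy indices j and k."""
    return [
        sum(epsilon(i, j, k) * a[j] * b[k] for j in range(3) for k in range(3))
        for i in range(3)
    ]

# Count the non-zero components among all 27.
nonzero = sum(
    1 for i in range(3) for j in range(3) for k in range(3) if epsilon(i, j, k) != 0
)

a = [1, 2, 3]  # arbitrary illustrative vectors
b = [4, 5, 6]
print(nonzero, cross(a, b))
```

For these two vectors the familiar determinant formula gives \((2\cdot 6-3\cdot 5,\ 3\cdot 4-1\cdot 6,\ 1\cdot 5-2\cdot 4)=(-3,6,-3)\), in agreement with the index expression.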

Note that this scalar triple product is a scalar, so there should be no free index.

We have \[\begin{align*} \ab\cdot(\bb\times\cb) &= a_i[\bb\times\cb]_i\\ &= a_i\epsilon_{ijk}b_jc_k\\ &= \epsilon_{ijk}a_ib_jc_k \quad \textrm{[order doesn't matter for scalars]}\\ &= \epsilon_{kij}a_ib_jc_k\\ &= c_k\epsilon_{kij}a_ib_j\\ &= c_i\epsilon_{ijk}a_jb_k \quad \textrm{[relabelling the dummy indices $(i,j,k)\to(j,k,i)$]}\\ &= c_i[\ab\times\bb]_i\\ &= \cb\cdot(\ab\times\bb). \end{align*}\] The final relabelling wasn’t strictly necessary, just aesthetically pleasing!
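The cyclic property can also be checked numerically. The sketch below evaluates \(\epsilon_{ijk}a_ib_jc_k\) as a triple loop; the helper names and the integer vectors are illustrative assumptions, and indices start at 0.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def triple(a, b, c):
    """Scalar triple product a.(b x c) = epsilon_ijk a_i b_j c_k (no free index)."""
    return sum(
        epsilon(i, j, k) * a[i] * b[j] * c[k]
        for i in range(3)
        for j in range(3)
        for k in range(3)
    )

a = [1, 2, 3]   # arbitrary illustrative vectors
b = [4, 5, 6]
c = [7, 8, 10]

print(triple(a, b, c), triple(c, a, b), triple(b, a, c))
```

A cyclic permutation of the three vectors leaves the value unchanged, while swapping two of them flips the sign, mirroring \(\epsilon_{ijk}=\epsilon_{kij}=-\epsilon_{jik}\).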

When simplifying more complicated expressions, there is a useful identity for the product of two Levi-Civita symbols that share one index in common.

Proposition 5.2 (The \(\epsilon\)-\(\delta\) identity.) \[ \epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}. \]

Proof. The left hand side is \[ \epsilon_{ijk}\epsilon_{klm} = \begin{cases} +1 & \textrm{if $(klm)$ is an even permutation of $(ijk)$},\\ -1 & \textrm{if $(klm)$ is an odd permutation of $(ijk)$},\\ 0 & \textrm{if $(klm)$ is not a permutation of $(ijk)$}. \end{cases} \] So there are six possibilities: \[ \epsilon_{ijk}\epsilon_{klm} = \delta_{ik}\delta_{jl}\delta_{km} - \delta_{ik}\delta_{jm}\delta_{kl} + \delta_{il}\delta_{jm}\delta_{kk} - \delta_{il}\delta_{jk}\delta_{km} + \delta_{im}\delta_{jk}\delta_{kl} - \delta_{im}\delta_{jl}\delta_{kk}. \] Using \(\delta_{kk}=3\) and Proposition 5.1, we find \[\begin{align*} \epsilon_{ijk}\epsilon_{klm} &= \delta_{im}\delta_{jl} - \delta_{il}\delta_{jm} + 3\delta_{il}\delta_{jm} - \delta_{il}\delta_{jm} + \delta_{im}\delta_{jl} - 3\delta_{im}\delta_{jl}\\ &= \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}. \end{align*}\]
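Since each free index takes only three values, the identity can also be confirmed by exhaustively checking all \(3^4=81\) choices of \((i,j,l,m)\). This is a sanity check rather than a substitute for the proof; the helper names are assumptions and indices start at 0.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

mismatches = []
for i in range(3):
    for j in range(3):
        for l in range(3):
            for m in range(3):
                # Left-hand side: k is the shared dummy index, summed over.
                lhs = sum(epsilon(i, j, k) * epsilon(k, l, m) for k in range(3))
                rhs = delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
                if lhs != rhs:
                    mismatches.append((i, j, l, m))

print(len(mismatches))
```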


The \(i\) component of the left-hand side is \[\begin{align*} [\ab\times(\bb\times\cb)]_i &= \epsilon_{ijk}a_j[\bb\times\cb]_k\\ &= \epsilon_{ijk}a_j\epsilon_{klm}b_lc_m\\ &= \epsilon_{ijk}\epsilon_{klm}a_jb_lc_m\\ &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})a_jb_lc_m \quad \textrm{[using Proposition 5.2]}\\ &= \delta_{il}\delta_{jm}a_jb_lc_m - \delta_{im}\delta_{jl}a_jb_lc_m\\ &= a_j(\delta_{il}b_l)(\delta_{jm}c_m) - a_j(\delta_{jl}b_l)(\delta_{im}c_m)\\ &= a_jb_ic_j - a_jb_jc_i \quad \textrm{[using Proposition 5.1]}\\ &= (a_jc_j)b_i - (a_jb_j)c_i\\ &= (\ab\cdot\cb)b_i - (\ab\cdot\bb)c_i. \end{align*}\]
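The result of this calculation, \(\ab\times(\bb\times\cb) = (\ab\cdot\cb)\bb - (\ab\cdot\bb)\cb\), can be spot-checked numerically by evaluating the index expression directly. The integer vectors and helper names below are illustrative assumptions; indices start at 0.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

a = [1, 2, 3]   # arbitrary illustrative vectors
b = [4, 5, 6]
c = [7, 8, 10]
R = range(3)

# [a x (b x c)]_i = epsilon_ijk a_j epsilon_klm b_l c_m,
# with free index i and dummy indices j, k, l, m.
lhs = [
    sum(
        epsilon(i, j, k) * a[j] * epsilon(k, l, m) * b[l] * c[m]
        for j in R for k in R for l in R for m in R
    )
    for i in R
]

# (a.c) b_i - (a.b) c_i
a_dot_c = sum(a[j] * c[j] for j in R)
a_dot_b = sum(a[j] * b[j] for j in R)
rhs = [a_dot_c * b[i] - a_dot_b * c[i] for i in R]

print(lhs, rhs)
```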

5.3 Derivatives

Recalling that index notation refers to the Cartesian components, we can infer from the Cartesian expressions \[ \grad f = \ddy{f}{x}\eb_1 + \ddy{f}{y}\eb_2 + \ddy{f}{z}\eb_3, \qquad \grad\cdot\fb = \ddy{f_1}{x} + \ddy{f_2}{y} + \ddy{f_3}{z}, \qquad \grad\times\fb=\begin{vmatrix} \eb_1 & \eb_2 & \eb_3\\ \displaystyle\ddy{}{x} & \displaystyle\ddy{}{y} & \displaystyle\ddy{}{z}\\ f_1 & f_2 & f_3 \end{vmatrix} \] that the gradient, divergence and curl are written in index notation as \[ [\grad f]_i = \ddy{f}{x_i}, \qquad \grad\cdot\fb = \ddy{f_j}{x_j}, \qquad [\grad\times\fb]_i = \epsilon_{ijk}\ddy{f_k}{x_j}. \]

This is a scalar expression, so it has no free indices. We have \[\begin{align*} \grad\cdot(\grad\times\fb) &= \ddy{}{x_i}\left(\epsilon_{ijk}\ddy{f_k}{x_j}\right)\\ &= \epsilon_{ijk}\ddy{^2f_k}{x_i\partial x_j}\\ &= \epsilon_{ijk}\ddy{^2f_k}{x_j\partial x_i} \quad \textrm{[assuming the second derivatives are all continuous]}\\ &= \epsilon_{jik}\ddy{^2f_k}{x_i\partial x_j} \quad \textrm{[relabelling $i\leftrightarrow j$]}\\ &= - \epsilon_{ijk}\ddy{^2f_k}{x_i\partial x_j} \quad \textrm{[using antisymmetry $\epsilon_{jik}=-\epsilon_{ijk}$]}. \end{align*}\] Since this expression is equal to minus itself, it must be zero. [This will happen whenever you have a product of an expression that is antisymmetric in two indices, and one which is symmetric in the same two indices.]
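The closing remark about antisymmetric and symmetric factors can be illustrated with a small numerical sketch: contracting \(\epsilon_{ijk}\) with any symmetric array \(S_{ij}\) over \(i\) and \(j\) gives zero for every \(k\). The matrices below are arbitrary illustrative choices, with `S` playing the role of the symmetric second derivatives.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def contract(M):
    """epsilon_ijk M_ij for each value of the free index k."""
    return [
        sum(epsilon(i, j, k) * M[i][j] for i in range(3) for j in range(3))
        for k in range(3)
    ]

# Symmetric array: S[i][j] == S[j][i].
S = [[2, -1, 4],
     [-1, 0, 7],
     [4, 7, 3]]

# A non-symmetric array, for contrast.
M = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]

print(contract(S), contract(M))
```

The symmetric case vanishes identically, whereas the non-symmetric array leaves a non-zero component, so the symmetry of the mixed partial derivatives is the essential ingredient.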

Index notation then gives us a neat(er) way to complete the outstanding proof of product rules (v) and (vi) from Proposition 3.4:

(v) \(\grad\times(\fb\times\gb) \equiv (\gb\cdot\grad)\fb - (\fb\cdot\grad)\gb + \fb(\grad\cdot\gb) - \gb(\grad\cdot\fb)\).

(vi) \(\grad(\fb\cdot\gb) \equiv (\gb\cdot\grad)\fb + (\fb\cdot\grad)\gb + \fb\times(\grad\times\gb) + \gb\times(\grad\times\fb)\).

Proof (of Proposition 3.4 (v)). The \(i\) component of the left-hand side is \[\begin{align*} [\grad\times(\fb\times\gb)]_i &= \epsilon_{ijk}\ddy{}{x_j}(\epsilon_{klm}f_lg_m)\\ &= \epsilon_{ijk}\epsilon_{klm}\ddy{}{x_j}(f_l g_m)\\ &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\ddy{}{x_j}(f_l g_m) \quad \textrm{[by Proposition 5.2]}\\ &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\left(\ddy{f_l}{x_j}g_m + f_l\ddy{g_m}{x_j}\right)\\ &= \delta_{il}\delta_{jm}\left(\ddy{f_l}{x_j}g_m + f_l\ddy{g_m}{x_j}\right) - \delta_{im}\delta_{jl}\left(\ddy{f_l}{x_j}g_m + f_l\ddy{g_m}{x_j}\right)\\ &= \ddy{f_i}{x_j}g_j + f_i\ddy{g_j}{x_j} - \ddy{f_j}{x_j}g_i - f_j\ddy{g_i}{x_j} \quad \textrm{[by Proposition 5.1]}\\ &= \ddy{f_i}{x_j}g_j + f_i(\grad\cdot\gb) - (\grad\cdot\fb)g_i - f_j\ddy{g_i}{x_j}. \end{align*}\] Defining the vector operator \((\gb\cdot\grad)\fb\) as the vector with \(i\) component \(\displaystyle g_j\ddy{f_i}{x_j}\) then gives the result.


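In fact, the operator \((\gb\cdot\grad)\fb\) may be defined like our other differential operators by the coordinate-free expression \[ (\gb\cdot\grad)\fb = \lim_{|V|\to 0}\frac{1}{|V|}\oint_S\fb(\gb\cdot\hat{\nb})\,dS, \] where \(\gb\) is held fixed at its value at the point onto which \(V\) shrinks (if \(\gb\) were allowed to vary over \(S\), the limit would pick up the additional term \(\fb(\grad\cdot\gb)\)). This operator is central to Fluid Mechanics, through the nonlinear term \((\ub\cdot\grad)\ub\) in the Navier-Stokes equation.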

Proof (of Proposition 3.4 (vi)). This time the right-hand side contains “triple products”, so we start with that and simplify. We have \[\begin{align*} [\fb\times(\grad\times\gb)]_i &= \epsilon_{ijk}f_j\epsilon_{klm}\ddy{g_m}{x_l}\\ &= \epsilon_{ijk}\epsilon_{klm}f_j\ddy{g_m}{x_l}\\ &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})f_j\ddy{g_m}{x_l} \quad \textrm{[by Proposition 5.2]}\\ &= \delta_{il}\delta_{jm}f_j\ddy{g_m}{x_l} - \delta_{im}\delta_{jl}f_j\ddy{g_m}{x_l}\\ &= f_j\ddy{g_j}{x_i} - f_j\ddy{g_i}{x_j} \quad \textrm{[by Proposition 5.1]}. \end{align*}\] Swapping the roles of \(\fb\) and \(\gb\) gives the corresponding expression for \([\gb\times(\grad\times\fb)]_i\). So the \(i\) component of the full right-hand side of (vi) is \[\begin{align*} [\textrm{RHS}]_i &= g_j\ddy{f_i}{x_j} + f_j\ddy{g_i}{x_j} + f_j\ddy{g_j}{x_i} - f_j\ddy{g_i}{x_j} + g_j\ddy{f_j}{x_i} - g_j\ddy{f_i}{x_j}\\ &= f_j \ddy{g_j}{x_i} + g_j\ddy{f_j}{x_i}\\ &= \ddy{}{x_i}(f_j g_j) = [\grad(\fb\cdot\gb)]_i. \end{align*}\]


5.4 Second derivatives

For a scalar field \(g:\Real^3\to\Real\) or a vector field \(\fb:\Real^3\to\Real^3\), there are five ways to combine two derivatives to get a second derivative operator: \[ \grad\cdot(\grad g), \quad \grad\times(\grad g), \quad \grad(\grad\cdot\fb), \quad \grad\cdot(\grad\times\fb), \quad \textrm{or} \quad \grad\times(\grad\times\fb). \] However, we already know that \(\grad\times(\grad g)=\bfzero\) and \(\grad\cdot(\grad\times\fb)=0\), so there are only three non-trivial operators:

Second derivatives.

The Laplacian of a scalar field is the operator defined by \[ \nabla^2 g = \grad\cdot(\grad g). \] [Sometimes \(\grad^2g\) is written as \(\Delta g\).] It returns another scalar field.

The Laplacian is the differential operator appearing in the Laplace equation \(\nabla^2 g = 0\) (hence the name), as well as the heat and wave equations. More on these partial differential equations next term!

In Cartesian coordinates, we have \[ \nabla\cdot(\nabla g) = \ddy{}{x_j}\left(\ddy{g}{x_j}\right) = \ddy{^2g}{x^2} + \ddy{^2g}{y^2} + \ddy{^2g}{z^2}. \] So here \[ \nabla^2 g = \ddy{}{x}(3x^2 - 3y^2) + \ddy{}{y}(-6xy) = 6x - 6x = 0. \]
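As a rough numerical cross-check, the Laplacian can be approximated by central differences. The sketch below assumes the scalar field is \(g = x^3 - 3xy^2\), which is inferred from the first derivatives quoted above rather than stated here; the step size `h`, the sample point and the function names are likewise illustrative choices.

```python
def g(x, y, z):
    # Assumed example field, inferred from the quoted first derivatives
    # dg/dx = 3x^2 - 3y^2 and dg/dy = -6xy.
    return x**3 - 3.0 * x * y**2

def laplacian(field, point, h=1e-2):
    """Approximate d^2(field)/dx_j dx_j at `point` by central differences."""
    total = 0.0
    for j in range(3):  # sum over the repeated index j
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        total += (field(*forward) - 2.0 * field(*point) + field(*backward)) / h**2
    return total

value = laplacian(g, (1.5, -0.5, 2.0))
print(value)
```

The printed value should vanish up to rounding error, consistent with \(6x - 6x = 0\). The central-difference formula has no truncation error for cubic polynomials, so only floating-point rounding remains.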

Proposition 5.3 In orthogonal curvilinear coordinates \((u,v,w)\), the Laplacian is given by \[ \nabla^2 g = \frac{1}{h_uh_vh_w}\left[\ddy{}{u}\left(\frac{h_vh_w}{h_u}\ddy{g}{u}\right) + \ddy{}{v}\left(\frac{h_uh_w}{h_v}\ddy{g}{v}\right) + \ddy{}{w}\left(\frac{h_uh_v}{h_w}\ddy{g}{w}\right)\right]. \]

Proof. Simply insert Proposition 3.2 into Proposition 3.5 (i).


Functions like this that solve \(\nabla^2 h=0\) are called harmonic functions.

If you study Complex Analysis II, you will know that both the real and imaginary parts of a differentiable complex function \(f(x+iy)=u(x,y) + iv(x,y)\) satisfy the Cauchy-Riemann equations \[ \ddy{u}{x} = \ddy{v}{y}, \quad \ddy{u}{y}=-\ddy{v}{x}. \] By differentiating you can show that \(\nabla^2 u = \nabla^2 v = 0\).

Next, consider the operator \(\grad(\grad\cdot\fb)\). Its \(i\) component is simply \[ [\grad(\grad\cdot\fb)]_i = \ddy{}{x_i}\left(\ddy{f_j}{x_j}\right). \]

Finally, consider the operator \(\grad\times(\grad\times\fb)\). Its \(i\) component is \[\begin{align*} [\grad\times(\grad\times\fb)]_i &= \epsilon_{ijk}\ddy{}{x_j}\left(\epsilon_{klm}\ddy{f_m}{x_l}\right)\\ &= \epsilon_{ijk}\epsilon_{klm}\ddy{}{x_j}\left(\ddy{f_m}{x_l}\right)\\ &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\ddy{^2 f_m}{x_j\partial x_l} \quad \textrm{[by Proposition 5.2]}\\ &= \ddy{^2f_j}{x_j\partial x_i} - \ddy{^2 f_i}{x_j\partial x_j}\\ &= \ddy{^2f_j}{x_i\partial x_j} - \ddy{^2 f_i}{x_j\partial x_j} \quad \textrm{[assuming continuous second derivatives]}\\ &= \ddy{}{x_i}\left(\ddy{f_j}{x_j}\right) - \ddy{^2 f_i}{x_j\partial x_j}\\ &= [\grad(\grad\cdot\fb)]_i - \grad^2 f_i. \end{align*}\]

If we define the vector Laplacian as the vector with Cartesian components \[ [\grad^2\fb]_i = \grad^2f_i, \] then we can write \[ \grad\times(\grad\times\fb) = \grad(\grad\cdot\fb) - \grad^2\fb. \]

Physically, \(\grad\times(\grad\times\fb) \neq \bfzero\) means that \(\grad\times\fb\) has a different value at different points in space.

For the velocity field \(\vb\) of a fluid – such as the wind velocity in the Earth’s atmosphere – the curl \(\grad\times\vb\) represents the local rotational flow (called vorticity in Fluid Mechanics). Think of a hurricane, or indeed just an Atlantic depression. Then \(\grad\times(\grad\times\vb)\) tells you how much this rotation varies from one place to another.

Firstly, compute the divergence and curl: \[\begin{align*} &\grad\cdot\fb = 1 + x\\ &\grad\times\fb = \begin{vmatrix} \eb_1 & \eb_2 & \eb_3\\ \displaystyle\ddy{}{x} & \displaystyle\ddy{}{y} & \displaystyle\ddy{}{z}\\ x & xy & y^3 \end{vmatrix} = 3y^2\eb_1 + y\eb_3. \end{align*}\] So the left-hand side is given by curling again: \[ \grad\times(\grad\times\fb) = \begin{vmatrix} \eb_1 & \eb_2 & \eb_3\\ \displaystyle\ddy{}{x} & \displaystyle\ddy{}{y} & \displaystyle\ddy{}{z}\\ 3y^2 & 0 & y \end{vmatrix} = \eb_1 - 6y\eb_3. \] For the right-hand side, we find \[ \grad(\grad\cdot\fb) = \grad(1+x) = \eb_1 \] and in Cartesian coordinates we have \[ \grad^2\fb = (\nabla^2 f_1)\eb_1 + (\nabla^2 f_2)\eb_2 + (\nabla^2 f_3)\eb_3. \] Here \[\begin{align*} &\nabla^2 f_1 = \ddy{^2f_1}{x^2} + \ddy{^2f_1}{y^2} + \ddy{^2f_1}{z^2} = \ddy{^2}{x^2}(x) = 0\\ &\nabla^2 f_2 = \ddy{^2f_2}{x^2} + \ddy{^2f_2}{y^2} + \ddy{^2f_2}{z^2} = \ddy{^2}{x^2}(xy) + \ddy{^2}{y^2}(xy) = 0\\ &\nabla^2 f_3 = \ddy{^2f_3}{x^2} + \ddy{^2f_3}{y^2} + \ddy{^2f_3}{z^2} = \ddy{^2}{y^2}(y^3) = 6y. \end{align*}\] Therefore \(\grad^2\fb=6y\eb_3\), so that \(\grad(\grad\cdot\fb) - \grad^2\fb = \eb_1-6y\eb_3\), matching the left-hand side.
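The same verification can be carried out exactly by machine for the field with components \((x,\ xy,\ y^3)\) read off from the determinant above. The sketch below uses a deliberately tiny polynomial representation (a dictionary from exponent triples to coefficients); this representation and all function names are assumptions made for illustration, and indices start at 0.

```python
# A polynomial in (x, y, z) is a dict {(px, py, pz): coefficient},
# e.g. x*y is {(1, 1, 0): 1} and the zero polynomial is {}.

def diff(p, axis):
    """Partial derivative of polynomial p with respect to coordinate `axis`."""
    out = {}
    for exps, coeff in p.items():
        if exps[axis] > 0:
            lowered = list(exps)
            lowered[axis] -= 1
            out[tuple(lowered)] = coeff * exps[axis]
    return out

def add(p, q, sign=1):
    """Return p + sign*q, dropping any zero coefficients."""
    out = dict(p)
    for exps, coeff in q.items():
        out[exps] = out.get(exps, 0) + sign * coeff
    return {e: c for e, c in out.items() if c != 0}

def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def curl(f):
    """[curl f]_i = epsilon_ijk d f_k / d x_j."""
    result = []
    for i in range(3):
        comp = {}
        for j in range(3):
            for k in range(3):
                e = epsilon(i, j, k)
                if e != 0:
                    comp = add(comp, diff(f[k], j), sign=e)
        result.append(comp)
    return result

def div(f):
    """div f = d f_j / d x_j."""
    total = {}
    for j in range(3):
        total = add(total, diff(f[j], j))
    return total

def grad(g):
    """[grad g]_i = d g / d x_i."""
    return [diff(g, i) for i in range(3)]

def laplacian(g):
    """Scalar Laplacian d^2 g / d x_j d x_j."""
    total = {}
    for j in range(3):
        total = add(total, diff(diff(g, j), j))
    return total

# f = (x, x*y, y^3)
f = [{(1, 0, 0): 1}, {(1, 1, 0): 1}, {(0, 3, 0): 1}]

lhs = curl(curl(f))
rhs = [add(grad(div(f))[i], laplacian(f[i]), sign=-1) for i in range(3)]
print(lhs)
print(rhs)
```

Both sides come out as the constant 1 in the first component, zero in the second and \(-6y\) in the third, that is \(\eb_1 - 6y\eb_3\). Because the arithmetic is on integer coefficients, the agreement is exact rather than approximate.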

Be very careful with \(\grad^2\fb\) in non-Cartesian coordinate systems! For example, in orthogonal curvilinear coordinates, \([\grad^2\fb]\cdot\eb_u \neq \grad^2 f_u\). To find the correct expression, you would rearrange and calculate \[ \grad^2\fb = \grad(\grad\cdot\fb) - \grad\times(\grad\times\fb). \] This shows, incidentally, that \(\grad^2\fb\) really is coordinate independent, since everything on the right-hand side is.