2 Eigenfunction methods

2.1 A prototype problem and its solution via a function expansion

We start by motivating the method of eigenfunctions through a class of highly prevalent P.D.E models, used across many areas of applied mathematics, which have the following form: \[ \ddy{u}{t} = L u + h(x,t). \tag{2.1}\] Here we consider a one-dimensional model of a scalar function \(u(x,t)\) on a domain \(x \in[a,b]\) and \(t>0\). There are numerous choices to be made in specifying this model:

2.1.1 The operator

The operator \(L\) is a linear operator, satisfying \[ L(u_1 + u_2) = L(u_1) +L(u_2). \] Any derivatives it contains are taken only in the spatial variable \(x\). It can represent the spreading, transport and creation/destruction of the scalar \(u\).

For example the elliptic operator of the heat equation: \[ L = \pder{}{x}{2} \Rightarrow \ddy{u}{t} = \pder{u}{x}{2} + h(x,t) \] or \[ L = \pder{}{x}{2} + v\ddy{}{x} + \lambda \Rightarrow \ddy{u}{t} = \pder{u}{x}{2} + v\ddy{u}{x} +\lambda u + h(x,t) \] which is a model of a diffusing density \(u\), maybe the density of a dissolved drug. The term \(v\ddy{}{x}\) could represent a fluid flow, maybe blood moving the drug density at a speed \(v\). Finally the term \(\lambda u\) could represent the creation/degradation of the drug at a rate \(\lambda\) (maybe enzyme action). We can have much more elaborate \(L\); we just need it to be linear for what follows.

2.1.2 The “source” term \(h(x,t)\)

The function \(h(x,t)\), sometimes referred to as a forcing term, does not depend on the value of \(u\), so provides some external (to the system) change in the rate of production/loss of \(u\) (which is what \(\ddy{u}{t}\) represents).

If \(u\) were representing a population it could represent immigration/migration, or in the diffusing drug example discussed above it could be an intravenous flow.

For example: \[ h(x,t) = \sin\left(\pi\frac{x-a}{b-a}\right)\cos(\omega t). \] This provides a forcing which is zero at the boundaries, and maximal in the middle of the domain. It varies from increasing \(u\) (when \(\cos(\omega t)>0\)) to decreasing \(u\) (when \(\cos(\omega t)<0\)), a change which oscillates at a rate \(\omega\).

If \(h(x,t)=0\) then the problem is referred to as homogeneous. If it is non-zero the problem is referred to as inhomogeneous.

Inhomogeneous-homogeneous decomposition

For every inhomogeneous problem, there is an associated homogeneous problem obtained by setting \(h(x,t)=0\). If we label the respective solutions \(u_{ih}\) and \(u_h\) then \(u_h + u_{ih}\) is also a solution to the inhomogeneous problem (from linearity).

2.1.3 Boundary conditions

To make our problem concrete we need to supply \(n\) boundary conditions, with \(n\) matching the spatial order of the operator, i.e. we would need two boundary conditions in the examples above. We can write these as follows: \[ BC_i[u,a,b]=0, \] where the \(BC_i\) are linear operators acting on the function evaluated at \(x=a,b\).

For example: \[ BC_1[u,a,b] = \ddy{u}{x}{\Large\vert}_{x=a} = 0,\quad BC_2[u,a,b] =\ddy{u}{x}{\Large\vert}_{x=b} = 0 \] But we could have more elaborate (but linear) choices: \[ BC_1[u,a,b] = \pder{u}{x}{2}{\Large\vert}_{x=a}+ 5u(b,t) = 0,\quad BC_2[u,a,b] = u(a,t)+u(b,t) - 10 = 0 \]

Boundary conditions are referred to as homogeneous if the dependent variable \(u\), or its derivatives, are zero on the boundary (the first case above). Otherwise they are inhomogeneous (the second case).

Hereafter we will just write \(BC_i u\) to indicate the boundary operator \(BC_i\) acting on \(u\), and assume it is understood it is evaluated at the boundaries.

2.1.4 Initial conditions

Finally, this problem needs a start point, the initial condition. That is some value of the function \(u\) at \(t=0\): \[ u(x,0) = g(x). \]

2.1.5 Examples which aren’t the heat kernel

Just a few amongst many:

This section is not examinable, it's just for fun and to give you some insight into how this model is used in practice.

The Bessel equation, for which:

\[ L = x^2 \pder{}{x}{2} + x\ddy{}{x}+(x^2-\nu^2) \] with \(\nu\) some real constant (often an integer but not always). This arises for systems represented in polar or cylindrical coordinates and represents the radial behaviour of those systems. It has been used famously for vibrating membranes (a polar coordinate system) and also the reversed field pinch magnetic field configuration common in fusion plasma devices, a cylindrical model (see here for a nice introduction).

The Chebyshev operator \[ L = (1 - x^2) \pder{}{x}{2} - x \ddy{}{x} + n^2 \] This operator is associated (as we will see shortly) with the Chebyshev polynomials, a family of orthogonal polynomials widely used in various fields such as numerical analysis, physics, engineering, and machine learning. They are crucial for polynomial approximation, providing optimal solutions by minimizing the maximum error (Chebyshev approximation), and are used in Chebyshev interpolation to reduce errors in data fitting. In differential equations, Chebyshev polynomials enable efficient spectral methods for solving PDEs, while in control theory and signal processing, they contribute to filter design and optimization. They also play a role in root-finding methods, numerical integration, and quantum mechanics, due to their ability to approximate complex functions with high precision.

The surface flux transport model

The following equation:

\[ \ddy{u}{t} = D\nabla^2 u -\nabla\cdot\left({\bf v}u\right) +h({\bf x},t), \] is used to model the line of sight component of the Sun’s magnetic field at its surface (as a density \(u\)). This is technically a 2-D variant of our model but it fits the basic structure and we will see how to tackle 2-D variants later in the chapter; the method is near identical to the 1-D case. A simulation of the field (performed by your first term lecturer!) is shown in the video below.

The field diffuses due to convective motion at the Sun’s surface. The velocity \({\bf v}\) represents various large scale motions in the solar interior (differential rotation and meridional circulation) moving the field around systematically. Note how it leads to the stretching and shearing of the field. Finally the source term \(h\) is the key to this model. It represents the emergence of new strong localised magnetic field, “active regions” which can lead to solar storms. You hopefully all saw the recent aurorae? They were caused by a monster active region emerging and firing giant balls of plasma towards the Earth! The synoptic map in the video represents these observations which are injected into the simulation as they are observed.

The linear reaction-diffusion system is a coupled system of our model of the form: \[\begin{align} \ddy{u}{t} &= D_1\nabla^2 u + f_1 u + f_2 v,\\ \ddy{v}{t} &= D_2\nabla^2 v + g_1 u + g_2 v, \end{align}\] with \(D_1,D_2,f_1,f_2,g_1,g_2\) constants. This results from a linearisation (assuming \(u\) and \(v\) are small) of the non-linear system: \[\begin{align} \ddy{u}{t} &= \nabla^2 u + f(u,v),\\ \ddy{v}{t} &= \nabla^2 v + g(u,v), \end{align}\] where \(f,g\) are non-linear “reaction” functions (they’re not source terms like \(h\) as they depend on \(u,v\)). This model was used by Alan Turing to provide an explanation for the formation of ordered patterns in nature, e.g. spots and stripes on animals, bacterial colonies and regular neuronal activity. The visual PDE story page about his seminal paper, “The Chemical Basis of Morphogenesis”, has more information.

In this paper, the linear system is used (surprisingly successfully) to predict the number of spots/stripes (colonies) which form in the far more complex non-linear system. A fun example is shown in the video above where the original paper’s front page is given as an initial condition and a spotted pattern develops from it, using an example of Turing’s own system!

2.1.6 A solution

We will jump a number of steps ahead to see why solving this system can be translated into an eigenvalue problem. To get to this answer we will make a good number of unjustified assumptions, the aim of the rest of the chapter is to justify them.

We start by boldly assuming that there is some countably infinite set of functions \(y_n\) (a basis) such that the function \(u(x,t)\) can be written as \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x). \] Further we assume it is possible to decompose the function \(h(x,t)\) using this basis: \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t)y_n(x). \] If we substitute this into our model we obtain \[ \sum_{n=0}^{\infty}\ddt{c_n(t)}y_n(x) = \sum_{n=0}^{\infty}c_n(t)L y_n(x) +\sum_{n=0}^{\infty}h_n(t)y_n(x) \] (as \(L\) is an operator in the spatial variable only).

Now we make some further assumptions, first that the functions \(y_n(x)\) satisfy the following equation: \[ Ly_n(x)=-\lambda_n y_n(x), \] with \(\lambda_n\) a constant. If we assume this then we can bring our sums together into one super-sum: \[ \sum_{n=0}^{\infty}\left[\ddt{c_n(t)} +\lambda_n c_n(t) - h_n(t)\right]y_n(x) = 0 \] For our next assumption we assume the \(y_n(x)\) are linearly independent, such that this sum can only be zero if each individual term of the sum is zero. That means we have to solve the following ordinary differential equation for each \(n\): \[ \ddt{c_n(t)} +\lambda_n c_n(t) - h_n(t) = 0 \] Remember, the only unknowns here are the functions \(c_n(t)\). If we can solve this ordinary differential equation for the \(c_n(t)\) then we have our solution to the original partial differential equation!: \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x). \] The good news is that we can, it's just a first-order O.D.E which can be solved by the integrating factor method. The solution is \[ c_n(t) = \mathrm{e}^{-\lambda_n t}\left(a_n +\int_{1}^{t}\mathrm{e}^{\lambda_n t'}h_n(t')\rmd t'\right). \tag{2.2}\]
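For completeness, here is a sketch of the integrating-factor steps behind Equation 2.2 (using \(t'=1\) as the arbitrary lower limit of integration, matching the formula above):

\[\begin{align} \ddt{c_n} + \lambda_n c_n &= h_n(t)\\ \Rightarrow\quad \ddt{\left(\mathrm{e}^{\lambda_n t} c_n(t)\right)} &= \mathrm{e}^{\lambda_n t} h_n(t)\\ \Rightarrow\quad \mathrm{e}^{\lambda_n t} c_n(t) - a_n &= \int_{1}^{t}\mathrm{e}^{\lambda_n t'} h_n(t')\rmd t', \end{align}\]

where \(a_n\) is the constant of integration; dividing through by \(\mathrm{e}^{\lambda_n t}\) gives Equation 2.2.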

The homogeneous solution in Equation 2.2 is: \[ a_n\mathrm{e}^{-\lambda_n t} \] The inhomogeneous solution is: \[ \mathrm{e}^{-\lambda_n t}\int_{1}^{t}\mathrm{e}^{\lambda_n t'}h_n(t')\rmd t' \]

The constants of integration \(a_n\) are determined from the initial condition: \[ u(x,0) = \sum_{n=0}^{\infty}c_n(0)y_n(x) = g(x), \] again assuming we can write a function as a sum over the functions \(y_n\) which puts values to the constants \(c_n(0)\). But what about the boundary conditions? Well, from linearity and the fact \(L\) is a spatial operator we have, \[ BC_i u =\sum_{n=0}^{\infty}c_n(t)BC_i y_n = 0, \] so the functions \(y_n\) carry the boundary conditions: \[ BC_i y_n =0. \] This is where the linearity is important. Note: we have again assumed linear independence of the functions \(y_n\).

Let's summarise this

Summary of the proposed solution:

We have proposed that the solution to the problem \[ \ddy{u}{t} = L u + h(x,t), \] with \(n\) boundary conditions \[ BC_{i}[u]=0, \] takes the form: \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x), \] where, the \(y_n\) satisfy the eigen-equation: \[ L y_n =-\lambda_n y_n, \] and each \(y_n\) satisfy the \(n\) boundary conditions: \[ BC_i y_n=0. \] Finally, the temporal functions \(c_n(t)\) take the form \[ c_n(t) = \mathrm{e}^{-\lambda_n t}\left(a_n +\int_{1}^{t}\mathrm{e}^{\lambda_n t'}h_n(t')\rmd t'\right), \] where \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t)y_n(x). \]

PDE’s are hard, we like ODE’s

In short, by seeking the functions \(y_n\) we have reduced a partial differential equation to a series of ordinary differential equations.

2.1.7 A concrete example via the Fourier series

This has become quite abstract. I keep making assumptions about the functions \(y_n\), their linear independence, and their seeming ability to represent fairly arbitrary functions. Before I show this can all be justified (at least in a large range of cases) let's revisit an example which you have actually seen before.

If we choose the operators \[ L = \pder{}{x}{2},\quad BC_1[u] = \ddy{u}{x}{\Large\vert}_{x=a} = 0,\quad BC_2[u] = \ddy{u}{x}{\Large\vert}_{x=b}=0, \] then the solution requires we solve the eigen-problem: \[ \deriv{y_n}{x}{2} = -\lambda_n y_n,\quad \dds{y_n}{x}{\Large\vert}_{x=a} =0,\quad \dds{y_n}{x}{\Large\vert}_{x=b}=0. \] Its solution is easily seen to be \[ y_n = a_n \cos\left(\frac{n\pi(x-a)}{b-a}\right),\quad \lambda_n = \frac{n^2\pi^2}{(b-a)^2} \] for non-negative integers \(n\) (the amplitudes \(a_n\) are arbitrary).

A reminder

The argument in the previous section states that the proposed solution to the P.D.E \[ \ddy{u}{t} = \pder{u}{x}{2} + h(x,t). \] takes the form \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)\cos\left(\frac{n\pi(x-a)}{b-a}\right). \]

Try it yourself

Question 1 of the tutorial questions asks you to perform a Fourier Series type calculation, try it to dust off your Fourier skills!

You should recognise, for a fixed \(t\), that this is just a Fourier series. We know we can represent any reasonable function \(h(x,t)\) using the standard formula \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t) \cos\left(\frac{n\pi(x-a)}{b-a}\right),\quad h_n(t) = \frac{l}{b-a}\int_{a}^{b}h(x,t)\cos\left(\frac{n\pi(x-a)}{b-a}\right)\rmd{x}. \] The same is true of the initial condition: \[ g(x) = \sum_{n=0}^{\infty}c_n(0)\cos\left(\frac{n\pi(x-a)}{b-a}\right),\quad c_n(0) = \frac{l}{b-a}\int_{a}^{b}g(x)\cos\left(\frac{n\pi(x-a)}{b-a}\right)\rmd{x}, \] where \(l=2\) if \(n>0\) or \(l=1\) if \(n=0\). These expansions give us the values required to set the \(a_n\) in Equation 2.2.

Fourier series rapid recap

From Calculus 1 we know that an arbitrary (square integrable) function \(f(x),x\in[a,b]\) can be represented in the form: \[ f(x) = \sum_{n=0}^{\infty} a_n\cos \left(\frac{2n\pi(x-a)}{b-a}\right) + b_n\sin\left(\frac{2n\pi(x-a)}{b-a}\right). \tag{2.3}\] where (assuming \(n>0\)):

\[\begin{align} a_n &= \frac{2}{b-a}\int_{a}^{b}f(x)\cos\left(\frac{2n\pi(x-a)}{b-a}\right)\rmd{x},\\ b_n &= \frac{2}{b-a}\int_{a}^{b}f(x)\sin\left(\frac{2n\pi(x-a)}{b-a}\right)\rmd{x} \end{align}\]

This relies on the orthogonality relations (assuming \(n\) and \(m\) are \(>0\)):

\[\begin{align} &\frac{2}{b-a}\int_{a}^{b}\cos\left(\frac{2n\pi(x-a)}{b-a}\right)\cos\left(\frac{2m\pi(x-a)}{b-a}\right)\rmd{x} = \delta_{nm},\\ &\frac{2}{b-a}\int_{a}^{b}\sin\left(\frac{2n\pi(x-a)}{b-a}\right)\sin\left(\frac{2m\pi(x-a)}{b-a}\right)\rmd{x} = \delta_{nm},\\ &\frac{2}{b-a}\int_{a}^{b}\sin\left(\frac{2n\pi(x-a)}{b-a}\right)\cos\left(\frac{2m\pi(x-a)}{b-a}\right)\rmd{x} = 0. \end{align}\]

So e.g.: \[ \frac{2}{b-a}\int_{a}^{b}f(x)\cos\left(\frac{2n\pi(x-a)}{b-a}\right)\rmd{x} = a_n, \] obtained on substituting in Equation 2.3 for \(f(x)\). (For the problem above we used only cosines at the frequencies \(n\pi(x-a)/(b-a)\): that is the half-range cosine series, which has the analogous orthogonality relation \(\frac{2}{b-a}\int_{a}^{b}\cos\left(\frac{n\pi(x-a)}{b-a}\right)\cos\left(\frac{m\pi(x-a)}{b-a}\right)\rmd{x} = \delta_{nm}\) for \(n,m>0\).)

I’ll ask you to look up yourself what the deal is with the \(n=m=0\) case….

A first very minor spanner in the works.

The boundary conditions in the problem at hand meant we couldn’t have the \(\sin\) parts of the Fourier series. But that places a restriction on \(g(x)\) (and on the forcing \(h\)). We have proposed it can be represented as a sum of cosines, but that means that it must satisfy the boundary conditions itself: \(\dds{g}{x} = 0\) at \(x=a,b\).

This is not a massive problem and very often it makes physical sense for the forcing to satisfy the same boundary conditions as \(u\); after all it is trying to force the behaviour of \(u\).

So we can now set all our constants in the equation and we have a complete form for \(c_n(t)\).

For example if we make our lives simple and assume \(a=0,b=1\) and: \[ h(x,t) = \sin(\omega t)\cos\left(3\pi x\right),\quad g(x) = x-1/2. \] Then all \(h_n(t)=0\) except \(h_3(t) =\sin(\omega t)\) and: \[ c_n(0) = 2\int_{0}^{1}(x-1/2)\cos\left(n\pi x\right)\rmd x= \frac{2(\cos(n\pi)-1)}{n^2\pi^2}. \] for \(n>0\) and \(c_0(0)=0\).

Thus if \(n\neq 3\) we have \[ c_n(t) = c_n(0)\mathrm{e}^{-n^2\pi^2 t}, \] where the \(c_n(0)\) are as given above, and hence all these modes decay exponentially. In the special case \(n=3\) \[\begin{align} c_3(t) &= \mathrm{e}^{-\lambda_3 t}\left(a_3 +\int_{1}^{t}\mathrm{e}^{\lambda_3 t'}\sin(\omega t')\rmd t'\right),\\ &= \mathrm{e}^{-\lambda_3 t}\left(a_3 +\left[\frac{e^{\lambda_3 t} (\lambda_3 \sin (\omega t)-\omega \cos (\omega t))}{\omega^2+\lambda_3^2}\right]_1^t\right),\\ &=\ \mathrm{e}^{-\lambda_3 t}\left(a_3 +\frac{e^{\lambda_3 } (\omega \cos (\omega)-\lambda_3 \sin (\omega))+e^{\lambda_3 t} \left(\lambda_3 \sin \left(\omega t\right)-\omega \cos \left(\omega t\right)\right)}{\omega^2+\lambda_3^2}\right) \end{align}\] with \(a_3\) set by the value of \(c_3(0)\). Which looks a little intimidating! The crucial point here is the homogeneous part (featuring \(a_3\)) decays away exponentially, whilst the inhomogeneous part does not. When \(t\) is large the terms multiplied by \(\mathrm{e}^{-\lambda_3 t}\) have decayed away and we have \[ c_3(t)\approx \frac{\lambda_3 \sin \left(\omega t\right)-\omega \cos \left(\omega t\right)}{\omega^2+\lambda_3^2}. \] Implying the forcing term takes over and causes the \[ \cos\left(3\pi x\right) \] mode to have a sinusoidally varying amplitude (of rate \(\omega\)). Note the initial condition has essentially vanished from the solution as the forcing takes over.
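If you want to see this solution in action, here is a minimal numerical sketch (not part of the course material) that assembles the truncated series for the example above. The forcing frequency \(\omega\), the truncation \(N\), and the shift of the lower integration limit to \(t'=0\) (an equivalent choice of the constant \(a_3\)) are assumptions made purely for the demonstration.

```python
import numpy as np

# Example from the text: a=0, b=1, h(x,t)=sin(w t)cos(3 pi x), g(x)=x-1/2.
w = 2.0    # assumed value of the forcing frequency omega
N = 50     # assumed truncation: number of cosine modes kept

lam = lambda n: n**2*np.pi**2                          # eigenvalues lambda_n
c0 = lambda n: 2*(np.cos(n*np.pi) - 1)/(n*np.pi)**2    # c_n(0) from the text (n > 0)

def c_n(n, t):
    """Coefficient c_n(t): only the n=3 mode is forced, the rest simply decay."""
    if n == 0:
        return 0.0
    hom = c0(n)*np.exp(-lam(n)*t)
    if n != 3:
        return hom
    l3 = lam(3)
    # forced part of c_3(t): the integrating-factor solution, with the lower limit
    # taken at t'=0 (an equivalent choice of a_3) so that c_3(0) matches c0(3)
    part = (l3*np.sin(w*t) - w*np.cos(w*t) + w*np.exp(-l3*t))/(w**2 + l3**2)
    return hom + part

def u(x, t):
    return sum(c_n(n, t)*np.cos(n*np.pi*x) for n in range(N + 1))

x = np.linspace(0.0, 1.0, 201)
print(np.max(np.abs(u(x, 0.0) - (x - 0.5))))   # small: the series recovers g(x) at t=0
```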

Orthogonality and the coefficient formula as key

This particular example, which showed our proposed scheme can work, relied on two facts:

  1. The orthogonality of the Fourier series, which ensures linear independence of the series representation.
  2. That for any function \(f(x)\) we have a formula for the coefficients of the Fourier series: \[ \frac{l}{b-a}\int_{a}^{b}f(x)\cos\left(\frac{n\pi(x-a)}{b-a}\right)\rmd{x}, \] or \[ \frac{l}{b-a}\int_{a}^{b}f(x)\sin\left(\frac{m\pi(x-a)}{b-a}\right)\rmd{x}. \]

The aim of what follows is to show when we can obtain the same two properties for the solutions to \[ L y_n = \lambda_n y_n \] for more general linear \(L\).

2.2 Analysis of the eigenfunction problem.

So, we now focus on the eigenfunction problem: \[ Ly_i(x)=\lambda_iy_i(x), \] along with some specified boundary conditions.

The absence of the minus sign, compared to the form of eigenfunction equation used above, is just a matter of convenience. Here it will be more convenient notationally to have it be positive. None of the conclusions we draw are affected either way.

Here \(y_i\) is an eigenfunction with corresponding eigenvalue \(\lambda_i\). This is analogous to the linear algebra eigenproblem

\[ \mathbf{A}\vec{x}_i=\lambda_i\vec{x}_i \] where \(\mathbf{A}\) is a matrix and \(\vec{x}_i\) an eigenvector with eigenvalue \(\lambda_i\), and, as we shall see this is reflected in the terminology and concepts we use to explore the properties of the eigenfunction equation.

2.2.1 Function spaces

In the same way as linear algebra utilises vector spaces, with linear differential operators we shall think of function spaces. Consider the infinite dimensional space of all reasonably well-behaved functions on the interval \(a\leq x\leq b\).

We recall that for vectors we used the dot product (an inner product) which allowed us to say when vectors were orthogonal and hence independent. For functions we will use an integral version of this:

Definition 2.1 The inner product of a pair of functions \(u(x),v(x)\), which are suitably well behaved on the domain \(x \in [a,b]\), is defined to be the following integral: \[ \langle u, v\rangle=\int_a^bu(x)\overline{v(x)}\;dx. \] Here the overbar denotes the complex conjugate.

In this course, we will rarely be concerned with complex valued functions. If it is clear that we are dealing with real functions, we may drop the overbar for simplicity.

2.2.2 Weighting functions

In some instances, the eigenvalue problem and the inner product definition might include a given weighting function \(\rho(x)\), required to be real and not change sign on \(a\leq x\leq b\). In this case, the relations become \[ Ly_i(x)=\lambda_i\rho(x)y_i(x) \] and \[ \langle u, v\rangle=\int_a^b\rho(x)u(x)\overline{v(x)}\;dx. \]

2.2.3 Adjoint operators

We also require the notion of the adjoint of an operator (again analogous to the adjoint matrix in the finite dimensional case).

Definition 2.2 For an operator \(L\) with boundary conditions BC, and an associated inner product \(\langle \cdot,\cdot\rangle\), the adjoint operator and boundary conditions (\(L^*\), BC\(^*\)) are defined by the inner product relation: \[ \langle Ly, w \rangle=\langle y, L^*w \rangle. \]

To determine the adjoint, one needs to move the derivatives of the operator from \(y\) to \(w\), and define adjoint boundary conditions so that all boundary terms vanish.

Tip 2.1: Find the adjoint operator \(L^*\) associated with the operator \(Ly = \frac{\mathrm{d}^2 y}{\mathrm{d}x^2}\) with boundary conditions \(y(a) = 0\) and \(y'(b) -3y(b) = 0\).

We seek a function \(L^*w\) such that: \[ \int_a^b (w)(y'')\mathrm{d}x = \int_a^b (y)(L^*w)\mathrm{d}x \] To do this, we need to shift the derivatives from \(y\) to \(w\) using integration by parts:

\[\begin{align} \int_a^b wy''\mathrm{d}x & = wy' \vert_a^b - \int_a^b w'y'\mathrm{d}x\\ & = \left(wy' - w'y\right)\vert_a^b +\int_a^b y w'' \mathrm{d}x. \end{align}\]

A comparison of the integral parts suggests: \[ L^*w = \frac{\mathrm{d}^2w}{\mathrm{d}x^2} \Rightarrow L^* = \deriv{}{x}{2}. \] The inner product only includes integral terms, so the boundary terms must vanish, which will define boundary conditions on \(w\), i.e. this defines BC*. Here, we require \[ w(b)y'(b) - w'(b)y(b) - w(a)y'(a) +w'(a)y(a)=0. \] Using the BC’s \(y'(b) = 3y(b)\) and \(y(a) = 0\), gives: \[ 0 = y(b)\Big(3w(b) - w'(b)\Big) - w(a)y'(a) +\underbrace{w'(a)y(a)}_{= 0} \] As these terms need to vanish for all values of \(y(b)\) and \(y'(a)\), we can infer two boundary conditions on \(w\):

  1. \(y(b)\): \(3w(b)-w'(b) = 0\)
  2. \(y'(a)\): \(w(a) = 0\)

so all told the answer to our problem is:

Tip 2.2: Find the adjoint operator \(L^*\) associated with the operator \(Ly = \frac{\mathrm{d}^2 y}{\mathrm{d}x^2}\) with boundary conditions \(y(a) = 0\) and \(y'(b) -3y(b) = 0\).

Ans: \[ L^* = \deriv{}{x}{2},\quad w(a) = 0,\quad 3w(b)-w'(b) = 0. \]
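As a quick sanity check (not something you need for the exam), here is a minimal sympy sketch verifying the adjoint identity for this example on the assumed interval \([a,b]=[0,1]\), with hand-picked test functions satisfying the original and adjoint BCs respectively:

```python
import sympy as sp

x = sp.symbols('x')

# test functions on the assumed interval [0,1]:
# y satisfies the original BCs: y(0)=0, y'(1)-3y(1)=0
y = x - 2*x**2
# w satisfies the adjoint BCs found above: w(0)=0, 3w(1)-w'(1)=0
w = x - 2*x**2 + x**3

# check the BCs actually hold
assert y.subs(x, 0) == 0 and (sp.diff(y, x) - 3*y).subs(x, 1) == 0
assert w.subs(x, 0) == 0 and (3*w - sp.diff(w, x)).subs(x, 1) == 0

# <Ly, w> = int w y'' dx  versus  <y, L*w> = int y w'' dx
lhs = sp.integrate(w*sp.diff(y, x, 2), (x, 0, 1))
rhs = sp.integrate(y*sp.diff(w, x, 2), (x, 0, 1))
print(lhs, rhs)   # both -1/3, so <Ly,w> = <y,L*w> with no boundary leftovers
```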

If \(L = L^*\) and \(BC=BC^*\) then the problem is self-adjoint. If \(L=L^*\) but \(BC\neq BC^*\) we still call the operator self-adjoint.

Some books use the terminology formally self-adjoint if \(L=L^*\) and fully self-adjoint if both \(L = L^*\) and \(BC=BC^*\).

Try it yourself

In question 2 of your tutorial sheet you can have a go at turning a non-self-adjoint operator into a self-adjoint one, by forcing the boundary conditions to be homogeneous with a transformation of the dependent variable.

2.3 Eigenfunction Properties

As discussed in section 2.1, we aim to construct solutions to (linear) partial differential equations as a linear combination of eigenfunctions. There are two fundamental properties of eigenfunctions that will be vital to this approach

Proposition 2.1 Eigenfunctions of the adjoint problem have the same eigenvalues as the original problem

That is, \(Ly=\lambda y\Rightarrow \exists w\) s.t \(L^*w=\lambda w.\)

Proof. From the definition of the inner product: \[ \langle Ly, w \rangle=\langle \lambda y, w \rangle = \langle y, \lambda w \rangle \] but also from the adjoint definition. \[ \langle Ly, w \rangle =\langle y, L^*w \rangle \] Comparing these two equalities we find: \[ \langle y, L^*w \rangle= \langle y, \lambda w \rangle. \] So \(L^*w =\lambda w\).

Proposition 2.2 Eigenfunctions corresponding to different eigenvalues are orthogonal

That is, if \(Ly_j = \lambda_j y_j\) (so \(L^*w_j = \lambda_j w_j\)) and \(Ly_k= \lambda_k y_k\) (\(L^*w_k = \lambda_k w_k\)), then for \(\lambda_j \neq \lambda_k\), \(\langle y_j, w_k \rangle = 0\).

Proof. \[\begin{align} \lambda_j \langle y_j, w_k \rangle& = \langle \lambda_j y_j, w_k \rangle\\ & = \langle Ly_j, w_k \rangle\\ & = \langle y_j, L^*w_k \rangle\\ & = \langle y_j, \lambda_k w_k \rangle\\ & = \lambda _k \langle y_j, w_k \rangle. \end{align}\]

But \(\lambda_j \neq \lambda_k\) so \(\langle y_j, w_k \rangle = 0\).

The proof here is exactly as for matrix problems where the inner product represents the dot product of vectors.

We have the first of our requirements, orthogonality!

Do they exist? We should be careful with these proofs. They still hold if the \(w_k\) are trivial (zero functions), in which case the statements become meaningless but technically correct. We haven’t shown (and can’t in general show) that the \(w_k\) (or indeed the \(y_k\)) exist non-trivially. We will pursue this issue later in section 2.5. For now it suffices to say there is a large class of important problems (choices of \(L\)) for which both \(y_k\) and \(w_k\) exist non-trivially and hence this orthogonality is meaningful (it will allow us to solve our PDE problem).


2.3.1 Inhomogeneous problems, representing a function in terms of an operator.

We are in a position to outline the construction of a solution to the BVP \[Lu=f(x)\] with linear, homogeneous, separated boundary conditions, denoted \(BC_1(a)=0\), \(BC_2(b)=0\).

But first, why do we care? There are two reasons

  1. If say \(L = \deriv{}{x}{2}\) then this is Poisson’s equation. We recall in the P.D.E classification that elliptic equations often arise as parabolic equations with the time derivatives set to zero. Our proposed model problem reduces to this form if we set \(h(x,t)\equiv -f(x)\) and \(\ddy{u}{t}=0\). This is an example of a steady state or equilibrium, a state to which the system should eventually settle.

  2. This actually addresses the issue of representing the function \(h(x,t)\) which we assumed earlier. We see this via the following steps.

\[\begin{align} &\sum_{n=0}^{\infty}h_n(t)y_n(x) = h(x,t), &\\ &\sum_{n=0}^{\infty}h_n(t)\frac{L y_n(x)}{\lambda_n} = h(x,t), & Ly_n = \lambda_n y_n\\ &L \sum_{n=0}^{\infty}h_n(t)\frac{y_n(x)}{\lambda_n} = h(x,t). & \mbox{Linearity of }L. \end{align}\]

If we fix \(t = t_e\) and write \[ u = \sum_{n=0}^{\infty}h_n(t_e)\frac{y_n(x)}{\lambda_n},\quad f = h(x,t_e). \] then we have an equation in the form \(L u = f\)!

Note the potential issue here if \(\lambda_n=0\), we will come back to this later when we discuss the Fredholm alternative.

Seeking to solve the equation \(Lu =f\) is to answer the question: can we represent a (fairly arbitrary) function as an infinite sum of the eigenfunctions of an operator \(L\)? This is a fundamental assumption in the solution of our target problem laid out in the first section of this chapter.

To see we can solve the problem we proceed via the following steps.

  1. Solve the eigenvalue problem \[Ly=\lambda y,\quad BC_1(a)=0,\;BC_2(b)=0\] to obtain the eigenvalue-eigenfunction pairs \((\lambda_j, y_j)\).
  2. Solve the adjoint eigenvalue problem \[L^*w=\lambda w,\quad BC_1^*(a)=0,\;BC_2^*(b)=0\] to obtain \((\lambda_j, w_j)\).
  3. Assume a solution to the full system \(Ly=f(x)\) of the form \[y=\sum_ic_iy_i(x).\] To determine the coefficients \(c_i\), start from \(Ly=f\) and take an inner product with \(w_k\):

\[ \begin{split} Ly&=f(x)\\ \Rightarrow\langle Ly, w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\langle y, L^*w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\langle y, \lambda_k w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\lambda_k\langle \sum_ic_iy_i, w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\lambda_kc_k\langle y_k, w_k \rangle &= \langle f, w_k \rangle \end{split} \tag{2.4}\]
  4. We can solve the last equality for the \(c_k\), and we are done!
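Explicitly, provided \(\lambda_k\neq 0\) and \(\langle y_k, w_k\rangle\neq 0\), the last line rearranges to the coefficient formula \[ c_k = \frac{\langle f, w_k \rangle}{\lambda_k\langle y_k, w_k \rangle}, \] which is the form we will use repeatedly below.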

We have the second of our requirements, a coefficient formula!

Note that in the last steps we have used convergence of the series for \(y\) to switch the order of summation and integration, and used the orthogonality property \(\langle y_j, w_k\rangle=0,\;\;j\neq k\).

If the problem is fully self-adjoint then \(y_k=w_k\). We can then go as far as to say our eigenfunctions \(y_k\) themselves are orthogonal, and

\[ \lambda_k c_k\langle y_k, y_k \rangle = \langle f, y_k \rangle. \] This was the recipe used in our example in section 2.1.7, where \[ \langle f, y_k \rangle = \int_{a}^{b}\cos\left(\frac{k\pi (x-a)}{b-a}\right)f(x) d x. \] and \[ \langle y_k, y_k \rangle = \int_{a}^{b}\cos^2\left(\frac{k\pi (x-a)}{b-a}\right) d x = \frac{b-a}{2}. \] But generically we need to use the adjoint identity.

2.3.2 Some simple solutions

Note that this construction requires that we are able to determine the eigenvalues/eigenfunctions in the first place. This is by no means guaranteed. But it will be useful to recall some simple cases, solvable using techniques you’ve learned before:

Tip 2.3: Constant-coefficient operators. Solve the eigenvalue problem \[ Ly \equiv a y'' + b y' + c y = \lambda y, \] with \(a,b,c\) constants, subject to two boundary conditions.

Sln: Substituting the trial solution \(y=\mathrm{e}^{mx}\) gives the auxiliary equation \[ am^2 +bm +(c -\lambda) = 0 \]

Then:

  1. Find roots \(m_i\) of the quadratic.
  2. The general solution is: \(y = A_1\mathrm{e}^{m_1 x} +A_2\mathrm{e}^{m_2 x}\). But note there are 3 unknowns: \(A_1\), \(A_2\), and \(\lambda\), while for a second order equation there will only be two BC’s.
  3. Apply first BC to relate \(A_1\) and \(A_2\).
  4. Apply second BC to determine values for \(\lambda\) (see the worked illustration below).
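To see these steps in action, here is a short worked illustration (an example invented for this purpose, with \(a=1\), \(b=c=0\) on \([0,1]\) and mixed boundary conditions):

\[\begin{align} &y'' = \lambda y,\quad y(0)=0,\; y'(1)=0,\\ &\mbox{writing } \lambda = -\mu^2:\quad y = A_1\sin(\mu x) + A_2\cos(\mu x),\\ &y(0)=0 \Rightarrow A_2 = 0,\qquad y'(1)=0 \Rightarrow A_1\mu\cos(\mu) = 0,\\ &\Rightarrow \mu = \left(k+\tfrac{1}{2}\right)\pi,\quad \lambda_k = -\left(k+\tfrac{1}{2}\right)^2\pi^2,\quad y_k = \sin\left(\left(k+\tfrac{1}{2}\right)\pi x\right),\quad k=0,1,2,\dots \end{align}\]

(One can check that \(\lambda\geq 0\) yields only the trivial solution here.)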

Tip 2.4: Euler (equidimensional) operators. Solve the eigenvalue problem \[ Ly \equiv a x^2 y'' + b x y' + c y = \lambda y. \]

Sln. Substituting the trial solution \(y=x^m\) we find \[ am(m-1) +bm +(c -\lambda) = 0 \] Then \(y = A_1 x^{m_1} + A_2 x^{m_2}\), and repeat the steps above.

Try it yourself

Flex your eigen-muscles with an example of both these types in questions 3-6 of the tutorial sheet.

Tip 2.5: Bessel’s equation

\[ Ly \equiv x^2y'' +xy' + \left(x^2-\nu^2\right)y = \lambda y\]

The solution to the Bessel problem is not a topic we will consider examples for in this course. This is because it is something you have already seen: the series solution method which was covered in Epiphany term of Calculus 1 (including the Frobenius method). In this case (bounded) solutions take the form: \[ y_\alpha(x) = \sum_{m=0}^{\infty}\frac{(-1)^m}{m!\,\Gamma(m+\alpha+1)}\left(\frac{x}{2}\right)^{2m+\alpha} \]

Lots of the classic examples of \(L\) one might consider in this chapter require a series solution method. It's very fiddly and you’ve already seen it, so I don’t want to spend time on examples when there are lots of other new things to teach you.

2.3.3 Inhomogeneous boundary conditions.

In the above construction we assumed homogeneous boundary conditions for which it is usually relatively easy to find the adjoint (here I mean no constants in the BC’s). We now consider the general case of an inhomogeneous system with inhomogeneous boundary conditions,

\[ L u =f(x),\quad B_iu =\gamma_i, i\in 1,2\dots n \tag{2.5}\]

When the boundary conditions only feature the function, say \(y_k(a)\) or its derivatives \(y_k'(b)\), it is straightforward to make the boundary terms arising from integration by parts vanish. This, as we saw earlier, is necessary to satisfy the adjoint condition: \[ \left<L y,w\right> = \left<y,L^*w\right> \] If a constant is in there it can be trickier.

A useful technique instead is to exploit the linearity of the problem and split the system in two, i.e. solve both

\[ Lu_1=f(x),\quad B_iu_1=0 \tag{2.6}\] and \[ Lu_2=0,\quad B_iu_2=\gamma_i. \tag{2.7}\]

Here, solving for \(u_1(x)\) has the difficulty of the forcing function but with zero BC’s, while the other equation is homogeneous but has the non-zero BC’s. Due to linearity, it is easy to see that \(u(x)=u_1(x)+u_2(x)\) solves the full system Equation 2.5.

This decomposition can always be performed (although, as we shall see later, it requires caution if there is a zero eigenvalue \(\lambda=0\)), and since solving Equation 2.7 tends to be an easier matter (for linear systems!), it is safe for us to primarily focus on the technique of solving the system Equation 2.6, i.e. homogeneous boundary conditions.

For completeness it is worth noting that one can solve BVPs with inhomogeneous BC using an eigenfunction expansion and without needing a decomposition. The keys are:

  1. The eigenfunctions are always determined using homogeneous boundary conditions. Thus, eigenfunctions won’t change whether you “decompose” or not. The difference comes in:
  2. In going from Line 2 to 3 of Equation 2.4, care must be taken in the integration by parts, as boundary terms will generally still be present. (Can you see why?) These extra boundary terms then carry through to the formula for the \(c_k\).

Let’s try an example to get a feel for this.

Tip 2.6

Let \(y'' = f(x)\) with \(0 \leqslant x \leqslant 1\), \(y(0) = \alpha\) and \(y(1) = \beta\).

BC’s Incorporated Solution Route

  1. Solve \(y'' = \lambda y\), with \(y(0) = 0\) and \(y(1) = 0\). We get \(y_k(x) = \sin(k\pi x)\) and \(\lambda_k = -k^2 \pi^2\) with \(k = 1,2,3,\ldots\).

  2. The problem is self-adjoint (show this as an exercise), so \(w_k = y_k = \sin(k\pi x)\). So \[\begin{eqnarray*} y'' & = & f(x)\\ \int_0^1 w_k y'' \mathrm{d}x & = &\int_0^1 w_k f \mathrm{d} x\\ \Rightarrow (y'w_k-yw_k')\vert_0^1+\int_0^1 w''_k y \mathrm{d} x & = & \int_0^1 w_k f \mathrm{d} x\\ \Rightarrow (y'w_k-yw_k')\vert_0^1+ \lambda_k \int_0^1 w_k y \mathrm{d} x & = & \int_0^1 w_k f \mathrm{d} x\\ \Rightarrow (y'w_k-yw_k')\vert_0^1+ \lambda_k c_k \int_0^1 w_k y_k \mathrm{d} x & = & \int_0^1 w_k f \mathrm{d} x\\ \Rightarrow (y'w_k-yw_k')\vert_0^1- k^2\pi^2 c_k \int_0^1 \sin^2 (k \pi x)\mathrm{d} x & = & \int_0^1 w_k f \mathrm{d} x \end{eqnarray*}\]

  3. Now \(\int_0^1\sin^2(k\pi x)\;dx=1/2\), and \(w_k=\sin(k\pi x)\), hence \[\begin{eqnarray*} && (y'w_k - yw_k') \vert_0^1 = - k\pi\cos(k\pi)y(1) +k\pi \cos(0)y(0)\\ && \Rightarrow -\beta k\pi(-1)^k+\alpha k\pi- \frac{1}{2}k^2\pi^2 c_k = \int_0^1 f(x)\sin (k\pi x) \mathrm{d}x \end{eqnarray*}\]

  4. Solving for \(c_k\) gives us \(y(x)\) as a Fourier series.

Decomposed Solution Route

  1. Solve two systems separately: \[\begin{eqnarray*} &&y''=f(x), \quad y(0)=y(1)=0\\ &&u''=0,\quad u(0)=\alpha, \;u(1)=\beta \end{eqnarray*}\]
  2. To solve for \(y\), since the BCs are zero we can jump straight to the formula \[ c_k=\frac{\left< f, w_k\right>}{\lambda_k\left< y_k, w_k \right>}=-\frac{2\int_0^1f(x)\sin(k\pi x)\;dx}{k^2\pi^2} .\]
  3. The solution for \(u\) is easily obtained as \[ u=(\beta-\alpha)x+\alpha \]
  4. The full solution is \(y(x)+u(x)\).

Although they look different, both approaches give the same solution. Either way, we see that self-adjoint problems are great: they are less work since the \(w_k\)’s are the same as the \(y_k\)’s.
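Here is a minimal numerical sketch (purely illustrative) of the decomposed route for the specific choice \(f(x)=1\), comparing the truncated eigenfunction series against the exact solution \(y = x^2/2 + (\beta-\alpha-1/2)x+\alpha\); the values of \(\alpha\), \(\beta\) and the truncation \(N\) are assumptions made for the demonstration.

```python
import numpy as np

alpha, beta, N = 1.0, 2.0, 200   # assumed BC values and series truncation

x = np.linspace(0.0, 1.0, 401)

def c_k(k):
    # c_k = <f, w_k>/(lambda_k <y_k, w_k>) with f=1, lambda_k=-k^2 pi^2, <y_k,w_k>=1/2
    f_wk = (1 - np.cos(k*np.pi))/(k*np.pi)    # integral of sin(k pi x) over [0,1]
    return f_wk/(-k**2*np.pi**2*0.5)

y1 = sum(c_k(k)*np.sin(k*np.pi*x) for k in range(1, N + 1))  # y1'' = 1, y1(0)=y1(1)=0
u2 = (beta - alpha)*x + alpha                                # u2'' = 0, u2(0)=alpha, u2(1)=beta

exact = x**2/2 + (beta - alpha - 0.5)*x + alpha
print(np.max(np.abs(y1 + u2 - exact)))   # small truncation error
```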

2.3.4 Connection with linear algebra

There are direct parallels between linear algebra and linear BVPs:

\[\begin{eqnarray*} \mbox{Linear algebra} & & \mbox{Linear BVP}\\ \mbox{$\underbrace{\vec{v}\cdot\vec{w} = \sum_{k=1}^n v_k w_k}_{\mbox{dot product}}$} & \longleftrightarrow & \mbox{$\underbrace{\langle f,g \rangle = \int_a^b f(x)\overline{g(x)}\mathrm{d}x}_{\mbox{inner product}}$}\\\\ \mbox{$\underbrace{\parallel \vec{v} \parallel^2 = \vec{v}\cdot\vec{v} \geqslant 0}_{\mbox{norm}}$} & \longleftrightarrow & \mbox{$\underbrace{\parallel f \parallel^2 = \langle f,f \rangle \geqslant 0}_{\mbox{norm}}$}\\\\ \mbox{$\perp$ vector $\vec{v}\cdot \vec{w} = 0$} & \longleftrightarrow & \mbox{orthogonal function $\langle f,g \rangle = 0$}\\\\ \mbox{Matrix $A$} & \longleftrightarrow & \mbox{Linear Differential Operator $L$} \end{eqnarray*}\]

Given a vector \(\vec{v}\), the product \({A}\vec{v}\) is a new vector. Similarly, given a function \(y(x)\), \(Ly\) evaluates to a new function on \(a \leqslant x \leqslant b\).

In linear algebra, a common problem is to solve the equation \[ A\vec{v}=\vec{b} \] for \(\vec{v}\), given matrix \(A\) and vector \(\vec{b}\). Compare that to our general task of solving \(Ly = f\) for \(y\), given operator \(L\) and RHS \(f\).

2.3.5 Eigenvalue problems

\[\begin{eqnarray*} \mbox{Linear algebra} & & \mbox{Linear BVP}\\ \mbox{$A\vec{v}=\lambda\vec{v}$} & \longleftrightarrow & \mbox{$Ly=\lambda y$}. \end{eqnarray*}\]

How many eigenvalues?

\[\begin{eqnarray*} \mbox{Linear algebra} & \quad & \mbox{Linear BVP}\\ \mbox{$A$ is $n\times n$} & \quad & \mbox{$L$ is order $n$}\\ \mbox{Solve $\vert A-\lambda I\vert=0$} & \quad & \\\\ \mbox{$\Rightarrow$ $n$ eigenvalues} &\quad & \mbox{$\infty$ eigenvalues} \end{eqnarray*}\]

Adjoint comparison:

\[ \begin{array}{cccc} &\mbox{Linear algebra} & & \mbox{Linear BVP}\\\\ &A \rightarrow A^T & & L \rightarrow L^*\\ && & \mbox{BC's} \rightarrow\mbox{BC$^*$'s}\\\\ \mbox{Self adjoint if} & A=A^T & & L=L^*,\;\; \mbox{BC=BC*} \end{array} \]

A self-adjoint matrix is called Hermitian. A self-adjoint linear differential operator is also referred to as Hermitian.

Try it yourself

Before we move onto a special class of problems, try solving questions 7 and 8 of the tutorial sheet.

We next look at a particular class of Hermitian operator – Sturm-Liouville operators – that occur quite commonly and have very useful properties.

2.4 Sturm-Liouville theory

Sturm-Liouville (SL) theory concerns second order self-adjoint operators and a weighted eigenvalue problem \[ Ly = \lambda r(x) y \] where \(r(x) \geqslant 0\) is a weighting function, and the operator \(L\) is of the form \[ Ly=-\frac{\mathrm{d}}{\mathrm{d}x}\left(p(x) \frac{\mathrm{d}y}{\mathrm{d}x} \right) + q(x)y,\;\;a\leq x\leq b \] The functions \(p,q\), and \(r\) are all assumed to be real. It is easy to check that the operator is formally self-adjoint. It is fully self-adjoint if the boundary conditions take the separated form

\[\begin{eqnarray*} \alpha_1 y(a) +\alpha_2 y'(a) & = & 0\\ \alpha_3 y(b) +\alpha_4 y'(b) & = & 0. \end{eqnarray*}\]

Observe also that if \(p(a)=p(b)=0\), then \(\langle Ly,w\rangle=\langle y,Lw\rangle\) irrespective of boundary conditions. This defines the so-called natural interval \([a,b]\) for the problem.

The family of equations fitting this description includes some of the most important equations in applied mathematics and physics. As we shall see they have some very nice properties, which is why we focus on them in particular.

2.4.1 Inhomogeneous SL problems

Since a SL operator is self-adjoint, the eigenfunction expansion process is quite straightforward. Consider \(Ly=f(x)\) with homogeneous BC’s. The system can be solved with an eigenfunction expansion in the same manner as in the previous section:

\[ \begin{split} Ly&=f(x)\\ \Rightarrow\langle Ly, y_k \rangle &= \langle f, y_k \rangle\\ \Rightarrow\langle y, Ly_k \rangle &= \langle f, y_k \rangle\quad\text{ (since } L^*=L, w_k=y_k)\\ \Rightarrow\langle y, \lambda_k r y_k \rangle &= \langle f, y_k \rangle\\ \Rightarrow\lambda_kc_k\langle y_k, r y_k \rangle &= \langle f, y_k \rangle. \end{split} \]

Thus we obtain the formula \[ c_k=\frac{\langle f, y_k \rangle}{\lambda_k\langle y_k, r y_k \rangle} \] and the full solution is given by \[y=\sum_{k} c_ky_k.\]

2.4.2 Transforming an operator to SL form

Many problems encountered in physical systems are Sturm-Liouville. In fact, though, any operator \[Ly\equiv a_2(x)y''(x)+a_1(x)y'(x)+a_0(x)y(x)\] with \(a_2(x)\neq0\) in the interval can be converted to a SL operator.

To transform to a self-adjoint SL operator, multiply by an integrating factor function \(\mu(x)\): \[\mu a_2(x) y''(x)+\mu a_1 y'(x)+\mu a_0 y\] We then choose \(\mu\) so that the first and second derivatives collapse, i.e. so it can be expressed in the form

\[-\frac{d}{dx}(p y')+ q y\]

Suppose we are considering the problem

\[Ly=f(x)\] where \(L\) is not Sturm-Liouville. We could solve following the more general approach (involving finding the adjoint functions \(w_k\)). Alternatively we could convert to Sturm-Liouville first, and then proceed using the nice properties of a self-adjoint operator. So, is the problem self-adjoint or isn’t it?? The key observation is that we are no longer solving the same problem. We have transformed to a new operator \[\hat{L}y=-\frac{d}{dx}(p y')+q y\] which does not satisfy the same equation as the original, that is \(Ly=f\) while \(\hat{L}y=\mu f\). They are both valid, and must ultimately lead to the same answer in the end.

Try it yourself

Question 2 of tutorial sheet 2 tackles this topic (what does the altered operator look like). As for finding \(\mu,p,q\) in general, this is the topic of your second assignment.

2.4.3 Further properties of the Sturm-Liouville operator:

Orthogonality

Due to the presence of the weighting function, the orthogonality relation is \[ \int_a^b y_k(x) y_j(x) r(x) \mathrm{d} x = 0,\quad j\neq k. \]

Eigenvalues

The functions \(p,q,r\) are real, so \(\overline{L}=L\). Thus, taking the conjugate of both sides of \(Ly_k=\lambda_k r y_k\) gives

\[ \begin{split} &L\;\overline{y_k} = \overline{\lambda_k}\; r\;\overline{y_k}\\ \Rightarrow\;\; &\left< y_k, L\,\overline{y_k}\right>=\overline{\lambda_k}\left< y_k,\,r\overline{y_k}\right>\\ &\text{but }\left< y_k, L\,\overline{y_k}\right>=\left< Ly_k,\;\overline{y_k}\right>=\lambda_k\left< ry_k,\;\overline{y_k}\right>=\lambda_k\left< y_k,\,r\overline{y_k}\right>\\ \;\;& \\ \quad\quad\Rightarrow\;\;&\overline{\lambda_k}=\lambda_k \end{split} \] Thus, all eigenvalues are real.

Moreover, if \(a\le x\le b\) is a finite domain, then the \(\lambda\)’s are discrete and countable, \[ \lambda_1 < \lambda_2 < \lambda_3 < \cdots \] with \(\lim_{k\to\infty} \lambda_k =\infty\).

Eigenfunctions

The \(\{y_k\}\) are a complete set, that is all \(f(x)\) with \(\int f^2 r\,dx <\infty\) can be expanded as \[f(x)=\sum c_k y_k(x).\] Take an inner product with \(r(x) y_j(x)\): \[ \left< ry_j, f \right> = \left< ry_j, \sum c_k y_k \right> = \sum c_k \left< r y_j, y_k\right>\\ = c_j \left< r y_j, y_j \right> \] \[\Rightarrow c_j=\frac{\int_a^b f(x) y_j(x) r(x)\;dx}{\int_a^b y_j^2(x) r(x)\;dx}\] Note: here \(f(x)\) is any function that is suitably bounded on the domain, not necessarily the solution to the BVP; we are expanding an essentially arbitrary function.

This is what we wanted!

Again we have a basis of orthogonal functions, which can represent any reasonable function, and which have a coefficient formula! This is exactly what we set out to find for our target P.D.E model.
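As a concrete (illustrative) instance of the weighted coefficient formula, here is a minimal Python sketch expanding a function in Chebyshev polynomials \(T_n(x)=\cos(n\arccos x)\), which are orthogonal on \([-1,1]\) with weight \(r(x)=1/\sqrt{1-x^2}\). The test function \(f(x)=\mathrm{e}^{x}\) and the truncation are assumptions for the demonstration; the substitution \(x=\cos\theta\) turns the weighted integrals into plain integrals over \([0,\pi]\).

```python
import numpy as np

f = np.exp            # assumed test function f(x) = e^x
N = 10                # assumed number of modes kept

theta = np.linspace(0.0, np.pi, 2001)

def coeff(n):
    # c_n = <f, T_n>_r / <T_n, T_n>_r, with x = cos(theta) removing the weight singularity
    num = np.trapz(f(np.cos(theta))*np.cos(n*theta), theta)
    den = np.trapz(np.cos(n*theta)**2, theta)   # = pi for n=0, pi/2 otherwise
    return num/den

x = np.linspace(-1.0, 1.0, 401)
approx = sum(coeff(n)*np.cos(n*np.arccos(x)) for n in range(N + 1))
print(np.max(np.abs(approx - f(x))))   # tiny: the expansion converges very quickly
```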

2.4.4 Regular Sturm-Liouville Problems

Proposition 2.3 If the system satisfies all of the above and the additional conditions

  1. \(p(x) >0\) and \(r(x)>0\) on \(a\le x\le b\).
  2. \(q(x)\ge 0\) on \(a\le x\le b\).
  3. BCs have \(\alpha_1\alpha_2\le 0\) and \(\alpha_3\alpha_4\ge 0\), then all \(\boxed{\lambda_k\ge 0}\)

Proof. Using \(\left< y_k, Ly_k- \lambda_k r y_k\right>=0\), \[\begin{eqnarray*} -\int_a^b y (p y')' \,dx + \int_a^b y q y\,dx -\int_a^b y\lambda ry\,dx &=&0\\ -\int_a^b y (p y')' \,dx + \int_a^b qy^2\,dx -\lambda\int_a^b ry^2\,dx&=& 0\\ -pyy'\bigg|_a^b +\int_a^b p (y')^2 \,dx + \int_a^b qy^2\,dx -\lambda\int_a^b ry^2\,dx &=&0 \end{eqnarray*}\] \[ \lambda=\left[\int_a^b p (y')^2 \,dx + \int_a^b qy^2\,dx -pyy'\bigg|_a^b\right]\bigg/\int_a^b ry^2\,dx\ge 0\]

As a side note, the Rayleigh quotient, \(R[y]=\left< y,Ly\right> /\left< y,ry\right>\), is used extensively in analysis.

Back to our example model. We saw in the first section of this chapter that the solution to the equation: \[ \ddy{u}{t} = L u + h(x,t). \] has the form: \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x). \] with \[ c_n(t) = \mathrm{e}^{-\lambda_n t}\left(a_n +\int_{1}^{t}\mathrm{e}^{\lambda_n t'}h_n(t')\rmd t'\right). \] If there is no forcing (a homogeneous system) then: \[ c_n(t) = a_n\mathrm{e}^{-\lambda_n t}. \] It is in this time-varying behaviour that we see the importance of knowing if our operator is in the Sturm-Liouville form or not. Let's assume \(L\) is in the Sturm-Liouville form; then there are two cases

  1. Assume all \(\lambda_n>0\); then we see that, if there is no forcing (\(h(x,t)=0\)), all modes of the solution will tend exponentially to zero. Thus the model can only have growth if there is forcing.
  2. If there are some \(\lambda_n=0\) and no forcing, then we have a mode of behaviour for which \(\ddt{c_n}=0\). Thus whilst all other modes (for which \(\lambda_n>0\)) tend to zero, this mode will remain. It is the steady state or equilibrium behaviour of the system (in the absence of forcing). This is important, it’s telling us where the system will end up.

We have already seen an example of this in chapter 1 (actually in two dimensions there). If \[ L = \pder{}{x}{2}. \] and \(u(a)=u(b)=0\) then one can see the eigenfunctions are: \[ \lambda_n = \frac{n^2\pi^2}{(b-a)^2},\quad y_n = \sin\left(\frac{n\pi(x-a)}{b-a}\right). \] The \(n=0\) case would give a zero eigenvalue, but its eigenfunction is trivially zero, so there is no non-trivial \(\lambda=0\) mode. So, in the absence of forcing, any initial behaviour will vanish.

But if we have Neumann boundary conditions \[ \ddy{u}{x}{\Large\vert}_{x=a}=0,\quad \ddy{u}{x}{\Large\vert}_{x=b}=0 \] then \[ \lambda_n = \frac{n^2\pi^2}{(b-a)^2},\quad y_n = \cos\left(\frac{n\pi(x-a)}{b-a}\right). \] and there is a non-trivial (although constant) state to which the system will tend.
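In fact, for this unforced Neumann problem the constant state can be written down explicitly: the coefficient formula from section 2.1.7 gives \[ c_0(t) = c_0(0) = \frac{1}{b-a}\int_a^b g(x)\,\rmd x, \] so the system tends to the mean value of the initial condition (the total amount of \(u\) is conserved by the no-flux boundaries).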

It is perfectly possible for \(\lambda_n=0\) to have eigenfunctions \(y_n\) which are not trivial (or even constant), even with Sturm-Liouville operators.

Turing analysis

Those of you who take mathematical biology next year (and you really should!) will come to see that the constant (homogeneous) behaviour in the Neumann case is fundamental to a result Turing derived. In short it says that, if this homogeneous mode is suppressed from growing, it is very likely a regular repeating pattern will form.

2.5 Existence of the eigen-expansion via the Fredholm alternative.

So far we have shown that, if we can solve the eigen-equation \[ L y_n = \lambda_n y_n, \] and indeed the problem \[ L u = f, \] subject to some appropriate boundary conditions, then we can proceed to construct that solution using orthogonality and the coefficient formulae. However, all along we have made the assumption that we can solve these equations, and that is not always the case.

So can we know in the first place if it can be solved? This is a hard question as it depends on the boundary conditions, and sometimes one cannot determine easily if it has a solution by direct construction (actually solving the damn thing!). But, there is a tool which can answer it: the Fredholm alternative, named after Ivar Fredholm. It’s one of the great theorems in linear operator theory (which is a vast subject area).

2.5.1 The alternative statement

Proposition 2.4 The Fredholm alternative holds that:

  1. Either, the homogeneous adjoint problem \(L^* w =0\) has a non-trivial solution.
  2. Or, the inhomogeneous boundary value problem \(L y = f(x)\), with relevant boundary conditions, has a unique solution.

Either 1 or 2 is true, but never both simultaneously.

Proposition 2.5 If we are in case 1 (the homogeneous adjoint problem is non-trivial), then there is a further alternative!

Either

  1. \[ \langle w,f\rangle = 0 \tag{2.8}\] and there is a non-unique solution to the problem \(L y = f\), with relevant boundary conditions.
  2. Or, this integral does not vanish and there is no solution to the problem \(L y = f(x)\), with the relevant boundary conditions.

2.5.2 The “rough” explanation for why the alternatives hold

Consider the adjoint identity \[ \left<L u,w\right> = \left<u, L^{*} w\right> \] If \(L^{*}w=0\) then \(\left<L u,w\right>=0\). If \(w\) is non-trivial, i.e. branch 1. of our first alternative, this seems to imply \(L u=0\), in which case we could not solve \(L u = f(x)\) for non-trivial \(f(x)\). But of course if \(\left<L u,w\right>=\left<f(x),w\right> = 0\) for some non-trivial \(w\), then \(L u\) doesn’t have to be zero. This is the condition Equation 2.8; when it holds a solution can exist, but, due to linearity, it is only determined up to the addition of solutions of the homogeneous problem, hence the non-uniqueness.

On the other hand if \(w\) were originally trivial then there is no constraint on \(L u\), as \(\langle L u,0\rangle=0\) whatever \(L u\) is, and there is no reason it cannot solve \(Lu = f(x)\).

That this condition, \(L^* w =0\) only if \(w\) is trivial, implies we can always solve \(L u =f\) and that the solution is unique is somewhat tougher to show (it relies on the relationship between the kernel of \(L^*\) and the range of \(L\)).

The rest of the proof is beyond the scope of this course.

2.5.3 An example

Consider the following inhomogeneous BVP: \[ \deriv{y}{x}{2} + g_0^2 y = \sin(x). \tag{2.9}\] with \(g_0\) constant, and with the boundary conditions: \[ y(0)=0 , \quad y(\pi)=0. \tag{2.10}\]

Try it yourself

Show this is a fully self-adjoint operator

Step 1: solve the adjoint problem. We consider the homogeneous problem \(L^*w=0\): \[ \deriv{w}{x}{2} + g_0^2 w = 0,\quad w(0)=0 , \quad w(\pi)=0. \] Assume \(g_0>0\). The general solution is \[ w(x) = A\sin(g_0 x) + B\cos(g_0 x). \] The boundary condition \(w(0)=0\) implies \(B=0\). The second condition: \[ A \sin(g_0 \pi)=0 \] requires either \(A=0\) (the trivial case) or \(g_0\) to be an integer. So…

If \(g_0\) is not an integer we are in branch 2 of alternative statement 1. Thus there is a unique solution to Equation 2.9 with boundary conditions Equation 2.10. If \(g_0\) is an integer then there is either a non-unique solution to this problem or no solution at all; which of the two rests on an application of the second alternative.

Let's assume \(g_0\) is an integer. We now have to consider the alternative alternative! We calculate the integral \[ \langle w,\sin(x)\rangle = \int_{0}^{\pi}\sin(g_0 x)\sin(x)\rmd x = -\frac{\sin(g_0\pi)}{g_0^2-1},\quad g_0\neq\pm 1. \] So if \(g_0 \neq \pm 1\) this is zero (remembering we are assuming \(g_0\) is an integer), which would imply we can solve the problem. If \(g_0=\pm 1\) the integral is instead \(\pm\pi/2\), which is not zero, implying we cannot solve the problem at all.
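A quick numerical sanity check of this integral (purely illustrative, using a few assumed values of \(g_0\)):

```python
import numpy as np
from scipy.integrate import quad

for g0 in [2, 3, 5, 1, -1, 2.5]:
    val, _ = quad(lambda x: np.sin(g0*x)*np.sin(x), 0, np.pi)
    print(g0, round(val, 6))
# integers other than +-1 give 0, g0 = +-1 give +-pi/2 ~ +-1.5708,
# and non-integer g0 (e.g. 2.5) gives -sin(g0 pi)/(g0^2 - 1)
```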

Let's test this:

To find the general solution to: \[ \deriv{y}{x}{2} + g_0^2 y = \sin(x). \] we first seek the complementary solution \(y_c\) which solves: \[ \deriv{y_c}{x}{2} + g_0^2 y_c = 0. \] We did this above for the adjoint \(w\) (as the problem is self-adjoint). Thus \[ y_c(x) = A\sin(g_0 x)+ B\cos(g_0 x) \]

We don’t yet apply the boundary conditions, we need the particular solution first.

For the particular solution we seek a solution in the form \(y_p= a_1 \sin(x) + a_2\cos(x)\), which yields

\[ \left(a_1 \left(g_0^2-1\right)-1\right) \sin (x)+a_2 \left(g_0^2-1\right) \cos (x) =0 \tag{2.11}\]

If \(g_0\neq \pm 1\) then we set \(a_2=0\) and \(a_1 = 1/(g_0^2-1)\) and we have a full solution to the problem: \[ y(x) = A\sin(g_0 x)+ B\cos(g_0 x) + \frac{1}{g_0^2-1}\sin(x) \]

Try it yourself

Confirm this satisfies the boundary conditions if \(g_0\) is an integer (and not equal to \(1\) or \(-1\)), for any \(A\) (with \(B=0\)). If \(g_0\) is not an integer, confirm you need \(A=B=0\), thus uniquely determining the solution.

Thus we have, as Fredholm told us to expect, a solution which is unique if \(g_0\) is not an integer, and non-unique if it is an integer (other than \(\pm 1\)), due to the constant \(A\) being free.

But what if \(g_0=1\)? It is clear Equation 2.11 cannot be satisfied. I hope, however, you remember that in such cases we should try a solution in the form: \[ y_p= a_1 x\sin(x) + a_2 x\cos(x), \] as \(\sin(x)\), the right hand side, is an eigenfunction of the operator with zero eigenvalue (i.e. a solution of the homogeneous problem).

Substituting this in we find: \[ 2 a_1 \cos (x)-(2 a_2+1) \sin (x) = 0 \] (whether \(g_0=1\) or \(-1\)). So we need \(a_1=0\) and \(a_2=-1/2\), so our general solution is: \[ y(x) = A\sin(x)+ B\cos(x) - \frac{1}{2}x\cos(x) \tag{2.12}\]

But wait! Doesn’t this contradict the Fredholm alternative, which told us in this case we cannot have any solution?

Try it yourself

Confirm that Equation 2.12 cannot satisfy the boundary conditions \(y(0)=y(\pi)=0\). Thus it is not a valid solution to the problem.

And you doubted Ivar Fredholm! Shame on you…..

At the risk of repeating myself: the properties of an operator are not fully defined without its boundary conditions.

2.5.4 The coefficient formula revisited.

We recall earlier we found the following general coefficient formula \[ \lambda_k c_k \langle y_k,w_k\rangle = \langle f,w_k\rangle. \] I asked you to chill on the matter of worrying about the \(\lambda_k=0\) case. Well, now we can address it with Fredholm in mind.

If \(\lambda_k=0\) then the left hand side vanishes, so we need \(\langle f,w_k\rangle=0\). If the corresponding adjoint eigenfunction \(w_k\) is trivial (recall that, for \(\lambda_k=0\), it solves the homogeneous adjoint problem \(L^*w_k=0\)), then this is automatically true. If \(w_k\) is non-trivial we genuinely need \(\langle f,w_k\rangle=0\). These are exactly the branches of the alternative! If \(\langle f,w_k\rangle\neq 0\) we can’t ultimately satisfy our equation (a conclusion from Fredholm).

2.6 Extensions on the model

2.6.1 The parabolic time-dependent model revisited.

We return back to our original target, the model: \[ \ddy{u}{t} = L u + h(x,t), \] and review how what we have seen relates to this problem.

We have been investigating how to solve equations in the form \(Ly + f=0\), the left hand side of which looks like the right hand side of the time-dependent model: an equation in the form: \[ Lu(x) + h(x)=0 \] This is an elliptic equation. As we have seen, it has solutions in the form: \[ u(x) = \sum_{n=0}^{\infty}c_n y_n(x). \] Where the \(y_n\) are eigenfunctions of the operator \(L\): \[ L y_n = \lambda_n y_n. \]

Extending to time dependence: \[ Lu(x,t) + h(x,t)=\ddy{u(x,t)}{t} \] turns it into a parabolic equation. The solutions are in the same form as the elliptic case, but now the coefficients vary in time. \[ u(x,t) =\sum_{n=0}^{\infty}c_n(t) y_n(x).\quad \] Note the \(y_n\) are the same eigenfunctions of \(L\).

We use the same eigen-expansion method to solve the parabolic equation as we used for the elliptic equation. The difference is that in the case of the elliptic equation we solve an algebraic equation for the coefficients: \[ \lambda_n c_n \langle y_n,w_n\rangle = \langle f,w_n\rangle. \] In the parabolic case we have to solve an ordinary differential equation: \[ \ddt{c_n(t)} +\lambda_n c_n(t) - h_n(t) = 0 \] where the \(h_n\) are calculated using the eigen-expansion: \[ h_n(t)\langle y_n,w_n\rangle = \langle h(x,t),w_n\rangle. \] Sorry, you can’t avoid the adjoint!

We stated this at the start of the chapter (how to solve the time-dependent problem), the point now is we know when and why it works.

2.6.2 How general is this?

The model \(Lu =f\) is a very general elliptic equation. The time-dependent extension we have considered so far is very simple, just a single time derivative. The good news is our eigen-expansion technique can work for much more general extensions.

There are two obvious ways we can extend this model. The first is to change the time operator; this requires little adjustment to what we have discussed. The second is to make \(L\), and the problem as a whole, of higher spatial dimension. That is more complex, although, as we shall see, there are some cases where it is actually just a simple extension of this eigen-expansion method.

We’ll start with the simpler case!

2.6.3 Extending the time dependence of the model.

Start with a simple example: \[ \pder{u}{t}{2} = L u + h(x,t). \] This requires an extra initial condition, e.g. \[ u(x,0) = g_1(x),\quad \ddy{u}{t}{\Large\vert}_{t=0} = g_2(x). \] We follow the steps outlined in the initial section:

  1. Assume a series solution: \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x),\quad L y_n=-\lambda_n y_n. \]
  2. Sub into the equation and decompose: \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t)y_n(x). \]
  3. Solve the coefficient equation, the only place the recipe changes: \[ \deriv{c_n}{t}{2} +\lambda_n c_n - h_n = 0 \]

In this case the homogeneous part \(c_{nh}\) takes the form: \[ c_{nh}(t) = a_{1n} \sin(\sqrt{\lambda_n}t) + a_{2n}\cos(\sqrt{\lambda_n}t). \] if \(\lambda_n>0\), with the constants \(a_{1n}\) and \(a_{2n}\) determined by the initial conditions. The inhomogeneous solution can also be written down, but we won’t focus on it here.

Hyperbolic vs parabolic behaviour.

If we consider the operator \[ L = \pder{}{x}{2} \] then the first order (in time) and second order homogeneous mode behaviour is respectively: \[ c_n(t) = a_{1n}\mathrm{e}^{-\lambda_n t}, \mbox{ vs } c_{n}(t) = a_{1n} \sin(\sqrt{\lambda_n}t) + a_{2n}\cos(\sqrt{\lambda_n}t). \] In this case \(\lambda_n\geq 0\) (as we saw in section 1). In the first order case we see the higher modes decay exponentially quickly, thus simplifying the solution. In the second order case they don’t decay, they just oscillate in time. This is a manifestation of the tendency of hyperbolic equations to maintain the initial complexity of the system. By contrast the parabolic equation has a tendency to reduce it.

We see this in the videos above, on the left are solutions to the heat equation (first order in time), on the right solutions to the wave equation (second order in time), in 1d and 2d. Both start with the same initial conditions. We see on the left the solution gradually simplify as the small scale (larger \(n\)) behaviour decays faster leaving the \(n=1\) mode. In the wave equation case the complexity simply oscillates and moves at a finite speed (as discussed in chapter 1).

More generally we can consider a time-dependent linear operator \(L_t\), and we will have the following mode equation \[ L_t c_n+ \lambda_n c_n - h_n(t)=0. \] One can recast this in the form \[ L^+c_n = h_n (t), \] where \(L^+ = L_t+\lambda_n\).

This is just an inhomogeneous ordinary differential equation in the form \(L y = f\), the type we have been considering for most of the chapter! Just replace \(x\) with \(t\).

We can do it!! But there is one big difference. In this case it is an initial value problem (IVP), not a boundary value problem, as all the conditions (on \(u\) and its time derivatives) are applied at \(t=0\). For this there is a much stronger theorem, the Picard-Lindelöf theorem. It’s a little tricky to state here and even more fiddly to prove (the system needs transforming in general into a system of first order equations). It suffices to say, essentially, that as long as the functions involved in \(L_t\) are Lipschitz continuous then the solution exists and is unique for all time.

Predicting the future…

If we try to impose a time condition at some \(t_1>0\) it becomes a boundary value problem in time. For example, for a second order operator \(L_t\), two conditions would suffice and we could impose: \[ u(x,0) = g_0(x),\quad u(x,t_1)=g_1(x). \] That is, we demand that, after some time \(t_1\), the function \(u\) takes a very specific spatial profile \(g_1(x)\). Then we cannot guarantee the solution exists, and, in fact, we have to apply the Fredholm alternative to see if we can solve the problem.
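A quick sketch of how this can go wrong, assuming \(L_t = \deriv{}{t}{2}\) and no forcing, so each mode satisfies \(\deriv{c_n}{t}{2} + \lambda_n c_n = 0\) with data \(c_n(0)=\alpha\) and \(c_n(t_1)=\beta\); the values of \(\lambda_n\), \(t_1\), \(\alpha\) and \(\beta\) are illustrative.

```python
import numpy as np

# c_n(t) = a1 sin(w t) + a2 cos(w t) with w = sqrt(lambda_n). The time "boundary"
# conditions give the linear system
#   [[0, 1], [sin(w t1), cos(w t1)]] (a1, a2)^T = (alpha, beta)^T,
# which is singular whenever sin(w t1) = 0 -- then the data may be unsolvable
# (the Fredholm alternative in miniature).
lam_n, alpha, beta = 4.0, 1.0, 0.5
w = np.sqrt(lam_n)

for t1 in (1.0, np.pi / w):              # the second choice makes sin(w t1) = 0
    M = np.array([[0.0, 1.0],
                  [np.sin(w * t1), np.cos(w * t1)]])
    print(f"t1 = {t1:.3f}:  det = {np.linalg.det(M):+.3e}")
```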

The mathematics of the problems \[ L u(x) = f(x),\quad L_t u(t) =f(t) \] are identical. The asymmetry in their solvability usually comes from the fact that it is very natural to ask how a system will change from where it currently is: an initial value problem in time. Whilst spatially we ask about the behaviour of the function/density \(u\) on a finite spatial domain \(x\in[a,b]\), implying a boundary value problem.

2.6.4 Higher dimensional spatial operators.

We consider the model \[ \ddy{u}{t} = L u + h(\vec{x},t), \] where \(u(\vec{x},t)\) depends on some \(m\)-dimensional vector \(\vec{x} \in\mathbb{R}^m\), and for which \(t>0\). The operator \(L\) is a linear \(m\)-dimensional spatial operator, acting on functions of \(\vec{x}\).

The aim will be to assume some solution in the form: \[ u(\vec{x},t) = \sum_{\vec{n}\in \mathbb{N}^m} c_{\vec{n}}(t) y_{\vec{n}}(\vec{x}), \] where the \(y_{\vec{n}}\) satisfy the eigen-equation \[ Ly_{\vec{n}}(\vec{x}) = -\lambda_{\vec{n}} y_{\vec{n}}(\vec{x}). \] In essence we end up with the exact same recipe as we did in the one-dimensional case. The big difference is that it is generally much harder to find the set of functions \(y_{\vec{n}}\). But there are some very relevant special cases we can look at. Before we do, there is an important point to make.

Look at propositions 2.1 and 2.2. Note they never actually need the fact that \(L\) is one dimensional: just linearity and the existence of a well defined inner product. They apply just as well in \(m\) dimensions! In this case consider a domain \(\mathcal{D}\subset \mathbb{R}^m\). We define the inner product for two smooth scalar functions \(u(\vec{x})\) and \(v(\vec{x})\) to be: \[ \left<u(\vec{x}),v(\vec{x})\right> = \int_{\mathcal{D}}u(\vec{x})\overline{v(\vec{x})}\,\mathrm{d} V \] where \(\mathrm{d}V\) is the volume element. This gives an adjoint definition: \[ \left<L u,v\right> = \left<u,L^*v\right> \] which applies to the propositions, and indeed works for the Fredholm Alternative.

Let’s try a simple example of finding the adjoint, taking \(L = \nabla^2\), the Laplacian.

We start with the left hand side \(\left<Lu,v\right>\) (for simplicity assume the functions are real). Remember the Laplacian \(\nabla^2\) is the divergence of a function’s gradient, \(\nabla\cdot\nabla\): \[ \int_{\mathcal{D}}(\nabla\cdot \nabla u)v\,\mathrm{d} V \] Integration by parts in \(m\) dimensions just reverses the product rule for a scalar \(u\) and a vector \(\vec{v}\): \[ \nabla\cdot(u\vec{v}) = \nabla u\cdot\vec{v} + u\nabla\cdot\vec{v}. \] We can apply this rule twice, noting the role of \(\vec{v}\) here is played by the gradient of \(u\) (and then of \(v\)), together with the divergence theorem (also twice), to obtain

\[ \int_{\mathcal{D}}(\nabla\cdot\nabla u)\, v\,\mathrm{d} V = \int_{\partial \mathcal{D}}v\, \nabla u\cdot\vec{n}\,\mathrm{d} S -\int_{\mathcal{D}}\nabla u\cdot\nabla v\, \mathrm{d} V = \int_{\partial \mathcal{D}}v\, \nabla u\cdot\vec{n}\,\mathrm{d} S - \int_{\partial \mathcal{D}}u\, \nabla v\cdot\vec{n}\,\mathrm{d} S + \int_{\mathcal{D}}u\, (\nabla\cdot\nabla v)\,\mathrm{d} V, \] where \(\vec{n}\) is the outward unit normal to the boundary \(\partial\mathcal{D}\). The last term is \(\left<u,L^*v\right>\) so \[ L^*v = \nabla\cdot\nabla v = \nabla^2v, \] and the operator is self adjoint. To make the boundary terms vanish: if we have Dirichlet conditions \(u=0\) on \(\partial \mathcal{D}\), then the second boundary term \[ \int_{\partial \mathcal{D}}u\, \nabla v\cdot\vec{n}\,\mathrm{d} S \] vanishes. Thus we need \[ \int_{\partial \mathcal{D}}v\, \nabla u\cdot\vec{n}\,\mathrm{d} S \] to vanish, hence \(v=0\) on \(\partial\mathcal{D}\) and the BCs are self adjoint. A similar argument (which you should check) gives self adjointness in the case of Neumann conditions \(\nabla u\cdot\vec{n}=0\).
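A quick numerical sanity check of this self-adjointness, assuming a unit square domain, a standard five-point finite-difference Laplacian, and two illustrative test functions that vanish on the boundary (Dirichlet conditions):

```python
import numpy as np

# Check <Laplacian u, v> = <u, Laplacian v> on a grid with u = v = 0 on the boundary.
n = 80
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
h = x[1] - x[0]

u = np.sin(np.pi * X) * np.sin(2 * np.pi * Y)      # vanishes on the boundary
v = X * (1 - X) * Y * (1 - Y)                      # vanishes on the boundary

def laplacian(f):
    """Five-point Laplacian; boundary rows are left as zero."""
    out = np.zeros_like(f)
    out[1:-1, 1:-1] = (f[2:, 1:-1] + f[:-2, 1:-1] + f[1:-1, 2:] + f[1:-1, :-2]
                       - 4.0 * f[1:-1, 1:-1]) / h**2
    return out

inner = lambda f, g: np.sum(f * g) * h * h         # discrete <f, g>
print(inner(laplacian(u), v), inner(u, laplacian(v)))   # the two should agree closely
```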

2.6.5 Solving n-d eigenvalue problems on box domains.

Consider our domain to be a box \([0,L_1]\times[0,L_2]\times \dots\times[0,L_m]\) with coordinates \((x_1,x_2,x_3, \dots, x_m)\).

We consider the heat kernel in Cartesian coordinates: \[ L = \nabla^2 = \nabla \cdot \nabla \Rightarrow \nabla^2 u = \pder{u}{x_1}{2} +\pder{u}{x_2}{2}+ \dots+\pder{u}{x_m}{2} \] We impose Dirichlet boundary conditions: \[\begin{align} & u(0,x_2,x_3\dots x_m)=u(L_1,x_2,x_3\dots x_m) = u(x_1,0,x_3\dots x_m)= u(x_1,L_2,x_3\dots x_m) \\ & \dots=u(x_1,x_2,x_3\dots 0) = u(x_1,x_2,x_3\dots L_m) =0. \end{align}\] We can see the solution to the problem: \[ \pder{u}{x_1}{2} +\pder{u}{x_2}{2}+ \dots+\pder{u}{x_m}{2} = -\lambda_{\vec{n}} u \] is \[ y_{\vec{n}} = C_{n_1,n_2,\dots n_m}\sin\left(\frac{n_1\pi x_1}{L_1}\right)\sin\left(\frac{n_2\pi x_2}{L_2}\right)\dots \sin\left(\frac{n_m\pi x_m}{L_m}\right) \] (we will cover how to derive this in the problems class), with the \(n_i\) positive integers, and \[ \lambda_{\vec{n}} = \pi^2\left(\frac{n_1^2}{L_1^2}+\frac{n_2^2}{L_2^2}+\dots+ \frac{n_m^2}{L_m^2} \right). \] The general superposition of these modes is: \[ \sum_{n_1=1}^{\infty} \sum_{n_2=1}^{\infty}\dots \sum_{n_m=1}^{\infty}C_{n_1,n_2,\dots n_m}\sin\left(\frac{n_1\pi x_1}{L_1}\right)\sin\left(\frac{n_2\pi x_2}{L_2}\right)\dots \sin\left(\frac{n_m\pi x_m}{L_m}\right), \] which is the basis of our series solution for \(u(\vec{x},t)\).
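Here is a minimal sketch of this expansion in the case \(m=2\): it expands an illustrative function \(f\) (think of it as initial data) in the box modes and checks that the truncated series reproduces it. The box sizes, resolution, truncation and \(f\) are all illustrative choices.

```python
import numpy as np

# Dirichlet sine modes on [0, L1] x [0, L2], their eigenvalues, and the expansion
# coefficients of a function f that vanishes on the boundary.
L1, L2, N = 1.0, 2.0, 20
x1 = np.linspace(0.0, L1, 200)
x2 = np.linspace(0.0, L2, 200)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")

f = X1 * (L1 - X1) * X2 * (L2 - X2)                 # illustrative function to expand

recon = np.zeros_like(f)
for n1 in range(1, N + 1):
    for n2 in range(1, N + 1):
        y = np.sin(n1 * np.pi * X1 / L1) * np.sin(n2 * np.pi * X2 / L2)
        lam = np.pi**2 * (n1**2 / L1**2 + n2**2 / L2**2)  # lambda_{n1,n2}: would set
                                                          # the time evolution of c(t)
        C = np.sum(f * y) / np.sum(y * y)                 # <f, y> / <y, y>
        recon += C * y

print(np.max(np.abs(recon - f)))                    # truncation error of the expansion
```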

Okay, let’s move to a polar domain; this would be a good model for a vibrating membrane.

2.6.6 Working in a disc.

Consider the Laplacian eigen-problem in polar coordinates \((r,\theta)\): \[ \nabla^2y_{\vec{n}} + \lambda_{\vec{n}} y_{\vec{n}} = \frac{1}{r}\ddy{}{r}\left(r \ddy{y_{\vec{n}}}{r}\right) + \frac{1}{r^2}\pder{y_{\vec{n}}}{\theta}{2} + \lambda_{\vec{n}} y_{\vec{n}} = 0. \] We will consider it subject to Dirichlet boundary conditions \(y_{\vec{n}} =0\) on the edge of the disc (here the subscript \(\vec{n}\) is the mode index, not the unit normal). The domain has a radius \(L_r\), i.e. \(r\in[0,L_r]\) (and obviously \(\theta\in [0,2\pi)\) with periodicity).

We seek a solution via separation of variables: \(y_{\vec{n}}(r,\theta) = R(r)P(\theta)\). Substituting this into our eigen-equation we find, \[ \frac{1}{r}\ddy{}{r}\left(r \ddy{(RP)}{r}\right) + \frac{1}{r^2}\pder{(RP)}{\theta}{2} + \lambda_{\vec{n}} RP = P\frac{1}{r}\dds{}{r}\left(r \dds{R}{r}\right) + R\frac{1}{r^2}\deriv{P}{\theta}{2} + \lambda_{\vec{n}} RP= 0. \] Multiplying both sides of this equation by \(r^2/RP\) we have, \[ \frac{r}{R}\dds{}{r}\left(r \dds{R}{r}\right) + \frac{1}{P}\deriv{P}{\theta}{2} + \lambda_{\vec{n}} r^2 = 0 \implies \frac{r}{R}\dds{}{r}\left(r \dds{R}{r}\right) + \lambda_{\vec{n}} r^2 = - \frac{1}{P}\deriv{P}{\theta}{2} = C_\theta, \] where we have separated the \(r\) and \(\theta\) functions, and hence must have that both sides of this equation are constant, which we set equal to \(C_\theta\).

This gives the pair of separated equations, \[\begin{align} \frac{1}{Rr}\dds{}{r}\left(r \dds{R}{r}\right) & = \frac{C_\theta}{r^2}-\lambda_{\vec{n}},\\ - \frac{1}{P}\deriv{P}{\theta}{2} & = C_\theta, \end{align}\] and the boundary conditions, \[ R(L_r)=0, \quad P(0) = P(2\pi), \quad \dds{P}{\theta}(0) = \dds{P}{\theta}(2\pi). \] The last two come from the necessarily periodic nature of the function in the \(\theta\) coordinate (we are assuming some regularity).

This is now just two simple one-dimensional problems.

The \(P\) modes

For the \(P\) equation I think it should be clear that we can only have \(\cos\) and \(\sin\) solutions (not the exponential solutions, which would require negative \(C_{\theta}\) and could not satisfy the periodic boundary conditions). We then write our solution as, \[ P(\theta) = A\cos\left(\sqrt{C_\theta} \theta\right) + B \sin\left(\sqrt{C_\theta} \theta\right). \] Using the first periodic boundary condition we have: \[ A\cos(0) + B \sin(0) = A = A\cos\left(2\sqrt{C_\theta} \pi\right) + B \sin\left(2\sqrt{C_\theta} \pi\right), \] where we need the \(\cos\) term to equal \(1\) and the \(B\sin\) term to vanish. For the former we require \(2\sqrt{C_\theta} \pi=2 n \pi,\) so \(\sqrt{C_\theta}\) must be an integer, that is \(\sqrt{C_\theta} = n\), \(n=0,1,2,3,\dots\).

The second boundary condition reads, \[ -A\sqrt{C_\theta}\sin(0) + B \sqrt{C_\theta}\cos(0) = B\sqrt{C_\theta} = -A\sqrt{C_\theta}\sin\left(2\sqrt{C_\theta} \pi\right) + B \sqrt{C_\theta}\cos\left(2\sqrt{C_\theta} \pi\right). \] As before, this requires that \(\sqrt{C_\theta} = n\), an integer. So the solution for \(P\) is, \[ P(\theta) = A\cos(n \theta) + B \sin(n \theta),\quad n=0,1,2,\dots, \] where we note that \(C_\theta = n^2\). The inclusion of \(n=0\) covers the case that \(C_\theta=0\), which has the constant solution \(P =A\).

The \(R\) modes

The equation for \(R\) is given by, after expanding the derivative terms and multiplying by \(r^2\), \[ r^2\deriv{R}{r}{2} + r \dds{R}{r} +( \lambda_{\vec{n}} r^2-n^2)R = 0, \] where the \(n^2\) comes from the fact that the \(P\) boundary conditions ensured \(C_\theta=n^2\), and hence represents a coupling of the two boundary-value problems.

It’s only Bessel’s equation!

I told you in the first section the Bessel equation has series solutions. Under the substitution \(x = \sqrt{\lambda_{\vec{n}}}\,r\) the equation above becomes Bessel’s equation of order \(n\), with solutions \(J_n(x)\) and \(Y_n(x)\); since \(Y_n\) is singular at the origin, which lies inside our disc, we keep only the regular solution \(J_n\left(\sqrt{\lambda_{\vec{n}}}\,r\right)\).

Finally, we do not yet have the values for \(\lambda_{\vec{n}}\). These can be obtained by using the \(R\) boundary condition, \[ R_n(L_r)=0 \implies J_n(\sqrt{\lambda_{\vec{n}}}L_r) = 0. \] We see that, for each \(n\) coming from the \(\theta\) part of the problem, the Bessel function will oscillate around zero. In fact it has infinitely many zeroes, a property of the Bessel functions, so that for each fixed \(n\) there will be infinitely many values of \(\lambda_{\vec{n}}\). As these values differ for each \(n\), we modify the subscript of our spatial eigenvalue to include this dependence and write \(\lambda_{k,n}\). Hence the solution for \(R_n\) in full generality can be written as, \[ R_n(r) = \sum_{k=1}^\infty A_{k,n}J_n\left(\sqrt{\lambda_{k,n}}r\right). \] We have chosen to index these zeroes starting at \(k=1\), the first positive zero of \(J_n(x)\); note that \(x=0\) is also a zero of \(J_n\) for \(n>0\), but it would only give the trivial solution so we do not count it. These values of \(\lambda_{k,n}\) do not have closed-form expressions, and must be found numerically or, in limits such as \(k \to \infty\), using asymptotic approximations. We will not pursue this in this course.
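In practice the zeroes of \(J_n\) are tabulated and easy to compute; here is a short sketch using `scipy.special.jn_zeros`, with the radius \(L_r\) and the number of zeroes requested as illustrative choices.

```python
import numpy as np
from scipy.special import jn_zeros

# Dirichlet condition at r = L_r: J_n(sqrt(lambda) L_r) = 0, so
# sqrt(lambda_{k,n}) = j_{n,k} / L_r, where j_{n,k} is the k-th positive zero of J_n.
L_r = 1.0
for n in range(3):
    zeros = jn_zeros(n, 4)               # first four positive zeroes of J_n
    lam = (zeros / L_r) ** 2             # lambda_{1,n}, ..., lambda_{4,n}
    print(f"n = {n}:", np.round(lam, 3))
```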

Putting it all together

Finally we can write a general superposition of our eigenmodes as \[ y(r,\theta) = \sum_{n=0}^\infty \sum_{k=1}^\infty J_n\left(\sqrt{\lambda_{k,n}}r\right)\left(A_{k,n}\cos(n\theta) + B_{k,n}\sin(n\theta) \right). \]
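To see what one of these drum modes looks like, here is a small sketch evaluating a single \((k,n)\) mode on a polar grid; the radius, the chosen indices and the coefficients (\(A_{k,n}=1\), \(B_{k,n}=0\)) are illustrative choices.

```python
import numpy as np
from scipy.special import jv, jn_zeros

# One drum mode J_n(sqrt(lambda_{k,n}) r) cos(n theta) on the disc r <= L_r.
L_r, n, k = 1.0, 2, 1
sqrt_lam = jn_zeros(n, k)[-1] / L_r                 # sqrt(lambda_{k,n})

r = np.linspace(0.0, L_r, 100)
theta = np.linspace(0.0, 2.0 * np.pi, 200)
R, TH = np.meshgrid(r, theta, indexing="ij")

mode = jv(n, sqrt_lam * R) * np.cos(n * TH)         # A_{k,n} = 1, B_{k,n} = 0
print(np.max(np.abs(mode[-1, :])))                  # ~0: the Dirichlet condition at r = L_r
```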

Decomposition into one-dimensional problems

The essence of this solution was the decomposition of a PDE into a set of ordinary differential equations for \(R\) and \(P\). Then we are dealing with the standard theory we have developed in this chapter.

2.6.7 More?

We’ll work through some more examples in the problems class. This example should give you the message that it all gets very tedious and unwieldy in higher dimensions.

We have so far dealt with a box and a disc. The message, roughly, is that simple shapes are doable; more complex shapes are much harder, as they don’t generally separate out into a system of one dimensional problems, and in those cases a different approach, Green’s method, is sometimes more productive.

Handily, that’s the topic of chapter 3, where I will talk more about the issues of arbitrarily shaped domains!