$$ \def\ab{\boldsymbol{a}} \def\bb{\boldsymbol{b}} \def\cb{\boldsymbol{c}} \def\db{\boldsymbol{d}} \def\eb{\boldsymbol{e}} \def\fb{\boldsymbol{f}} \def\gb{\boldsymbol{g}} \def\hb{\boldsymbol{h}} \def\kb{\boldsymbol{k}} \def\nb{\boldsymbol{n}} \def\tb{\boldsymbol{t}} \def\ub{\boldsymbol{u}} \def\vb{\boldsymbol{v}} \def\xb{\boldsymbol{x}} \def\yb{\boldsymbol{y}} \def\Ab{\boldsymbol{A}} \def\Bb{\boldsymbol{B}} \def\Cb{\boldsymbol{C}} \def\Eb{\boldsymbol{E}} \def\Fb{\boldsymbol{F}} \def\Jb{\boldsymbol{J}} \def\Lb{\boldsymbol{L}} \def\Rb{\boldsymbol{R}} \def\Ub{\boldsymbol{U}} \def\xib{\boldsymbol{\xi}} \def\evx{\boldsymbol{e}_x} \def\evy{\boldsymbol{e}_y} \def\evz{\boldsymbol{e}_z} \def\evr{\boldsymbol{e}_r} \def\evt{\boldsymbol{e}_\theta} \def\evp{\boldsymbol{e}_r} \def\evf{\boldsymbol{e}_\phi} \def\evb{\boldsymbol{e}_\parallel} \def\omb{\boldsymbol{\omega}} \def\dA{\;d\Ab} \def\dS{\;d\boldsymbol{S}} \def\dV{\;dV} \def\dl{\mathrm{d}\boldsymbol{l}} \def\rmd{\mathrm{d}} \def\bfzero{\boldsymbol{0}} \def\Rey{\mathrm{Re}} \def\Real{\mathbb{R}} \def\grad{\boldsymbol\nabla} \newcommand{\dds}[2]{\frac{d{#1}}{d{#2}}} \newcommand{\ddy}[2]{\frac{\partial{#1}}{\partial{#2}}} \newcommand{\pder}[3]{\frac{\partial^{#3}{#1}}{\partial{#2}^{#3}}} \newcommand{\deriv}[3]{\frac{d^{#3}{#1}}{d{#2}^{#3}}} \newcommand{\ddt}[1]{\frac{d{#1}}{dt}} \newcommand{\DDt}[1]{\frac{\mathrm{D}{#1}}{\mathrm{D}t}} \newcommand{\been}{\begin{enumerate}} \newcommand{\enen}{\end{enumerate}}\newcommand{\beit}{\begin{itemize}} \newcommand{\enit}{\end{itemize}} \newcommand{\nibf}[1]{\noindent{\bf#1}} \def\bra{\langle} \def\ket{\rangle} \renewcommand{\S}{{\cal S}} \newcommand{\wo}{w_0} \newcommand{\wid}{\hat{w}} \newcommand{\taus}{\tau_*} \newcommand{\woc}{\wo^{(c)}} \newcommand{\dl}{\mbox{$\Delta L$}} \newcommand{\upd}{\mathrm{d}} \newcommand{\dL}{\mbox{$\Delta L$}} \newcommand{\rs}{\rho_s} $$
To start our more general investigation into the method of eigenfunction expansions as solutions to PDEs, we will define a prototype problem which has many of the properties we seek to explore.
2.1 A prototype problem and its solution via a function expansion
We consider a class of highly prevalent PDE models used in various areas of applied mathematics. These have the following form: \[ \ddy{u}{t} = L u + h(x,t). \tag{2.1}\] Here we consider a one-dimensional model of a scalar function \(u(x,t)\) on a domain \(x \in[a,b]\) and \(t>0\). There are numerous choices to be made in specifying this model:
2.1.1 The operator
The operator \(L\) is a linear operator, satisfying \[ L[a\,u_1 + b\,u_2] = a\,Lu_1 +b\,Lu_2. \] Any derivatives it contains are in the spatial variable \(x\) only. It can represent the spreading, transport and creation/destruction of the scalar \(u\).
Examples The elliptic heat kernel: \[ L = \pder{}{x}{2} \Rightarrow \ddy{u}{t} = \pder{u}{x}{2} + h(x,t) \] or \[ L = \pder{}{x}{2} + v\ddy{}{x} + \lambda \Rightarrow \ddy{u}{t} = \pder{u}{x}{2} + v\ddy{u}{x} +\lambda u + h(x,t) \] which is a model with a density \(u\) diffusing, maybe the density of a dissolved drug. The term \(v\ddy{}{x}\) could represent a fluid flow, maybe blood moving the drug density at a speed \(v\). Finally the term \(\lambda u\) could represent the creation/degradation of the drug at a rate \(\lambda\) (maybe enzyme action).
We can have much more elaborate \(L\); we just need it to be linear for what follows.
2.1.2 The “source” term \(h(x,t)\)
The function \(h(x,t)\), sometimes referred to as a forcing term, does not depend on the value of \(u\), so provides some external (to the system) change in the rate of production/loss of \(u\) (which is what \(\ddy{u}{t}\) represents).
If \(u\) were representing a population it could represent immigration/migration, or in the diffusing drug example discussed above it could be an intravenous flow.
An example \[ h(x,t) = \sin\left(\pi\frac{x-a}{b-a}\right)\cos(\omega t). \] This provides a forcing which is zero at the boundaries and maximal in the middle of the domain. It varies from increasing \(u\) when \(\cos(\omega t)>0\) to decreasing \(u\) when \(\cos(\omega t)<0\), a change which oscillates at a rate \(\omega\).
If \(h(x,t)=0\) then the problem is referred to as homogeneous. If it is non-zero the problem is referred to as inhomogeneous.
For every inhomogeneous problem, there is an associated homogeneous problem obtained by setting \(h(x,t)=0\). If we label the respective solutions \(u_{ih}\) and \(u_h\) then \(u_h + u_{ih}\) is also a solution to the inhomogeneous problem (from linearity).
2.1.3 Boundary conditions
To make our problem concrete we need to supply \(n\) boundary conditions, with the \(n\) matching the spatial order of the operator i.e. we would need two boundary conditions in the examples above. We can write these as follows: \[ BC_i[u,a,b]=0. \] where the \(BC_i\) are linear operators of the functions evaluated at \(x=a,b\).
Examples: \[ BC_1[u,a,b] = \ddy{u}{x}{\Large\vert}_{x=a} = 0,\quad BC_2[u,a,b] =\ddy{u}{x}{\Large\vert}_{x=b} = 0 \] But we could have more elaborate (but linear) choices: \[ BC_1[u,a,b] = \pder{u}{x}{2}{\Large\vert}_{x=a}+ 5u(b,t) = 0,\quad BC_2[u,a,b] = u(a,t)+u(b,t) - 10 = 0 \]
Boundary conditions are referred to as homogeneous if the dependent variable \(u\), or its derivatives, are zero on the boundary (the first case above). Otherwise they are inhomogeneous (the second case).
Hereafter we will just write \(BC_i u\) to indicate the boundary operator \(BC_i\) acting on \(u\), and assume it is understood it is evaluated at the boundaries.
2.1.4 Initial conditions
Finally, this problem needs a start point, the initial condition. That is some value of the function \(u\) at \(t=0\): \[ u(x,0) = g(x). \]
2.1.5 Examples which aren’t the heat kernel
Just a few amongst many:
This section is not examinable, it's just for fun and to give you some insight into how this model is used in practice.
The Bessel equation, for which:
\[ L = x^2 \pder{}{x}{2} + x\ddy{}{x}+(x^2-\nu^2) \] with \(\nu\) some real constant (often an integer but not always). This arises for systems represented in polar or cylindrical coordinates and represents the radial behaviour of those systems, as we saw in the previous chapter. It has been used famously for vibrating membranes (a polar coordinate system) and also the reversed-field pinch magnetic field configuration common in fusion plasma devices, a cylindrical model (see here for a nice introduction).
The Chebyshev operator \[ L = (1 - x^2) \pder{}{x}{2} - x \ddy{}{x} + n^2 \] This operator is associated (as we will see shortly) with the Chebyshev polynomials, a family of orthogonal polynomials widely used in various fields such as numerical analysis, physics, engineering, and machine learning. They are crucial for polynomial approximation, providing optimal solutions by minimizing the maximum error (Chebyshev approximation), and are used in Chebyshev interpolation to reduce errors in data fitting. In differential equations, Chebyshev polynomials enable efficient spectral methods for solving PDEs, while in control theory and signal processing, they contribute to filter design and optimization. They also play a role in root-finding methods, numerical integration, and quantum mechanics, due to their ability to approximate complex functions with high precision.
The surface flux transport model
The following equation:
\[ \ddy{u}{t} = D\nabla^2 u -\nabla\cdot\left({\bf v}u\right) +h({\bf x},t), \] is used to model the line-of-sight component of the Sun’s magnetic field at its surface (as a density \(u\)). This is technically a 2-D variant of our model but it fits the basic structure, and we will see how to tackle 2-D variants later in the chapter; the method is near identical to the 1-D case. A simulation of the field (performed by your first-term lecturer!) is shown in the video below.
The field diffuses due to convective motion at the Sun’s surface. The velocity \({\bf v}\) represents various large scale motions in the solar interior (differential rotation and meridional circulation) moving the field around systematically. Note how it leads to the stretching and shearing of the fields. Finally the source term \(h\) is the key to this model. It represents the emergence of new strong localised magnetic field, “active regions”, which can lead to solar storms. You hopefully all saw the recent aurorae? They were caused by a monster active region emerging and firing giant balls of plasma towards the Earth! The synoptic map in the video represents these observations which are injected into the simulation as they are observed.
The linear reaction-diffusion system is a coupled version of our model, of the form: \[\begin{align} \ddy{u}{t} &= D_1\nabla^2 u + f_1 u + f_2 v,\\ \ddy{v}{t} &= D_2\nabla^2 v + g_1 u + g_2 v, \end{align}\] with \(D_1,D_2,f_1,f_2,g_1,g_2\) constants. This results from a linearisation (assuming \(u\) and \(v\) are small) of the non-linear system: \[\begin{align} \ddy{u}{t} &= \nabla^2 u + f(u,v),\\ \ddy{v}{t} &= \nabla^2 v + g(u,v), \end{align}\] where \(f,g\) are non-linear “reaction” functions (they’re not source terms like \(h\) as they depend on \(u,v\)). This model was used by Alan Turing to provide an explanation for the formation of ordered patterns in nature, e.g. spots and stripes on animals, bacterial colonies and regular neuronal activity. The visual PDE story page about his seminal paper, “The chemical basis of morphogenesis”, has more information.
In this paper, the linear system is used (surprisingly successfully) to predict the number of spots/stripes (colonies) which form in the far more complex non-linear system. A fun example is shown in the video above where the original paper’s front page is given as an initial condition and a spotted pattern develops from it, using an example of Turing’s own system!
2.1.6 A solution
Based on the examples we encountered in the previous chapter, we will jump a number of steps ahead to see why solving this system can be translated into an eigenvalue problem. To get to this answer we will make a good number of unjustified assumptions; the aim of the rest of the chapter is to justify them.
We start by boldly assuming that there is some countably infinite set of functions \(y_n\) (a basis) such that the function \(u(x,t)\) can be written as \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x). \]
We saw in the last chapter that separation of variables leads to this kind of infinite-sum solution. We are just cutting to the chase.
Further we assume it is possible to decompose the function \(h(x,t)\) using this basis: \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t)y_n(x). \]
We also saw in the last chapter that we can use series like the Fourier series and the Fourier-Bessel series to represent fairly arbitrary functions; we are again assuming this here.
If we substitute this into our model we obtain
\[ \sum_{n=0}^{\infty}\ddt{c_n(t)}y_n(x) = \sum_{n=0}^{\infty}c_n(t)L y_n(x) +\sum_{n=0}^{\infty}h_n(t)y_n(x) \] (as \(L\) is an operator in the spatial variable only). Now we make some further assumptions: first, that the functions \(y_n(x)\) satisfy the following eigen-equation: \[ Ly_n(x)=-\lambda_n y_n(x), \] with \(\lambda_n\) a constant. If we assume this then we can bring our sums together into one super-sum:
We saw in the last chapter separation of variables would lead to this kind of problem.
\[ \sum_{n=0}^{\infty}\left[\ddt{c_n(t)} +\lambda_n c_n(t) - h_n(t)\right]y_n(x) = 0 \] Next we assume the \(y_n(x)\) are linearly independent, such that this sum can only be zero if each individual term of the sum is zero. That means we have to solve the following ordinary differential equation for each \(n\): \[ \ddt{c_n(t)} +\lambda_n c_n(t) - h_n(t) = 0 \] Remember, the only unknowns here are the functions \(c_n(t)\). If we can solve this ordinary differential equation for the \(c_n(t)\) then we have our solution to the original partial differential equation! \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x). \] The good news is that we can; it's just a first-order ODE which can be solved by the integrating factor method. The solution is \[ c_n(t) = \mathrm{e}^{-\lambda_n t}\left( c_n(0) +\int_{0}^{t}\mathrm{e}^{\lambda_n t'}\,h_n(t')\,dt' \right). \tag{2.2}\]
The homogeneous solution in Equation 2.2 is: \[ c_n(0)\mathrm{e}^{-\lambda_n t} \] The inhomogeneous solution is: \[ \mathrm{e}^{-\lambda_n t}\int_{0}^{t}\mathrm{e}^{\lambda_n t'}h_n(t')\rmd t' \]
As in the last chapter, the constants of integration \(c_n(0)\) are determined from the initial condition: \[ u(x,0) = \sum_{n=0}^{\infty}c_n(0)y_n(x) = g(x), \] again assuming we can write a function as a sum over the functions \(y_n\), which fixes the values of the constants \(c_n(0)\). But what about the boundary conditions? Well, from linearity and the fact \(L\) is a spatial operator we have, \[ BC_i u =\sum_{n=0}^{\infty}c_n(t)BC_i y_n = 0, \] so the functions \(y_n\) carry the boundary conditions: \[ BC_i y_n =0. \] This is where the linearity is important. Note: we have again assumed linear independence of the functions \(y_n\).
Let’s summarise this
We have proposed that the solution to the problem \[ \ddy{u}{t} = L u + h(x,t), \] with \(n\) boundary conditions \[ BC_{i}[u]=0, \] takes the form: \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x), \] where, the \(y_n\) satisfy the eigen-equation: \[ L y_n =-\lambda_n y_n, \] and each \(y_n\) satisfy the \(n\) boundary conditions: \[ BC_i y_n=0. \] Finally, the temporal functions \(c_n(t)\) take the form \[ c_n(t) = \mathrm{e}^{-\lambda_n t}\left( c_n(0) +\int_{0}^{t}\mathrm{e}^{\lambda_n t'}\,h_n(t')\,dt' \right), \] where \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t)y_n(x). \]
2.1.7 A concrete example via the Fourier series
This has become quite abstract. I keep making assumptions about the functions \(y_n\), their linear independence, and their seeming ability to represent fairly arbitrary functions. Let’s re-visit, in this context, an example we saw in the last chapter to get a handle on it.
If we choose the operators \[ L = \pder{}{x}{2},\quad BC_1[u] = \ddy{u}{x}{\Large\vert}_{x=a}=0,\quad BC_2[u] = \ddy{u}{x}{\Large\vert}_{x=b}=0, \] then the solution requires we solve the eigen-problem: \[ \deriv{y_n}{x}{2} = -\lambda_n y_n,\quad \dds{y_n}{x}{\Large\vert}_{x=a} =0,\quad \dds{y_n}{x}{\Large\vert}_{x=b}=0. \] Its solution is easily seen to be \[ y_n = \cos\left(\frac{n\pi(x-a)}{b-a}\right),\quad \lambda_n = \frac{n^2\pi^2}{(b-a)^2} \] for integers \(n\geq 0\) (any multiplicative constant can be absorbed into \(c_n(t)\)).
The argument in the previous section states that the proposed solution to the PDE \[ \ddy{u}{t} = \pder{u}{x}{2} + h(x,t) \] takes the form \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)\cos\left(\frac{n\pi(x-a)}{b-a}\right). \]
This is just a Fourier series. We know we can represent any reasonable forcing function \(h(x,t)\) using the standard formula \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t) \cos\left(\frac{n\pi(x-a)}{b-a}\right),\quad h_n(t) = \frac{l}{b-a}\int_{a}^{b}h(x,t)\cos\left(\frac{n\pi(x-a)}{b-a}\right)\rmd{x}. \] The same is true of the initial condition: \[ g(x) = \sum_{n=0}^{\infty}c_n(0)\cos\left(\frac{n\pi(x-a)}{b-a}\right),\quad c_n(0) = \frac{l}{b-a}\int_{a}^{b}g(x)\cos\left(\frac{n\pi(x-a)}{b-a}\right)\rmd{x}, \] where \(l=2\) if \(n>0\) or \(l=1\) if \(n=0\). These expansions give us the values required to set the \(c_n(0)\) in Equation 2.2.
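If you want to check these formulae yourself, here is a minimal numerical sketch (assuming `numpy` and `scipy` are available; the test function \(g(x)=3x^2-2x^3\) on \([0,1]\) is an illustrative choice, picked so that \(g'(0)=g'(1)=0\)):

```python
import numpy as np
from scipy.integrate import quad

a, b = 0.0, 1.0
g = lambda x: 3*x**2 - 2*x**3          # illustrative test function with g'(a) = g'(b) = 0

def coeff(n):
    """Cosine-series coefficient c_n(0) from the formula above (l = 2 for n > 0, l = 1 for n = 0)."""
    l = 1.0 if n == 0 else 2.0
    integrand = lambda x: g(x)*np.cos(n*np.pi*(x - a)/(b - a))
    return l/(b - a)*quad(integrand, a, b)[0]

N = 20
c = [coeff(n) for n in range(N)]
x = np.linspace(a, b, 201)
series = sum(c[n]*np.cos(n*np.pi*(x - a)/(b - a)) for n in range(N))
print("max reconstruction error:", np.max(np.abs(series - g(x))))  # small, and shrinks as N grows
```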
The boundary conditions in the problem at hand meant we couldn’t have the \(\sin\) parts of the Fourier series. But that places a restriction on \(g(x)\). We have proposed it can be represented as a sum of cosines, but that means that it must satisfy the boundary conditions itself: \(\dds{g}{x} = 0\) at \(x=a,b\).
This is not a massive problem, and very often it makes physical sense for the forcing (and initial condition) to satisfy the same boundary conditions as \(u\); after all it is trying to force the behaviour of \(u\).
So we can now set all our constants in the equation and we have a complete form for \(c_n(t)\).
A specific forcing For example if we make our lives simple and assume \(a=0,b=1\) and: \[ h(x,t) = \sin(\omega t)\cos\left(3\pi x\right),\quad g(x) = x-1/2. \] Then all \(h_n(t)=0\) except \(h_3(t) =\sin(\omega t)\) and: \[ c_n(0) = 2\int_{0}^{1}(x-1/2)\cos\left(n\pi x\right)\rmd x= \frac{2(\cos(n\pi)-1)}{n^2\pi^2} \] for \(n>0\) and \(c_0(0)=0\).
Thus if \(n\neq 3\) we have \[ c_n(t) = c_n(0)\mathrm{e}^{-n^2\pi^2 t}, \] where the \(c_n(0)\) are as given above, and hence all these modes decay exponentially. In the special case \(n=3\) \[\begin{align} c_3(t) &= \mathrm{e}^{-\lambda_3 t}\left(c_3(0) +\int_{0}^{t}\mathrm{e}^{\lambda_3 t'}\sin(\omega t')\rmd t'\right),\\ &= \mathrm{e}^{-\lambda_3 t}\left(c_3(0) +\left[\frac{\mathrm{e}^{\lambda_3 t'} (\lambda_3 \sin (\omega t')-\omega \cos (\omega t'))}{\omega^2+\lambda_3^2}\right]_0^t\right),\\ &= \mathrm{e}^{-\lambda_3 t}c_3(0) +\frac{\lambda_3 \sin (\omega t)-\omega \cos (\omega t)+\omega\,\mathrm{e}^{-\lambda_3 t}}{\omega^2+\lambda_3^2}, \end{align}\] which looks a little intimidating! The crucial point here is that the homogeneous part (featuring \(c_3(0)\)) decays away exponentially, whilst the inhomogeneous part does not. When \(t\) is large we have \[ c_3(t)\approx \frac{\lambda_3 \sin \left(\omega t\right)-\omega \cos \left(\omega t\right)}{\omega^2+\lambda_3^2}, \] implying the forcing term takes over and causes the \[ \cos\left(3\pi x\right) \] mode to have a sinusoidally varying amplitude (oscillating at the forcing rate \(\omega\)).
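If you want to verify the \(n=3\) formula, here is a minimal sketch (assuming `scipy`; the value \(\omega=2\) is an arbitrary illustrative choice) which integrates the mode ODE \(\mathrm{d}c_3/\mathrm{d}t = -\lambda_3 c_3 + \sin(\omega t)\) directly and compares it with the closed form above:

```python
import numpy as np
from scipy.integrate import solve_ivp

omega = 2.0                                       # illustrative forcing rate
lam3  = 9*np.pi**2                                # lambda_3 for the n = 3 cosine mode
c3_0  = 2*(np.cos(3*np.pi) - 1)/(9*np.pi**2)      # = -4/(9 pi^2), from the formula above

def c3_exact(t):
    """Closed form from Equation 2.2 with h_3(t) = sin(omega t)."""
    return (np.exp(-lam3*t)*c3_0
            + (lam3*np.sin(omega*t) - omega*np.cos(omega*t) + omega*np.exp(-lam3*t))
              /(omega**2 + lam3**2))

# direct numerical integration of dc/dt = -lambda_3 c + sin(omega t)
sol = solve_ivp(lambda t, c: -lam3*c + np.sin(omega*t), (0, 2), [c3_0],
                rtol=1e-10, atol=1e-12, dense_output=True)

t = np.linspace(0, 2, 50)
print(np.max(np.abs(sol.sol(t)[0] - c3_exact(t))))   # should be tiny
```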
Note the initial condition has essentially vanished from the solution as the forcing takes over.
This particular example, which showed our proposed scheme can work, relied on two facts: 1. The orthogonality of the Fourier series, which ensures linear independence of the series representation. 2. That for any function \(f(x)\) we have a formula for the coefficients of the Fourier series: \[ \frac{l}{b-a}\int_{a}^{b}f(x)\cos\left(\frac{n\pi(x-a)}{b-a}\right)\rmd{x}, \] or \[ \frac{l}{b-a}\int_{a}^{b}f(x)\sin\left(\frac{m\pi(x-a)}{b-a}\right)\rmd{x}. \] The aim of what follows is to show when we can obtain the same two properties for the solutions to \[ L y_n = \lambda_n y_n \] for more general linear \(L\).
It's time to try this out for yourselves.
Questions 1-3 of problem sheet 2 will walk you through some similar examples. We will consider them in both the tutorials and class.
2.2 Analysis of the eigenfunction problem.
So, we now focus on the eigenfunction problem: \[ Ly_i(x)=\lambda_i r(x) y_i(x), \] along with some specified boundary conditions on a domain \(x\in[a,b]\), where \(r(x)>0\) is some positive definite weighting function on the specified domain.
The absence of the minus sign, compared to the form of eigenfunction equation used above, is just a matter of convenience. Here it will be more convenient notationally to have it be positive. None of the conclusions we draw are affected either way.
Here \(y_i\) is an eigenfunction with corresponding eigenvalue \(\lambda_i\). As mentioned in the previous chapter, this is analogous to the linear algebra eigenproblem
\[ \mathbf{A}\vec{x}_i=\lambda_i\vec{x}_i \] where \(\mathbf{A}\) is a matrix and \(\vec{x}_i\) an eigenvector with eigenvalue \(\lambda_i\), and, as we shall see this is reflected in the terminology and concepts we use to explore the properties of the eigenfunction equation.
2.2.1 Function spaces
In the same way as linear algebra utilises vector spaces, with linear differential operators we shall think of function spaces. Consider the infinite dimensional space of all reasonably well-behaved functions on the interval \(a\leq x\leq b\).
We recall that for vectors we used the dot product (an inner product) which allowed us to say when vectors were orthogonal and hence independent. For functions we will use an integral version of this:
Definition 2.1 The inner product of a pair of functions \(u(x),v(x)\), which are suitably well behaved on the domain \(x \in [a,b]\), is defined to be the following integral: \[ \langle u, v\rangle=\int_a^bu(x)\overline{v(x)}\;dx. \] Here the overbar denotes complex conjugate.
This is the function equivalent of the vector dot product: for vectors \(\vec{u}\), \(\vec{v}\) we have \[ \langle \vec{u}, \vec{v}\rangle = \vec{u}\cdot\vec{v} \] The analogy is that the integral is a formal infinite sum, and the function space is like a vector space whose dimension is infinite.
In this course, we will only be concerned with real-valued functions. Thus we may drop the overbar for simplicity.
Definition 2.2 A pair of functions \(u(x),v(x)\) are orthogonal on \(x \in [a,b]\) with weight \(r(x)\) if their weighted inner product is zero: \[ \bigg\langle r(x)u,v\bigg\rangle=\bigg\langle u, r(x)v\bigg\rangle=\int_a^b r(x) u(x)v(x)\,dx =0. \]
which of course is the equivalent of vector orthogonality \(\langle \vec{u}, \vec{v}\rangle = \vec{u}\cdot\vec{v}=0\).
The Fourier modes \(\sin\left(\frac{n\pi(x-a)}{b-a}\right)\) and \(\sin\left(\frac{m\pi(x-a)}{b-a}\right)\) are orthogonal with weight \(r(x)=1\) if \(n\neq m\), i.e.: \[ \left<\sin\left(\frac{n\pi(x-a)}{b-a}\right),\sin\left(\frac{m\pi(x-a)}{b-a}\right) \right>=0, \quad \mbox{ if } n\neq m. \]
We saw in the previous chapter the Bessel functions also had an orthogonality: \[ \int_0^{L_r} x\, J_n\!\left(\frac{j_{n,k}x}{L_r}\right) J_n\!\left(\frac{j_{n,m}x}{L_r}\right)\,dx = \frac{L_r^2}{2}\,\big[J_{n+1}(j_{n,k})\big]^2\,\delta_{km}. \] Here the weighting function is \(r(x) =x\): \[ \left< J_n\!\left(\frac{j_{n,k}x}{L_r}\right), x\, J_n\!\left(\frac{j_{n,m}x}{L_r}\right)\right> =\frac{L_r^2}{2}\,\big[J_{n+1}(j_{n,k})\big]^2\,\delta_{km}. \] A common notation used in the literature is to have a weighted norm \(\langle u,r(x) v\rangle =\langle u, v\rangle_{r}\), but I found last year this got a bit confusing so we will write the weighting explicitly in the inner product where one is used (\(\langle u,r(x) v\rangle\) not \(\langle u,v\rangle_{r}\)).
It is important that this weighting is a positive definite function on the domain interior. This ensures that the inner product of a function with itself is positive definite, so we can define the notion of the “length” of a function: \[ \langle u,r(x)u\rangle = ||u||^2>0 \quad \mbox{unless } u=0. \] This is the weighted norm of the function \(u\).
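These orthogonality relations and weighted norms are easy to check numerically. A minimal sketch (assuming `scipy` is available), verifying the weighted Bessel relation above with \(L_r=1\):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv, jn_zeros

Lr = 1.0
n = 0
jz = jn_zeros(n, 3)                    # first three zeros j_{0,1}, j_{0,2}, j_{0,3}

def weighted_ip(k, m):
    """<J_n(j_{n,k} x/Lr), x J_n(j_{n,m} x/Lr)> with weight r(x) = x on [0, Lr]."""
    f = lambda x: x*jv(n, jz[k]*x/Lr)*jv(n, jz[m]*x/Lr)
    return quad(f, 0.0, Lr)[0]

print(weighted_ip(0, 1))                               # ~ 0: different roots are orthogonal
print(weighted_ip(0, 0), 0.5*Lr**2*jv(n+1, jz[0])**2)  # weighted norm matches the formula
```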
We saw in the last chapter that a common source of such weightings is the domain geometry (in fact the \(h_i\) factors) you encountered in the first half of the course.
Questions 4 and 5 of problem sheet 2 concern function orthogonality.
2.2.2 Adjoint operators
Crucial to almost everything that follows is the notion of the adjoint of an operator. I always find it seems a bit arbitrary when first written down, so first, an analogy to the adjoint matrix in the finite-dimensional case.
The adjoint matrix For a complex matrix \(A=[a_{ij}]\), the adjoint matrix \(A^\dagger\) (also called the Hermitian transpose) is defined by \[ (A^\dagger)_{ij}=\overline{a_{ji}}. \] That is, \(A^\dagger\) is obtained by taking the transpose of \(A\) and then taking the complex conjugate of each entry. It satisfies: \[ \langle A \vec{u},\vec{v}\rangle = \langle \vec{u},A^{\dagger}\vec{v}\rangle \] or more familiarly: \[ A \vec{u}\cdot \vec{v} = \vec{u}\cdot A^{\dagger}\vec{v}. \]
For a matrix equation
\[
A\vec{x}=\vec{b},
\] we know from linear algebra that a solution exists if and only if \[
\vec{b}\perp\ker(A^\dagger),
\] that is, \(\vec{b}\) is orthogonal to all vectors \(\vec{y}\) satisfying \(A^\dagger\vec{y}=0\). This is the Fredholm alternative for matrices, you likely didn’t hear it called that before, but it is the name of a theorem we will encounter soon, which is a VAST generalisation of this idea.
The adjoint matrix determines whether a problem is solvable or not. (A related object, the classical adjoint or adjugate \(\mathrm{adj}(A)\), the transpose of the cofactor matrix, determines the inverse via \(A^{-1} =\mathrm{adj}(A)/\det(A)\), which we can use to solve \(A\vec{x}=\vec{b}\).)
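A tiny numerical illustration of this matrix version of the alternative (assuming `numpy`; the singular matrix and the vectors are illustrative choices):

```python
import numpy as np

# A singular real matrix: its adjoint (here just the transpose) has a non-trivial kernel
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
w = np.array([2.0, -1.0])              # satisfies A.T @ w = 0
print(A.T @ w)                         # [0, 0]

b_good = np.array([1.0, 2.0])          # b . w = 0  -> solvable (non-uniquely)
b_bad  = np.array([1.0, 0.0])          # b . w != 0 -> no solution
print(b_good @ w, b_bad @ w)

# least-squares residuals confirm the alternative
for b in (b_good, b_bad):
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.linalg.norm(A @ x - b))   # ~0 for b_good, > 0 for b_bad
```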
Now, the operator version
Definition 2.3 For an operator \(L\) with boundary conditions BC, and an associated inner product \(\langle \cdot,\cdot \rangle\), the adjoint operator \((L^*, BC^*)\) is defined by the inner product relation: \[ \langle Ly, w \rangle=\langle y, L^*w \rangle. \]
To determine the adjoint, one needs to move the derivatives of the operator from \(y\) to \(w\), and define adjoint boundary conditions so that all boundary terms vanish.
As an example, take \(L = \deriv{}{x}{2}\) on \([a,b]\) with boundary conditions \(y(a)=0\) and \(y'(b)=3y(b)\). We seek a function \(L^*w\) such that: \[ \int_a^b (w)(y'')\mathrm{d}x = \int_a^b (y)(L^*w)\mathrm{d}x. \] To do this, we need to shift the derivatives from \(y\) to \(w\) using integration by parts:
\[\begin{align} \int_a^b wy''\mathrm{d}x & = wy' \big\vert_a^b - \int_a^b w'y'\mathrm{d}x\\ & = \left(wy' - w'y\right)\big\vert_a^b +\int_a^b y w'' \mathrm{d}x. \end{align}\]
A comparison of the integral parts suggests: \[ L^*w = \frac{\mathrm{d}^2w}{\mathrm{d}x^2} \Rightarrow L^* = \deriv{}{x}{2}. \] The inner product only includes integral terms, so the boundary terms must vanish, which will define boundary conditions on \(w\), i.e. this defines BC*. Here, we require \[ w(b)y'(b) - w'(b)y(b) - w(a)y'(a) +w'(a)y(a)=0. \] Using the BC’s \(y'(b) = 3y(b)\) and \(y(a) = 0\), gives: \[ 0 = y(b)\Big(3w(b) - w'(b)\Big) - w(a)y'(a) +\underbrace{w'(a)y(a)}_{= 0} \] As these terms need to vanish for all values of \(y(b)\) and \(y'(a)\), we can infer two boundary conditions on \(w\):
- \(y(b)\): \(3w(b)-w'(b) = 0\)
- \(y'(a)\): \(w(a) = 0\)
So, all told, the answer to our problem is: \(L^* = \deriv{}{x}{2}\), with adjoint boundary conditions \(BC^*\): \(w(a)=0\) and \(3w(b)-w'(b)=0\).
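As a sanity check of the adjoint relation \(\langle Ly,w\rangle = \langle y,L^*w\rangle\) for this example, here is a minimal sketch (assuming `scipy`, taking \([a,b]=[0,1]\), and using hand-picked polynomials that satisfy the respective boundary conditions; these are illustrative choices):

```python
import numpy as np
from scipy.integrate import quad

# Take [a, b] = [0, 1].  y satisfies y(0) = 0, y'(1) = 3 y(1);
# w satisfies the adjoint BCs w(0) = 0, 3 w(1) - w'(1) = 0.
y   = lambda x: x - 2*x**2
ypp = lambda x: -4.0 + 0*x                 # y''
w   = lambda x: x - 2*x**2 + x**3
wpp = lambda x: -4.0 + 6*x                 # w''

lhs = quad(lambda x: ypp(x)*w(x), 0, 1)[0]   # <Ly, w>
rhs = quad(lambda x: y(x)*wpp(x), 0, 1)[0]   # <y, L*w>
print(lhs, rhs)                              # both -1/3: the boundary terms cancel
```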
If \(L = L^*\) and \(BC=BC^*\) then the problem is self-adjoint. If \(L=L^*\) but \(BC\neq BC^*\) we still call the operator self-adjoint.
Some books use the terminology formally self-adjoint if \(L=L^*\) and fully self-adjoint if both \(L = L^*\) and \(BC=BC^*\).
Questions 6-8 of problem sheet 2 will get you familiar with calculating adjoint operators.
2.3 Eigenfunction Properties
As discussed in section 2.1, we aim to construct solutions to (linear) partial differential equations as a linear combination of eigenfunctions. There are two fundamental properties of eigenfunctions that will be vital to this approach; both rely on the adjoint operator.
Proposition 2.1 Eigenfunctions of the (weighted) adjoint problem have the same eigenvalues as the original problem.
That is, \(Ly=\lambda r(x) y\Rightarrow \exists w\) s.t \(L^*w=\lambda r(x) w.\)
Proof. From the definition of the inner product: \[ \langle Ly, w \rangle=\langle \lambda r(x) y, w \rangle = \langle y, \lambda r(x) w \rangle \] but also from the adjoint definition. \[ \langle Ly, w \rangle =\langle y, L^*w \rangle \] Comparing these two equalities we find: \[ \langle y, L^*w \rangle= \langle y, \lambda r(x) w \rangle. \] So \(L^*w =r(x)\lambda w\).
Proposition 2.2 Eigenfunctions corresponding to different eigenvalues are orthogonal to each other's adjoint eigenfunctions (under a weighted inner product).
That is, if \(Ly_j = \lambda_j r(x) y_j\) (so \(L^*w_j = \lambda_j r(x)w_j\)) and \(Ly_k= \lambda_k r(x) y_k\) (\(L^*w_k = \lambda_k r(x)w_k\)), then for \(\lambda_j \neq \lambda_k\), \(\langle y_j, r(x) w_k \rangle = 0\).
Proof. \[\begin{align} \lambda_j \langle y_j, r(x)w_k \rangle& = \langle \lambda_j r(x) y_j, w_k \rangle\\ & = \langle Ly_j, w_k \rangle\\ & = \langle y_j, L^*w_k \rangle\\ & = \langle y_j, \lambda_k r(x) w_k \rangle\\ & = \lambda _k \langle y_j,r(x) w_k \rangle. \end{align}\]
But \(\lambda_j \neq \lambda_k\) so \(\langle y_j, r(x) w_k \rangle = 0\).
The proof here is exactly as for matrix problems where the inner product represents the dot product of vectors.
We have the first of our requirements, orthogonality! But it's odd…
We were expecting, as seen in the Fourier series case, that we should find orthogonality of the eigenfunctions themselves: \[ \langle y_j,r(x)y_k \rangle=0, \] but we seem to need the adjoint? Well, what if, like the heat kernel with homogeneous boundary conditions, the operator is self-adjoint? Then we have shown that:
For self-adjoint operators, eigenfunctions corresponding to different eigenvalues are orthogonal under the problem's weighting \(r(x)\): \[ \langle y_j,r(x)y_k \rangle=0 \quad \mbox{if } j\neq k. \]
Do they exist? We should be careful with these proofs. They still hold if the \(w_k\) are trivial (zero functions); in this case the statements become meaningless but technically correct. We haven't shown (and can't generally show) that the \(w_k\) (or indeed the \(y_k\)) exist non-trivially. We will pursue this issue later in section 2.5. For now it suffices to say there is a large class of important problems (choices of \(L\)) for which both \(y_k\) and \(w_k\) exist non-trivially and hence this orthogonality is meaningful (it will allow us to solve our PDE problem).
2.3.1 Inhomogeneous problems, representing a function in terms of an operator.
We are in a position to outline the construction of a solution to the BVP \[Lu=f(x)\] with linear, homogeneous, separated boundary conditions, denoted \(BC_1(a)=0\), \(BC_2(b)=0\).
But first, why do we care? There are two reasons.
If say \(L = \deriv{}{x}{2}\) then this is Poisson’s equation. We recall in the PDE classification that elliptic equations often arise as parabolic equations with the time derivatives set to zero. Our proposed model problem reduces to this form if we set \(h(x,t)\equiv -f(x)\) and \(\ddy{u}{t}=0\). This is an example of a steady state or equilibrium, a state to which the system should eventually settle.
This actually addresses the issue of representing the function \(h(x,t)\) which we assumed earlier. We see this via the following steps.
(For simplicity we take the weight \(r(x)=1\) in this argument.) \[\begin{align} &\sum_{n=0}^{\infty}h_n(t)y_n(x) = h(x,t), & \\ &\sum_{n=0}^{\infty}h_n(t)\frac{L y_n(x)}{\lambda_n} = h(x,t), & Ly_n = \lambda_n y_n,\\ &L \sum_{n=0}^{\infty}h_n(t)\frac{y_n(x)}{\lambda_n} = h(x,t), & \mbox{linearity of }L. \end{align}\]
If we fix \(t = t_e\) and write \[ u = \sum_{n=0}^{\infty}h_n(t_e)\frac{y_n(x)}{\lambda_n},\quad f = h(x,t_e). \] then we have an equation in the form \(L u = f\)!
Note the potential issue here if \(\lambda_n=0\), we will come back to this later when we discuss the Fredholm alternative.
Seeking to solve the equation \(Lu =f\) is to answer the question: can we represent a (fairly arbitrary) function as an infinite sum of the eigenfunctions of an operator \(L\)? This was a fundamental assumption in the solution of our target problem laid out in the first section of this chapter (could we decompose the forcing function \(h(x,t)\)?).
To see we can solve the problem we proceed via the following steps.
- Solve the eigenvalue problem \[Ly=\lambda y,\quad BC_1(a)=0,\;BC_2(b)=0\] to obtain the eigenvalue-eigenfunction pairs \((\lambda_j, y_j)\).
- Solve the adjoint eigenvalue problem \[L^*w=\lambda w,\quad BC_1^*(a)=0,\;BC_2^*(b)=0\] to obtain \((\lambda_j, w_j)\).
- Assume a solution to the full system \(Ly=f(x)\) of the form \[y=\sum_ic_iy_i(x).\] To determine the coefficients \(c_i\), start from \(Ly=f\) and take an inner product with \(w_k\):
\[ \begin{split} Ly&=f(x)\\ \Rightarrow\langle Ly, w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\langle y, L^*w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\langle y, r(x)\lambda_k w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\lambda_k\langle \sum_ic_iy_i, r(x)w_k \rangle &= \langle f, w_k \rangle\\ \Rightarrow\lambda_kc_k\langle y_k, r(x) w_k \rangle &= \langle f, w_k \rangle \end{split} \tag{2.3}\] - We can solve the last equality for the \(c_k\), and we are done!
We have the second of our requirements, a coefficient formula!
Note that in the last step we have used convergence of \(y\) to switch the order of summation and integration, and used the orthogonality property \(\langle y_j, r(x) w_k\rangle=0,\;\;j\neq k\).
If the problem is fully self-adjoint then \(y_k=w_k\). We can then go as far as to say our eigenfunctions \(y_k\) themselves are orthogonal, and
\[ \lambda_k c_k\langle y_k, r(x) y_k \rangle = \langle f, y_k \rangle. \] This was the recipe used in our example in section 2.1.7, where \[ \langle f, y_k \rangle = \int_{a}^{b}\cos\left(\frac{k\pi (x-a)}{b-a}\right)f(x)\, d x \] and \[ \langle y_k, y_k \rangle = \int_{a}^{b}\cos^2\left(\frac{k\pi (x-a)}{b-a}\right) d x = \frac{b-a}{2}\quad (k>0). \] But generically we need to use the adjoint identity.
Flex your eigen-muscles with an example of both these types (self-adjoint and not) in questions 9-13 of problem sheet 2.
2.3.2 Inhomogeneous boundary conditions
We now consider the general case of an inhomogeneous system with inhomogeneous boundary conditions. As an example of why we might care, recall that the Fourier-Bessel series was only valid for functions which were zero on the boundary of the disc. What if we (quite reasonably) want to represent a function which is not?
The general problem takes the form: \[ L u =f(x),\quad B_iu =\gamma_i, \quad i= 1,2,\dots, n \tag{2.4}\]
We can deal with this by exploiting the linearity of the system to split it into a problem with homogeneous boundary conditions
\[ Lu_1=f(x),\quad B_iu_1=0 \tag{2.5}\] and a simpler problem carrying the inhomogeneity \[ Lu_2=0,\quad B_iu_2=\gamma_i. \tag{2.6}\]
Here, solving for \(u_1(x)\) has the difficulty of the forcing function but with zero BC’s while the other equation is homogeneous but has the non-zero BC’s. Due to linearity, it is easy to see that \(u(x)=u_1(x)+u_2(x)\) solves the full system Equation 2.4
This decomposition can always be performed (although, as we shall see later, it requires caution if there is a zero eigenvalue \(\lambda=0\)), and since solving Equation 2.6 tends to be an easier matter (for linear systems!), it is safe for us to primarily focus on the technique of solving the system Equation 2.5, i.e. homogeneous boundary conditions.
As an example, consider \(v''=f(x)\) with \(v(0)=\alpha\), \(v(1)=\beta\).

- Solve two systems separately: \[\begin{eqnarray*} &&y''=f(x), \quad y(0)=y(1)=0\\ &&u''=0,\quad u(0)=\alpha, \;u(1)=\beta \end{eqnarray*}\]
- To solve for \(y\): since the BCs are zero we have seen this before. Adopting the minus-sign convention for the eigenproblem, \(Ly_k = -\lambda_k y_k\), you should confirm that \(y_k=\sin(k\pi x)\) and \(\lambda_k = k^2\pi^2\). Then we jump straight to the formula \[ c_k=-\frac{\left< f, w_k\right>}{\lambda_k\left< y_k, w_k \right>}=-\frac{2\int_0^1f(x)\sin(k\pi x)\;dx}{k^2\pi^2} .\]
- The solution for \(u\) is easily obtained as \[ u=(\beta-\alpha)x+\alpha \]
- The full solution is \(y(x)+u(x)\) (a numerical sketch of this recipe is given below).
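Here is a minimal numerical sketch of this splitting recipe (assuming `scipy`; the forcing \(f(x)=1\) and boundary values \(\alpha=1/2\), \(\beta=-1\) are illustrative choices with a known exact solution):

```python
import numpy as np
from scipy.integrate import quad

alpha, beta = 0.5, -1.0                      # illustrative boundary values
f = lambda s: 1.0 + 0*s                      # illustrative forcing f(x) = 1

x = np.linspace(0, 1, 201)

# forced part with zero BCs: y = sum_k c_k sin(k pi x), c_k from the formula above
y1 = np.zeros_like(x)
for k in range(1, 40):
    fk = 2*quad(lambda s: f(s)*np.sin(k*np.pi*s), 0, 1, limit=100)[0]
    y1 += -fk/(k**2*np.pi**2)*np.sin(k*np.pi*x)

u2 = (beta - alpha)*x + alpha                # homogeneous equation, carries the BCs

exact = 0.5*x**2 + (beta - alpha - 0.5)*x + alpha   # exact solution of v''=1, v(0)=alpha, v(1)=beta
print("max error:", np.max(np.abs(y1 + u2 - exact)))
```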
Question 9 is inhomogeneous.
We next look at a particular class of operator – Sturm-Liouville operators – that occur quite commonly and have very useful properties.
2.4 Sturm-Liouville theory
Wouldn’t it be great if the operators we dealt with were self-adjoint? That would mean only solving one eigenvalue problem. Well, Sturm-Liouville (SL) theory concerns a class of self-adjoint second-order operators and a weighted eigenvalue problem. We will show that lots of non-self-adjoint operators can actually be converted to this form, and further that it has a lot of the properties we are seeking.
The weighted Sturm-Liouville eigenvalue problem takes the form: \[ Ly = \lambda r(x) y \] where \(r(x) \geqslant 0\) is a weighting function, and the operator \(L\) is of the form \[ Ly=\frac{\mathrm{d}}{\mathrm{d}x}\left(p(x) \frac{\mathrm{d}y}{\mathrm{d}x} \right) + q(x)y,\;\;a\leq x\leq b \] The functions \(p,q\), and \(r\) are all assumed to be real. It is easy to check that the operator is formally self-adjoint. It is fully self-adjoint if the boundary conditions take the separated form
\[\begin{eqnarray*} \alpha_1 y(a) +\alpha_2 y'(a) & = & 0\\ \alpha_3 y(b) +\alpha_4 y'(b) & = & 0. \end{eqnarray*}\]
Question 7 actually asks you to do this; in fact we have already seen this in your problems class!
Observe also that if \(p(a)=p(b)=0\), then \(\langle Ly,w\rangle=\langle y,Lw\rangle\) irrespective of boundary conditions. This defines the so-called natural interval \([a,b]\) for the problem.
The family of equations fitting this description includes some of the most important equations in applied mathematics and physics. As we shall see they have some very nice properties, which is why we focus on them in particular.
2.4.1 Inhomogeneous SL problems
Since a SL operator is self-adjoint, the eigenfunction expansion process is quite straightforward. Consider \(Ly=f(x)\) with homogeneous BC’s. The system can be solved with an eigenfunction expansion in the same manner as in the previous section:
\[ \begin{split} Ly&=f(x)\\ \Rightarrow\langle Ly, y_k \rangle &= \langle f, y_k \rangle\\ \Rightarrow\langle y, Ly_k \rangle &= \langle f, y_k \rangle\quad\text{ (since } L^*=L, w_k=y_k)\\ \Rightarrow\langle y, \lambda_k r y_k \rangle &= \langle f, y_k \rangle\\ \Rightarrow\lambda_kc_k\langle y_k, r y_k \rangle &= \langle f, y_k \rangle. \end{split} \]
Thus we obtain the formula \[ c_k=\frac{\langle f, y_k \rangle}{\lambda_k\langle y_k, r y_k \rangle} \] and the full solution is given by \[y=\sum_{k} c_ky_k.\]
2.4.2 Transforming an operator to SL form
Many problems encountered in physical systems are Sturm-Liouville. In fact, though, any operator \[Ly\equiv a_2(x)y''(x)+a_1(x)y'(x)+a_0(x)y(x)\] with \(a_2(x)\neq0\) in the interval can be converted to a SL operator.
To transform to a self-adjoint SL operator, multiply by an integrating factor function \(\mu(x)\): \[\mu a_2(x) y''(x)+\mu a_1 y'(x)+\mu a_0 y\] We then choose \(\mu\) so that the first and second derivatives collapse, i.e. so it can be expressed in the form
\[
\frac{d}{dx}(p y')+ q y = py''+p'y' +qy = \mu a_2(x) y''(x)+\mu a_1 y'(x)+\mu a_0 y.
\]
and we can solve for \(\mu,p\) and \(q\) by equating the coefficients of \(y'',y',y\) and then solving the ensuing ODE’s.
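Equating those coefficients gives \(p=\mu a_2\), \(p'=\mu a_1\) and \(q=\mu a_0\), so \(p'/p = a_1/a_2\) and hence \(p = \exp\left(\int a_1/a_2\,\rmd x\right)\). A short symbolic sketch of this recipe (assuming `sympy`; the Hermite-type operator \(Ly = y'' - 2xy'\) is purely an illustrative choice):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# illustrative (non-SL) operator: L y = a2 y'' + a1 y' + a0 y = y'' - 2 x y'
a2, a1, a0 = sp.Integer(1), -2*x, sp.Integer(0)

p  = sp.exp(sp.integrate(a1/a2, x))        # p  = exp( int a1/a2 dx ) = exp(-x^2)
mu = sp.simplify(p/a2)                     # integrating factor mu = p/a2
q  = sp.simplify(mu*a0)

# check that mu*(L y) collapses to (p y')' + q y
lhs = sp.expand(mu*(a2*y(x).diff(x, 2) + a1*y(x).diff(x) + a0*y(x)))
rhs = sp.expand(sp.diff(p*y(x).diff(x), x) + q*y(x))
print(p)                                   # exp(-x**2)
print(sp.simplify(lhs - rhs))              # 0: the operator is now in SL form
```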
Question 12 of problem sheet 2 will walk you through this process. Once you have understood it, try all the examples in questions 14, 15 and 17.
Suppose we are considering the problem \[ Ly=f(x) \] where \(L\) is not Sturm-Liouville. We could solve following the more general approach (involving finding the adjoint functions \(w_k\)). Alternatively we could convert to Sturm-Liouville form first, and then proceed using the nice properties of a self-adjoint operator. So, is the problem self-adjoint or isn't it? The key observation is that we are no longer solving the same problem. We have transformed to a new operator \[\hat{L}y=\frac{d}{dx}(p y')+q y\] (we had to multiply by the integrating factor \(\mu\)) which does not satisfy the same equation as the original, that is \(Ly=f\) while \(\hat{L}y=\mu f\). They are both valid, and must ultimately lead to the same answer in the end.
Question 13 of problem sheet 2 is more lengthy and asks you to solve a non-self-adjoint eigenproblem by the two routes: 1. Directly via the adjoint problem 2. Via conversion to Sturm-Liouville form and exploiting the self-adjoint property. You end that question by showing they give the same answer in the end, i.e. you get the same \(y\).
2.4.3 Further properties of the Sturm-Liouville operator:
Orthogonality
Due to the presence of the weighting function, the orthogonality relation is \[ \int_a^b y_k(x) y_j(x) r(x) \mathrm{d} x = 0,\quad j\neq k. \]
This follows from proposition 2.2 and the self-adjoint property.
Real Eigenvalues
The functions \(p,q,r\) are real, so \(\overline{L}=L\). Thus, taking the conjugate of both sides of \(Ly_k=\lambda_k r y_k\) gives
\[ \begin{split} &L\;\overline{y_k} = \overline{\lambda_k}\; r\;\overline{y_k}\\ \Rightarrow\;\; &\left< y_k, L\,\overline{y_k}\right>=\overline{\lambda_k}\left< y_k,\,r\overline{y_k}\right>\\ &\text{but }\left< y_k, L\,\overline{y_k}\right>=\left< Ly_k,\;\overline{y_k}\right>=\lambda_k\left< ry_k,\;\overline{y_k}\right>=\lambda_k\left< y_k,\,r\overline{y_k}\right>\\ \;\;& \\ \quad\quad\Rightarrow\;\;&\overline{\lambda_k}=\lambda_k \end{split} \]
Thus all the eigenvalues are real.
Moreover, if \(a\le x\le b\) is a finite domain, then the \(\lambda_k\) are discrete and countable, with \(\lim_{k\to\infty} \lambda_k =\infty\).
Eigenfunctions
The \(\{y_k\}\) are a complete set, that is all \(h(x)\) with \(\int h^2 r\,dx <\infty\) can be expanded as \[h(x)=\sum c_k y_k(x).\] Take an inner product with \(r(x) y_j(x)\): \[ \left< ry_j, h \right> = \left< ry_j, \sum c_k y_k \right> = \sum c_k \left< r y_j, y_k\right> = c_j \left< r y_j, y_j \right> \] \[\Rightarrow c_j=\frac{\int_a^b h(x) y_j(x) r(x)\;dx}{\int_a^b y_j^2(x) r(x)\;dx}\] Note: I’ve used \(h(x)\) to make clear that we’re not talking about the solution to the BVP, rather we are expanding any function that is suitably bounded on the same domain.
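As a concrete illustration of this coefficient formula with a non-trivial weight, here is a minimal sketch (assuming `scipy`; the function \(h(x)=e^x\) is an illustrative choice) which expands \(h\) in the Chebyshev polynomials \(T_j(x)=\cos(j\arccos x)\). These are eigenfunctions of the Chebyshev operator from section 2.1.5 once it is put in SL form, with weight \(r(x)=1/\sqrt{1-x^2}\); the substitution \(x=\cos\theta\) removes the weight's endpoint singularities:

```python
import numpy as np
from scipy.integrate import quad

T = lambda j, x: np.cos(j*np.arccos(x))        # Chebyshev polynomial T_j on [-1, 1]
h = lambda x: np.exp(x)                        # illustrative function to expand

def coeff(j):
    """c_j = <h, r T_j>/<T_j, r T_j>, with x = cos(theta) absorbing the weight 1/sqrt(1-x^2)."""
    num = quad(lambda th: h(np.cos(th))*np.cos(j*th), 0, np.pi)[0]
    den = np.pi if j == 0 else np.pi/2         # weighted norm of T_j
    return num/den

N = 12
c = [coeff(j) for j in range(N)]
x = np.linspace(-0.99, 0.99, 101)
approx = sum(c[j]*T(j, x) for j in range(N))
print("max error:", np.max(np.abs(approx - h(x))))   # tiny: the expansion converges very fast
```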
We have shown that any Sturm-Liouville problem comes equipped with a basis of orthogonal functions, which can represent any reasonable function on the domain on which it was defined. We also have a coefficient formula to construct this representation! This is exactly what we set out to find for our target PDE model.
2.5 Existence of the eigen-expansion via the Fredholm alternative.
So far we have shown that, if we can solve the eigen-equation \[ L y_n = \lambda_n y_n, \] and indeed the problem \[ L u = f, \] subject to some appropriate boundary conditions, then we can proceed to construct that solution using orthogonality and the coefficient formulae. However, all along we have made the assumption that we can solve these equations, and that is not always the case.
So can we know in the first place if it can be solved? This is a hard question as it depends on the boundary conditions, and sometimes one cannot easily determine if it has a solution by direct construction (actually solving the damn thing!). But there is a tool which can answer it: the Fredholm alternative, named after Ivar Fredholm. It's one of the great theorems in linear operator theory (which is a vast subject area).
2.5.1 The alternative statement
Proposition 2.3 The Fredholm alternative holds that:
- Either, the homogeneous adjoint problem \(L^* w =0\) has a non-trivial solution.
- Or, The inhomogeneous boundary value problem \(L y = f(x)\), with relevant boundary conditions, has a unique solution.
Either 1 or 2 is true, but never both simultaneously.
Proposition 2.4 If we are in case 1 (the homogeneous adjoint problem has a non-trivial solution), then there is a further alternative!
Either
- \[ \langle w,f\rangle = 0 \tag{2.7}\] and there is a non-unique solution to the problem \(L y = f\), with relevant boundary conditions.
- Or, this integral does not vanish and there is no solution to the problem \(L y = f(x)\), with the relevant boundary conditions.
2.5.2 The “rough” explanation for why the alternatives hold
Consider the adjoint identity \[ \left<L u,w\right> = \left<u, L^{*} w\right> \] If \(L^{*}w=0\) then \(\left<L u,w\right>=0\). If \(w\) is non-trivial, i.e. branch 1 of our first alternative, this seems to imply \(L u=0\), in which case we could not solve \(L u = f(x)\) for non-trivial \(f(x)\). But of course if \(\left<L u,w\right>=\left<f(x),w\right> = 0\) for some non-trivial \(w\), then \(L u\) doesn’t have to be zero. This leads to the condition Equation 2.7; when it holds the solution is only determined up to the addition of solutions of the homogeneous problem (due to linearity), hence the non-uniqueness.
On the other hand if \(w\) were originally trivial then there is no constraint on \(L u\) as \(<L u,0>=0\) whatever \(L u\) is and no reason it cannot solve \(Lu = f(x)\).
That this condition, \(L^* w =0\) only if \(w\) is trivial, implies we can always solve \(L u =f\) and that the solution is unique is somewhat tougher to show (it relies on the relationship between the kernel of \(L^*\) and the range of \(L\)).
The rest of the proof is beyond the scope of this course.
2.5.3 An example
Consider the following inhomogeneous BVP: \[ \deriv{y}{x}{2} + g_0^2 y = \sin(x). \tag{2.8}\] with \(g_0\) constant, and with the boundary conditions: \[ y(0)=0 , \quad y(\pi)=0. \tag{2.9}\]
Show this is a fully self-adjoint operator
Step 1: solve the adjoint problem. We consider the homogeneous problem \(L^*w=0\): \[ \deriv{w}{x}{2} + g_0^2 w = 0,\quad w(0)=0 , \quad w(\pi)=0, \] with \(g_0>0\). The general solution is \[ w(x) = A\sin(g_0 x) + B\cos(g_0 x). \] The boundary condition \(w(0)=0\) implies \(B=0\). The second condition: \[ A \sin(g_0 \pi)=0 \] requires either \(A=0\) (the trivial case) or \(g_0 = 1,2,3,\dots\), i.e. \(g_0\) an integer. So..
If \(g_0\) is not an integer we are in branch 2 of alternative statement 1. Thus there is a unique solution to Equation 2.8 with boundary conditions Equation 2.9. If \(g_0\) is an integer then there is (potentially) non-unique solution to this problem. That rests on an application of the second alternative.
Let’s assume \(g_0\) is an integer. We now have to consider the second alternative! We calculate the integral \[ \langle w,\sin(x)\rangle = \int_{0}^{\pi}\sin(g_0 x)\sin(x)\rmd x = \frac{\sin(g_0\pi)}{1-g_0^2},\quad g_0\neq\pm 1. \] So if \(g_0 \neq \pm 1\) then this is zero (remembering we are assuming \(g_0\) is an integer), which would imply we can solve the problem. If \(g_0=\pm1\) the integral is \(\int_0^\pi \sin^2(x)\,\rmd x = \pi/2\neq 0\), which implies we cannot solve the problem at all.
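A quick numerical check of this solvability integral (a minimal sketch assuming `scipy`):

```python
import numpy as np
from scipy.integrate import quad

def solvability(g0):
    """<w, sin x> = int_0^pi sin(g0 x) sin(x) dx for the adjoint solution w = sin(g0 x)."""
    return quad(lambda x: np.sin(g0*x)*np.sin(x), 0, np.pi)[0]

for g0 in (2, 3, 5):
    print(g0, solvability(g0))      # ~ 0: solvable (non-uniquely)
print(1, solvability(1))            # pi/2 != 0: no solution
```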
Let’s test this:
To find the general solution to : \[ \deriv{y}{x}{2} + g_0^2 y = \sin(x). \] We first seek the complementary solution \(y_c\) which solves: \[ \deriv{y_c}{x}{2} + g_0^2 y_c = 0. \] We did this above for the adjoint \(w\) (as this is self adjoint). Thus \[ y_c(x) = A\sin(g_0 x)+ B\cos(g_0 x) \]
We don’t yet apply the boundary conditions, we need the particular solution first.
For the particular solution we seek a solution in the form \(y_p= a_1 \sin(x) + a_2\cos(x)\), which yields
\[ \left(a_1 \left(g_0^2-1\right)-1\right) \sin (x)+a_2 \left(g_0^2-1\right) \cos (x) =0 \tag{2.10}\]
If \(g_0\neq \pm1\) then we set \(a_2=0\) and \(a_1 = 1/(g_0^2-1)\) and we have a full solution to the problem: \[ y(x) = A\sin(g_0 x)+ B\cos(g_0 x) + \frac{1}{g_0^2-1}\sin(x) \]
Confirm this satisfies the boundary conditions if \(g_0\) is an integer (and not equal to \(1\) or \(-1\)), for any \(A\) (with \(B=0\)). If \(g_0\) is not an integer, confirm you need \(A=B=0\), thus uniquely determining the solution.
Thus we have, as Fredholm told us to expect, a solution which is unique if \(g_0\) is not an integer, and non-unique if it is an integer, due to the constant \(A\) being free.
But what if \(g_0=1\)? It is clear Equation 2.10 cannot be satisfied. I hope, however, you remember that in such cases we should try a solution in the form: \[ y_p= a_1 x\sin(x) + a_2 x\cos(x), \] as the right-hand side \(\sin(x)\) is a solution of the homogeneous problem, an issue that arises due to the operator having a zero eigenvalue.
Substituting this in we find: \[ 2 a_1 \cos (x)-(2 a_2+1) \sin (x) = 0 \] (whether \(g_0=1\) or \(-1\)). So we need \(a_1=0\) and \(a_2=-1/2\), and our general solution is: \[ y(x) = A\sin(x)+ B\cos(x) - \frac{1}{2}x\cos(x) \tag{2.11}\]
But wait! doesn’t this contradict the Fredholm alternative, which told us in this case we cannot have any solution?
Confirm that Equation 2.11 cannot satisfy the boundary conditions \(y(0)=y(\pi)=0\). Thus it is not a valid solution to the problem.
And you doubted Ivar Fredholm! Shame on you…..
A lot of the failures to solve come not from there being no general solution to an equation, but from that solution not being compatible with the boundary conditions. At the risk of repeating myself: the properties of an operator are not fully defined without its boundary conditions.
2.5.4 The coefficient formula revisited.
We recall earlier we found the following general coefficient formula \[ \lambda_k c_k \langle y_k,r(x) w_k\rangle = \langle f,w_k\rangle. \] I asked you to chill on the matter of worrying about the \(\lambda_k=0\) case. Well, now we can address it with Fredholm in mind.
If \(\lambda_k=0\) then the left-hand side vanishes, so we need \(\langle f,w_k\rangle=0\). If the adjoint problem is trivial (\(w_k=0\)) then this holds automatically (and recall that this is the branch with a unique solution). If not, we genuinely require \(\langle f,w_k\rangle=0\). These are two of the branches of the alternative! If \(\langle f,w_k\rangle\neq 0\) we can’t ultimately satisfy our equation (a conclusion from Fredholm).
So we see Fredholm is intimately related to the ability to construct our series solution
2.6 A reflection on our target model and its extensions
2.6.1 The parabolic time-dependent model revisited.
We return back to our original target, the model: \[ \ddy{u}{t} = L u + h(x,t), \] and review how what we have seen relates to this problem.
We have been investigating how to solve equations of the form \(Ly + f=0\). We see that if we suppress the time dependence, \(\partial u/\partial t=0\), then this equation takes the form: \[ Lu(x) + h(x)=0, \] which is our inhomogeneous boundary value problem \(Lu= f\). As we have seen, it has solutions of the form: \[ u(x) = \sum_{n=0}^{\infty}c_n y_n(x), \] where the \(y_n\) are eigenfunctions of the operator \(L\): \[ L y_n = \lambda_n y_n. \] This is what is known as a steady state or equilibrium; physically it represents the state a system settles to (i.e. what it looks like when it stops changing). It is very common to consider such problems in practice.
Extending to time dependence: \[ Lu(x,t) + h(x,t)=\ddy{u(x,t)}{t} \] leads to solutions of the same form as in the elliptic case, but now the coefficients vary in time: \[ u(x,t) =\sum_{n=0}^{\infty}c_n(t) y_n(x). \] Note the \(y_n\) are the same eigenfunctions of \(L\).
We used the same eigen-expansion method in both cases. The difference is that in the case of the elliptic equation we solve an algebraic equation for the coefficients: \[ \lambda_n c_n \langle y_n,r(x)w_n\rangle = \langle f,w_n\rangle. \] In the parabolic case we have to solve an ordinary differential equation: \[ \ddt{c_n(t)} +\lambda_n c_n(t) - h_n(t) = 0 \] where the \(h_n\) are calculated using the eigen-expansion: \[ h_n(t)\langle y_n,r(x)w_n\rangle = \langle h(x,t),w_n\rangle. \]
Nearly there So we can use Fredholm to decide if we can have a series solution to our PDE? Sadly not quite. Some PDE’s don’t ever find a steady state, they find repeating or chaotic dynamics which never settle. Fredholm might tell you the PDE doesn’t have a time independent solution, but it can’t tell you if there is a time-dependent solution.
The good news is that if \(L\) is self-adjoint then we can say it will have a solution (the proof is beyond the course scope). So we need to compute the adjoint anyway; and even if it's not self-adjoint, if the operator is normal then we can do it (to check this condition we have to use the adjoint as well). So basically it uses the same mathematical foundations we outlined above, but it needs more machinery.
To be constructive I have put together a document with more details on this matter on the Ultra page. It is entirely non-examinable but I hope it is interesting to those who want to explore further.
2.6.2 How general is this?
The model \(Lu =f\) is a very general elliptic equation. The time-dependent extension we have considered so far is very simple, just a single time derivative. The good news is our eigen-expansion technique can work for much more general extensions.
There are two obvious ways we can extend this model. The first is to change the time operator, it requires little adjustment to what we have discussed. The second is to make \(L\) and this problem as a whole of a higher spatial dimension. That is more complex, although, as we shall see, there are some cases when it is actually just a simple extension of this eigen-expansion method.
We’ll start with the simpler case!
2.6.3 Extending the time dependence of the model.
Start with a simple example: \[ \pder{u}{t}{2} = L u + h(x,t). \] This requires an extra initial condition, e.g. \[ u(x,0) = g_1(x),\quad \ddy{u}{t}{\Large\vert}_{t=0} = g_2(x). \] We follow the steps outlined in the initial section:
- Assume a series solution: \[ u(x,t) = \sum_{n=0}^{\infty}c_n(t)y_n(x),\quad L y_n=-\lambda_n y_n. \]
- Sub into the equation and decompose: \[ h(x,t) = \sum_{n=0}^{\infty}h_n(t)y_n(x). \]
- Solve the coefficient formula, the only place the recipe changes: \[ \deriv{c_n}{t}{2} +\lambda_n c_n - h_n = 0 \]
In this case the homogenous part \(c_{nh}\) takes the form: \[ c_{nh}(t) = a_{1n} \sin(\sqrt{\lambda_n}t) + a_{2n}\cos(\sqrt{\lambda_n}t). \]
We saw in chapter 1 this leads to wave like behaviour!
More generally we can consider a time-dependent linear operator \(L_t\), and we will have the following mode equation \[ L_t c_n+ \lambda_n c_n - h_n(t)=0. \] One can recast this in the form \[ L^+c_n = h_n (t), \] where \(L^+ = L_t+\lambda_n\).
This is just an inhomogeneous ordinary differential equation in the form \(L y = f\), the type we have been considering for most of the chapter! Just replace \(x\) with \(t\).
We can do it!! But there is one big difference. In this case it is an initial value problem (IVP), not a boundary value problem, as all conditions are applied at \(t=0\). For this there is a much stronger theorem, the Picard-Lindelöf theorem. It's a little tricky to state here and even more fiddly to prove (the system needs transforming in general into a system of first-order equations). It suffices to say, essentially, that as long as the functions involved in \(L_t\) are Lipschitz continuous then the solution exists and is unique for all time.
If we try to impose a time condition at some \(t_1>0\) it becomes a boundary value problem in time. For example, for the operator \(L_t\), two conditions would suffice and we could impose: \[ u(x,0) = g_0(x),\quad u(x,t_1)=g_1(x). \] That is, we demand, after some time \(t_1\), the function \(u\) takes a very specific spatial profile \(g_1(x)\). Then we cannot guarantee the solution exists, and, in fact, we have to apply the Fredholm alternative to see if we can solve the problem
The mathematics of the problems \[ L u(x) = f(x),\quad L_t u(t) =f(t) \] are identical. The asymmetry in their solvability usually comes from the fact that it is very natural to ask how a system will change given where it currently is: an initial value problem in time. Whilst spatially we ask about the behaviour of the function/density \(u\) on a finite spatial domain \(x\in[a,b]\), implying a boundary value problem.
2.6.4 Higher-dimensional spatial operators.
We consider the model \[ \ddy{u}{t} = L u + h(\vec{x},t), \] where \(u(\vec{x},t)\) depends on some \(m\)-dimensional vector \(\vec{x} \in\mathbb{R}^m\), and for which \(t>0\). The operator \(L\) is an \(m\)-dimensional spatial operator (its derivatives are in the \(m\) spatial variables only).
The aim will be to assume some solution in the form: \[ u(\vec{x},t) = \sum_{\vec{n}\in \mathbb{N}^m} c_{\vec{n}}(t) y_{\vec{n}}(\vec{x}), \] where the \(y_{\vec{n}}\) satisfy the eigen-equation \[ Ly_{\vec{n}}(\vec{x}) = -\lambda_{\vec{n}} y_{\vec{n}}(\vec{x}). \] In essence we end up with the exact same recipe as we did in the one-dimensional case. The big difference is it is generally much harder to find the set of functions \(y_{\vec{n}}\). On this I have some good news!
Look at propositions 2.1 and 2.2, and the statement of the Fredholm alternative. Note they never actually use the fact that \(L\) is one-dimensional: just linearity and the existence of a well-defined inner product. They apply just as well in \(m\) dimensions! In this case consider a domain \(\mathcal{D}\subset \mathbb{R}^m\). We define the inner product for two smooth scalar functions \(u(\vec{x})\) and \(v(\vec{x})\) to be: \[ \left<u(\vec{x}),v(\vec{x})\right> = \int_{\mathcal{D}}u(\vec{x})\overline{v(\vec{x})}\mathrm{d} V \] where \(\mathrm{d}V\) is the volume element (this can be extended to include a weighting function also). This gives an adjoint definition: \[ \left<L u,v\right> = \left<u,L^*v\right> \] which applies to the propositions, and indeed works for the Fredholm alternative.
Let's try a simple example.
2.6.5 Solving n-d eigenvalue problems on box domains.
Consider our domain to be a box \([0,L_1]\times[0,L_2]\times \dots[0,L_m]\) with coordinates \((x_1,x_2,x_3, \dots x_m)\).
We consider the heat kernel in Cartesian coordinates: \[ L = \nabla^2 = \nabla \cdot \nabla \Rightarrow \nabla^2 u = \pder{u}{x_1}{2} +\pder{u}{x_2}{2} \dots\pder{u}{x_m}{2} \] We impose Dirichlet boundary conditions: \[\begin{align} & u(0,x_2,x_3\dots x_m)=u(L_1,x_2,x_3\dots x_m) = u(x_1,0,x_3\dots x_m)= u(x_1,L_2,x_3\dots x_m) \\ & \dots=u(x_1,x_2,x_3\dots 0) = u(x_1,x_2,x_3\dots L_m) =0. \end{align}\] We can see the solution to the problem: \[ \pder{u}{x_1}{2} +\pder{u}{x_2}{2} \dots\pder{u}{x_m}{2} = -\lambda_{\vec{n}} u \] is \[ y_{\vec{n}} = C_{n_1,n_2,\dots n_m}\sin\left(\frac{n_1\pi x_1}{L_1}\right)\sin\left(\frac{n_2\pi x_2}{L_2}\right)\dots \sin\left(\frac{n_m\pi x_m}{L_m}\right) \] (we will cover how to derive this in problems class), with the \(n_i\) positive integers, and \[ \lambda_{\vec{n}} = \pi^2\left(\frac{n_1^2}{L_1^2}+\frac{n_2^2}{L_2^2}+\dots \frac{n_m^2}{L_m^2} \right). \] The total summed expansion (via superposition) is: \[ u = \sum_{n_1=1}^{\infty} \sum_{n_2=1}^{\infty}\dots \sum_{n_m=1}^{\infty}C_{n_1,n_2,\dots n_m}\sin\left(\frac{n_1\pi x_1}{L_1}\right)\sin\left(\frac{n_2\pi x_2}{L_2}\right)\dots \sin\left(\frac{n_m\pi x_m}{L_m}\right) \]
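A quick numerical check of this result (a minimal sketch assuming `numpy`): discretize \(-\nabla^2\) on a 2-D box with second-order finite differences and compare the smallest eigenvalue of the resulting matrix with \(\lambda_{(1,1)} = \pi^2(1/L_1^2 + 1/L_2^2)\):

```python
import numpy as np

L1, L2 = 1.0, 2.0                               # illustrative box dimensions
N = 40                                          # interior grid points per direction

def lap1d(L, N):
    """1-D finite-difference matrix for -d^2/dx^2 with Dirichlet BCs."""
    h = L/(N + 1)
    return (np.diag(2*np.ones(N)) - np.diag(np.ones(N-1), 1)
            - np.diag(np.ones(N-1), -1))/h**2

# 2-D operator -(d^2/dx1^2 + d^2/dx2^2) as a Kronecker sum
A = np.kron(lap1d(L1, N), np.eye(N)) + np.kron(np.eye(N), lap1d(L2, N))

lam_min = np.linalg.eigvalsh(A)[0]
print(lam_min, np.pi**2*(1/L1**2 + 1/L2**2))    # agree to within a fraction of a percent
```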
2.6.6 More?
Actually, most of the really tractable (analytically solvable) cases are in the first chapter: the disc and the sphere! One can do it for the cylinder also, and really only for things like the Laplacian; basically nice separable cases. So I don't really have any more practical ones to add, at least here. Green's method can allow us to go further…
We have so far dealt with a box and a disc. The message, roughly, is that simple shapes are doable; more complex shapes are much harder, as they don't generally separate out into a system of one-dimensional problems, and in those cases a different approach, Green's method, is sometimes more productive.
Handily that's the topic of chapter 3!
Before we move on I want to briefly add some NON-EXAMINABLE context on the importance of all this for non-linear PDEs.
2.7 Linear and Non-linear Problems — Why the Linear Case Still Matters
The eagle-eyed student will note that:
- We have largely discussed linear methodologies, and in particular the principle of superposition.
- Yet many of the most interesting and realistic examples in physics, chemistry, and biology are non-linear.
So does any of this apply to non-linear PDEs, which generally model real systems more faithfully?
Well — most non-linear PDEs must be treated using numerical methods, since analytic solutions are rarely possible.
A common and powerful class are spectral methods, which express the solution as an expansion in a set of basis functions (often sines, cosines, or other eigenfunctions of a linear operator).
The non-linearity then appears through interactions between the coefficients of these basis functions.
2.7.1 Why linear eigen-decompositions still matter
Even though non-linear equations violate superposition, eigen-decomposition remains useful for several reasons:
- **Efficient representation of structure:** The eigenfunctions of a linear operator (such as the Laplacian) often reflect the geometry and boundary conditions of the problem. Expanding the solution in this natural basis concentrates most of the important physical behaviour in just a few modes, making the numerical system more compact and efficient.
- **Mode coupling insight:** In a non-linear PDE, such as the Navier–Stokes or reaction–diffusion equations, non-linearity introduces coupling between modes: energy or information is transferred between eigenmodes. Expressing the solution in an eigenbasis makes these couplings explicit, helping to analyse phenomena like turbulence, pattern formation, or instability growth.
- **Stability and time integration:** The linear part of a PDE often dominates its stability properties. By diagonalizing this part (e.g. through a Fourier or Chebyshev transform), one can integrate it exactly in time, while treating the remaining non-linear terms explicitly or implicitly: the core idea behind operator splitting and exponential integrators.
- **Error control and convergence:** Eigenfunction expansions provide natural orthogonal bases, so truncating after \(N\) modes gives a clear and quantifiable approximation error. This is a key advantage over purely local discretizations like finite differences.
- **Analytic and diagnostic value:** Even when not used for computation, decomposing numerical or experimental solutions into eigenmodes helps to interpret results: identifying dominant spatial scales, symmetry breaking, or coherent structures.
2.7.2 Spectral and modal methods in practice
Modern spectral and pseudo-spectral solvers (for example, Fourier–Galerkin or Chebyshev–Tau schemes) exploit fast transforms to project the PDE onto a low-dimensional system for the modal coefficients: \[ \frac{d a_n}{d t} = \lambda_n a_n + \text{nonlinear couplings}(a_1, a_2, \ldots). \] This form highlights the linear dynamics through the eigenvalues \(\lambda_n\), while retaining the nonlinear interactions between modes.
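To make this concrete, here is a minimal pseudo-spectral sketch (assuming `numpy`; the reaction–diffusion equation \(u_t = \nu u_{xx} + u(1-u)\) on a periodic domain is an illustrative choice, and there is no dealiasing or error control): the linear, diagonal part is integrated exactly mode by mode, while the non-linear coupling is evaluated in real space and treated explicitly.

```python
import numpy as np

# Periodic domain [0, 2 pi), u_t = nu*u_xx + u*(1 - u)
N, nu, dt = 128, 0.05, 1e-3
x = np.linspace(0, 2*np.pi, N, endpoint=False)
k = np.fft.fftfreq(N, d=1.0/N)                   # integer wavenumbers
lam = -nu*k**2                                   # eigenvalues of the linear operator
u = 0.1 + 0.01*np.cos(3*x)                       # small initial perturbation

E = np.exp(lam*dt)                               # exact integration of the linear (diagonal) part
for _ in range(10000):
    a = np.fft.fft(u)                            # modal coefficients a_n
    nonlin = np.fft.fft(u*(1 - u))               # non-linear coupling, evaluated in real space
    a = E*a + dt*E*nonlin                        # linear part exactly, non-linear part explicitly
    u = np.real(np.fft.ifft(a))

print(u.min(), u.max())                          # both close to 1: the stable steady state
```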
Many of the basis functions used — Fourier, Legendre, Chebyshev, or spherical harmonics — are directly derived from the eigenfunctions of linear operators, especially the Laplacian or Helmholtz operators under specific boundary conditions.
In short:
The linear theory we focus on in this course is not a mere simplification; it provides the conceptual and computational foundation for analysing more complex non-linear PDEs.
The tools of linear analysis — eigen-decomposition, orthogonal projection, and modal truncation — remain central even in modern non-linear and numerical approaches.