Advanced Mathematics for Engineers and Scientists/Printable Version


Advanced Mathematics for Engineers and Scientists

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Advanced_Mathematics_for_Engineers_and_Scientists

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Printable Version

Template loop detected: Template:Printable


The Front Cover

Nomenclature edit

ODE Ordinary Differential Equation
PDE Partial Differential Equation
BC Boundary Condition
IVP Initial Value Problem
BVP Boundary Value Problem
IBVP Initial Boundary Value Problem

Common Operators edit

Operators are shown applied to the scalar or the vector field .

Notation Common names and other notation Description and notes Definition in Cartesian coordinates
Partial derivative, The rate of change of with respect to , holding the other independent variables constant.
Derivative, total derivative, The rate of change of with respect to . If is multivariate, this derivative will typically depend on the other variables following a path.
Gradient, del operator, Vector that describes the direction and magnitude of the greatest rate of change of a function of more than one variable. The symbol is called nabla.
Laplacian, Scalar Laplacian, Laplace operator, A measure of the concavity of , equivalently a comparison of the value of at some point to neighboring values.
Divergence, A measure of "generation", in other words how much the vector field acts as a source or sink at a point.
Curl, rotor, circulation density, A vector that describes the rate of rotation of a (normally 3D) vector field and the corresponding axis of rotation.
Vector Laplacian Similar to the (scalar) Laplacian. Note however, that it is generally not equal to the element-by-element Laplacian of a vector.

3D Operators in Different Coordinate Systems edit

Cartesian representations appear in the table above. The convention is used for spherical coordinates.

Operator Cylindrical Spherical


Introduction to Partial Differential Equations

This book is intended as a Partial Differential Equations (PDEs) reference for individuals who already possess a firm understanding of ordinary differential equations and have at least a basic idea of what a partial derivative is.

This book is meant to be easily readable to engineers and scientists while still being (almost) interesting enough for mathematics students. Be advised that in-depth proofs of such matters as series convergence, uniqueness, and existence will not be given; this fact will appall some and elate others. This book is meant more toward solving or at the very least extracting information out of problems involving partial differential equations. The first few chapters are built to be especially simple to understand so that, say, the interested engineering undergraduate can benefit; however, later on important and more mathematical topics such as vector spaces will be introduced and used.

What follows is a quick intro for the uninitiated, with analogies to ordinary differential equations.

What is a Partial Differential Equation? edit

Ordinary differential equations (ODEs) arise naturally whenever a rate of change of some entity is known. This may be the rate of increase of a population, the rate of change of velocity, or maybe even the rate at which soldiers die on a battlefield. ODEs describe such changes of discrete entities. Respectively, this may be the capita of a population, the velocity of a particle, or the size of a military force.

More than one entity may be described with more than one ODE. For example, cloth is very often simulated in computer graphics as a grid of particles interconnected by springs, with Newton's law (an ODE) applied to each "cloth particle". In three dimensions, this would result in 3 second order ODEs written and solved for each particle.

Partial differential equations (PDEs) are analogous to ODEs in that they involve rates of change; however, they differ in that they treat continuous media. For example, the cloth could just as well be considered to be some kind of continuous sheet. This approach would most likely lead to only 3 (maybe 4) partial differential equations, which would represent the entire continuous sheet, instead of a set of ODEs for each particle.

This continuum approach is a very different way of looking at things. It may or may not be favorable: in the case of cloth, the resulting PDE system would be too difficult to solve, and so the computer graphics industry goes with a particle based approach (but a prime counterexample is a fluid, which would be represented by a PDE system most of the time).
While PDEs may not be straightforward to solve on a computer, they have a major advantage over ODEs when applicable: it is nearly impossible to gain any analytical insight from a huge system of particles, while a relatively small PDE system can reveal much insight, even if it won't yield an analytic solution.

But PDEs don't strictly describe continuum mechanics. As with anything mathematical, they are what you make of them.

The Character of Partial Differential Equations edit

The solution of an ODE can be represented as a function of one variable. For example, the position of the Earth may be represented by coordinates with respect to, say, the sun, and each of these coordinates would be functions of time. Note that the effects of other celestial bodies would certainly affect the solution, but it would still be expressible strictly as a function of time.

The solution of a PDE will, in general, depend on more than one variable. An example is a vibrating string: the deflection of the string will depend both on time and which part of the string you're looking at.

The solution of an ODE is called a trajectory. It may be represented graphically by one or more curves. The solution of a PDE, however, could be a surface, a volume, or something else, depending on how many variables are involved and how they're interpreted.

In general, PDEs are complicated to solve. Concepts such as separation of variables or integral transformations tend to work very differently. One significant difficulty is that the solution of a PDE depends very strongly on the initial/boundary conditions (ICs/BCs). An ODE typically yields a general solution, which involves one or more constants which may be determined from one or more ICs/BCs. PDEs, however, do not easily yield such general solutions. A solution method that works for one initial boundary value problem (IBVP) may be useless for a different IBVP.

PDEs tend to be more difficult to solve numerically as well. Most of the time, an ODE can be expressed in terms of its highest order derivative, and can be solved on a computer very easily with knowledge of the ICs (boundary value problems are a little more complicated), using well established and more or less generally applicable methods, such as Runge Kutta (RK). With this in mind, an ODE may be solved quickly by entering the equation and its ICs/BCs into the right application and pressing the solve button. An IBVP for a PDE, however, will typically require its own specialized solution, and it may take much effort to make the solution more than, say, second order accurate.

An Early Example edit

Many of the concepts of the previous section may be summarized in this example. We won't deal with the PDE just yet.

Consider heat flow along a laterally insulated rod. In other words, the heat is only flowing along the rod but not into the surrounding air. Let's call the temperature of the rod , and let , where is time and represents the position along the rod. As the temperature depends both on time and position along the rod, this is exactly what says. It is the change of the heat distribution over time. See the graphic below to get an idea.
Let's say that the rod has unitless length , and that its initial temperature (again unitless) is known to be . This states the initial condition, which depends on . The function is a simple hump between 0 and 1. Check for yourself with maxima (http://maxima.sourceforge.net or on android): plot2d(sin(x*%pi),[x,0,1])
Let's also say that the temperature is somehow fixed to at both ends of the rod, i.e. at and at . This would result in , which specifies boundary conditions. The BCs state that for all t, at and .

A PDE can be written to describe the situation. This and the IC/BCs form an initial boundary value problem (IBVP). The solution to this IBVP is (with a physical constant taken to be ):

Note that:

It also satisfies the PDE, but (again) that'll come later.

This solution may be interpreted as a surface, it's shown in the figure below with going from to , and going from to . That is, the distribution of heat is changing over time as the heat flows and dissipates.

from to and to .

Surfaces may or may not be the best way to convey information, and in this case a possibly better way to draw the picture would be to graph as a curve at several different choices of , this is portrayed below.

in the domain of interest for various interesting values of .

PDEs are extremely diverse, and their ICs and BCs can radically affect their solution method. As a result, the best (read: easiest) way to learn is by looking at many different problems and how they're solved.


Parallel Plate Flow: Easy IC

Parallel Plate Flow: Easy IC edit

Formulation edit

As with ODEs, separation of variables is easy to understand and works well whenever it works. For ODEs, we use the substitution rule to allow antidifferentiation, but for PDEs it's a very different process involving letting dependencies pass through the partial derivatives.

A fluid mechanics example will be used.

Consider two plates parallel to each other of huge extent, separated by a distance of 1. Fluid is smoothly flowing between these two plates in only one direction (call it x). This may be seen in the picture below.

Visualization of the parallel plate flow problem.

After some assumptions, the following PDE may be obtained to describe the fluid flow:

This linear PDE is the result of simplifying the Navier-Stokes equations, a large nonlinear PDE system which describes fluid flow. u is the velocity of the fluid in the x direction, ρ is the density of the fluid, ν is the kinematic viscosity (metaphorically speaking, how hard the molecules of the fluid rub against each other), and Px is the pressure gradient or pressure gradient vector field. Note that u = u(y, t), there is no dependence on x. In other words, the state of the fluid upstream is no different from the state downstream. Notice that we do not consider turbulences and that the state of the flow varies between the upper plate and the lower plate as the speed u is 0 close to the plates.

u(y, t) is a velocity profile. Fluid mechanics typically is concerned with velocity fields, contrary to rigid body mechanics in which the position of an object is what is important. In other words, with a rigid object all 'points' of the object move at the same speed. With a fluid we get a velocity field and each point is moving at its own speed.

The ratio Px/ρ describes the driving force; it's a pressure change (gradient) along the x direction. If Px is negative, then the pressure downstream (positive x) is smaller than the pressure upstream (negative x) and the fluid will flow left to right, i.e., u(y, t) will generally be positive.

Now on to create a specific problem: let's say that a constant negative pressure gradient was applied for a long time, until the velocity profile was steady (steady means "not changing with time"). Then the pressure gradient is suddenly removed, and without this driving force the fluid will slow down and stop. That is the assumption we are going to use for our example calculations. We assume the flow to be steady and then decaying as the pressure (expressed as Px/ρ) is removed. Hence we can remove the term -Px/ρ from our model as well, see (PDE) below.

Initial flow profile.

Let's say that before the pressure was removed, the velocity profile was u(y, t) = sin(π y). With velocity profile we mean, as the velocities are measured on a cross section of the flow, they form a hump. This would make sense: the friction dictates less motion near the plates (see next paragraph), so we could expect a maximum velocity near the centerline (y = 1/2). This assumed profile isn't really correct, but will serve as an example for now. It's graphed at right in the domain of interest.

Before getting into the math, one more thing is needed: boundary conditions. In this case, the BC is called the no slip condition, which states that the velocity of a fluid at a wall (boundary) is equal to the velocity of the wall. If we weren't making this assumption, the fluid would be moving like a rigid object. Since the velocities of the walls (or plates) in this problem are both zero, the velocity of the fluid must be zero at these two boundaries. The BCs are then u(0, t) = 0 (bottom plate) and u(1, t) = 0 (top plate).

The IBVP is:

Notice that the initial condition IC is the same hump from the first section of this book, just turned sideways. This IC implies that the flow is already flowing when we start our calculations. We are kind of calculating the decay of the flow speed.

Separation edit

Variables are separated the following way: we assume that , where Y and T are (unknown) functions respectively of y and t. This form is substituted into the PDE:

Using yields:

Look carefully at the last equation: the left side of the equation depends strictly on t, and the right side strictly on y, and they are equal. t may be varied independently of y and they'd still be equal, and y may be varied independently of t and they'd still be equal. This can only happen if both sides are constant. This may be shown as follows:

Taking the derivative of both sides, turns the right-hand-side into a constant, 0. The left-hand-side is left as is. Then taking the integrals of both sides yields:

Integration of the ordinary derivative recovers the left side but leaves the right side a constant. It follows by similarity that Y''/Y is a constant as well.

The constant in question is called the separation constant. We could simply give it a letter, such as A, but a good choice of the constant will make work easier later. In this case the best choice is -k2. This will be justified later (but it should be reemphasized that it may be notated any way you want, assuming it can span the domain).

The variables are now separated. The last two equations are two ODEs which may be solved independently (in fact, the Y equation is an eigenvalue problem), though they both contain an unknown constant. Note that ν was kept for the T equation. This choice makes the solution slightly easier, but is again completely arbitrary.

Rearrange and note that the notion buried in the expression below is that a function is equal to its own derivative, which kind of strikes the Euler number bell.

Then solve.

Here is a solution plucked from Ted Woollett's 'Maxima by Example', Chapter 3, '3.2.3 Exact Solution Using desolve'. Maxima's output is not shown but should be easily reconcilable with what has been written.

(%i1) de:(-%k^2*v*T(t))-('diff(T(t),t)) = 0;
(%i2) gsoln:desolve(de,T(t));

The same goes for our right-hand-side.

And solve.

In Maxima the solution goes like this. When Maxima is asking for 'zero' or 'nonzero' then type 'nonzero;':

(%i1)de:(-%k^2*Y(y))-('diff(Y(y),y,2));
(%i2)atvalue('diff(Y(y),y), y=0, Y(0)*%k )$
(%i3)gsoln:desolve(de,Y(y));

The overall solution will be , still with unknown constants, and as of now the product of Y and T. So we are plugging the partial solutions for and back in:

Note that C1 has been multiplied into C2 and C3, reducing the number of arbitrary constants. Because an unknown constant multiplied by an unknown constant yields still an unknown constant.

The IC or BCs should now be applied. If the IC was applied first, coefficients would be equated and all of the constants would be determined. However, the BCs may or may not have been fulfilled (in this case they would, but you're not generally so lucky). So to be safe, the BCs will be applied first:

So A being zero eliminates the term.

If we took B = 0, the solution would have just been u(y, t) = 0 (often called the trivial solution), which would satisfy the BCs and the PDE but couldn't possibly satisfy the IC. So, we take k = instead, where n is any integer. After applying the BCs we have:

Decaying flow.

Then we need to apply the IC to it. Per the IC from above is:

Per the BC is, see above. Setting those two definitions of equal is:

Since t = 0 as per the IC assumption, is becoming . The equality can only hold if B = 1 and n = 1, allowing us to simply remove those two constants from the function in its BC incarnation. That's it! The complete solution is:

It's worth verifying that the IC, BCs, and PDE are all satisfied by this. Also notice that the solution is a product of a function of t and a function of y. The graph at the right is illustrating this. Observe that the profile is plotted for different values of νt, rather than specifying some ν and graphing different values for t. Remember that v is the kinematic viscosity. Hence the decay of flow speed also depends on it. Looking at the solution, t and ν appear only once and they're multiplying, so it's natural to do this. A dimensionless time could have been introduced from the beginning.

So what happens? The fluid starts with its initial profile and slows down exponentially. Note that with x replaced with y and t replaced with νt, this is exactly the same as the result of heat flow in a rod as shown in the introduction. This is not a coincidence: the PDE for the rod describes diffusion of heat, the PDE for the parallel plates describes diffusion of momentum.

Take a second look at the separation constant, -k2. The square is convenient, without it the solution for Y(y) would have involved square roots. Without the negative sign, the solution would have involved exponentials instead of sinusoids, so the constant would have come out imaginary.

The assumption that u(y, t) = Y(y)T(t) is justified by the physics of the problem: it would make sense that the profile would keep its general shape (due to Y(y)), only it'd get flattened over time as the fluid slows down (due to T(t)).

Quickly recap what we did. First we took a model PDE from Navier-Stokes and simplified it by making an assumption about the pressure being removed. Then we sort of first-stage solved it by separating Y and T. Then we applied the BC (boundary conditions) and the IC (initial condition) yielding our final solution.


Parallel Plate Flow: Realistic IC

Parallel Plate Flow: Realistic IC edit

The Steady State edit

The initial velocity profile chosen in the last problem agreed with intuition but honestly came out of thin air. A more realistic development follows.

The problem stated that (to come up with an IC) the fluid was under a pressure difference for some time, so that the flow became steady aka flowing steadily. "Steady" is another way of saying "not changing with time", and "not changing with time" is another way of saying that:

Putting this into the PDE from the previous section:

Independent of , the PDE became an ODE with variables separated and thus we can integrate.

The no slip condition results in the following BCs: at and . We can plug the BC values into the integrated ODE and resolve the Cs.

Inserting the Cs and and simplifying yields:

For the sake of example, take (recall that a negative pressure gradient causes left to right flow). Also note that this is a constant gradient or slope. This gives a parabola which starts at , increases to a maximum of at , and returns to at .

This parabola looks pretty much identical to the sinusoid previously used (you must zoom in to see a difference). However, even more so on the narrow domain of interest, the two are very different functions (look at their taylor expansions, for example). Using the parabola instead of the sine function results in a much more involved solution.

So this derives the steady state flow, which we will use as an improved, realistic IC. Recall that the problem is about a fluid that's initially in motion that is coming to a stop due to the absence of a driving force. The IBVP (Initial Boundary Value Problem) is now subtly different:

Separation edit

Since the only difference from the problem in the last section is the IC, the variables may be separated and the BCs applied with no difference, giving:

But now we're stuck (after applying the BCs)! Applying the IC makes the term go away as t = 0, which is the IC. However, then the IC function can't be made to match:

What went wrong? It was the assumption that . The fact that the IC couldn't be fulfilled means that the assumption was wrong. It should be apparent now why the IC was chosen to be in the previous section.

We can proceed however, thanks to the linearity of the problem. Another detour is necessary, it gets long.

Linearity (the superposition principle specifically) says that if is a solution to the BVP (not the whole IBVP, only the BVP, Boundary Value Problem, the BCs applied) and so is another , then a linear combination, , is also a solution.

Let's take a step back and suppose that the IC was

This is no longer a realistic flow problem but it contains the first two terms of what is called a Fourier sine expansion, see these examples of Fourier sine expansions. We are going to generalize this below. Let's now use this expression and equate it to the half way solution (BCs applied) with being eliminated as t = 0:

And it still can't match. However, observe that the individual terms in the IC can. We simply set the constants to values making both sides match:

Note the subscripts are used to identify each term: they reflect the integer from the separation constant. Solutions may be obtained for each individual term of the IC, identified with :

        
        

Linearity states that the sum of these two solutions is also a solution to the BVP (no need for new constants):

So we added the solutions and got a new solution... what is this good for? Try setting :

Each component solution satisfies the BVP, and the sum of these just happened to satisfy our surrogate IC. The IBVP with IC is now solved. It would work the same way for any linear combination of sine functions whose half frequencies are . "Linear combination" means a sum of terms, each multiplied by a constant. The sum is assumed to converge and be term by term differentiable.

Let's do what we just did in a more generalized fashion. First, we make our IC a linear combination of sines (with eliminated as t = 0), in fact, infinitely many of them. But each successive term has to 'converge', it can't stray wildly all over the place.

Second, find the n and B for each term assuming t = 0 (the IC), then plug them back into each term making no assumptions about t, leaving t as is.

Third, sum up all the terms with their individual n and Bs.

Fourth, plug t = 0 into the sum of terms and recover the IC from the first step.

So we went full circle on this example but found the n and Bs because we were able to equate/satisfy each term with the IC. Now we can solve the problem if the IC is a linear combination of sine functions. But the IC for this problem isn't such a sum, it's just a stupid parabola. Or is it?

Series Construction edit

In the 19th century, a man named Joseph Fourier took a break from helping Napoleon take over the world to ask an important question while studying this same BVP (concerning heat flow): can a function be expressed as a sum of sinusoids, similar to a taylor series? The short answer is yes, if a few reasonable conditions apply as we have already indicated. The long answer follows, and this section is a longer answer.

A function meeting certain criteria may indeed be expanded into a sum of sines, cosines, or both. In our case, all that is needed to accomplish this expansion is to find the coefficients . A little trick involving an integral makes this possible.

The sine function has a very important property called orthogonality. There are many flavors of this, which will be served in the next chapter. Relevant to this problem is the following:

A quick hint may help. Orthogonality literally means two lines at a right angle to each other. These lines could be vectors, each with its own tuple of coordinates. If those two vectors are at a right angle to each other, multiplying and summing their coordinate tuples always yields zero (in Euclidean space). The method of multiplying and summing is also used to determine whether two functions are orthogonal. Using this definition, our multiplied and integrated functions above are orthogonal most of the time, but not always.

Let's call the IC to generalize it. We equate the IC with its expansion, meaning the linear combination of sines, and then apply some craftiness. And remember that our goal is to reproduce a parabolic function from linearly combined sines:

In the last step, all of the terms in the sum became except for the term where , the only case where we get for the otherwise orthogonal sine functions. This isolates and explicitly defines which is the same as as m = n. The expansion for is then:

Or equivalently:

Many important details have been left out for later in a devoted chapter; one noteworthy detail is that this expansion is only approximating the parabola (very superficially) on the interval , not say from to .

This expansion may finally be combined with the sum of sines solution to the BVP developed previously. Note that the last equation looks very similar to . Following from this:

So the expansion will satisfy the IC given as (surprised?). The full solution for the problem with arbitrary IC is then:

In this problem specifically, the IC is , so:

Sines and cosines appear from the integration dependent only on . Since is an integer, these can be made more aesthetic.

Note that for even , . Putting everything together finally completes the solution to the IBVP:

There are many interesting things to observe. To begin with, is not a product of a function of and a function of . Such a solution was assumed in the beginning, proved to be wrong, but eventually happened to yield a solution anyway thanks to linearity and what is called a Fourier sine expansion.

A careful look at the procedure reveals something that may be disturbing: this lengthy solution is strictly valid for the given BCs. Thanks to the definition of , the solution is generic as far as the IC is concerned (the IC doesn't even need to match the BCs), however a slight change in either BC would mandate starting over almost from the beginning.

The parabolic IC, which looks very similar to the sine function used in the previous section, is wholly to blame (or thank once you understand the beauty of a Fourier series!) for the infinite sum. It is interesting to approximate the first several numeric values of the sequence :

Recall that the even terms are all . The first term by far dominates, this makes sense since the first term already looks very, very similar to the parabola. Recall that appears in an exponential, making the higher terms even smaller for time not too close to .


Change of Variables

Change of Variables edit

As with ODEs (ordinary differential equations), a PDE (partial differential equation, or more accurately, the initial-boundary value problem (IBVP) as a whole) may be made more amenable with the help of some kind of modification of variables. So far, we've dealt only with boundary conditions that specify the value of u, which represented fluid velocity, as zero at the boundaries. Though fluid mechanics can get more complicated than that (understatement of the millennium), let's look at heat transfer now for the sake of variety.

As hinted previously, the one dimensional diffusion equation can also describe heat flow in one dimension. Think of how heat could flow in one dimension: one possibility is a rod that's completely laterally insulated, so that the heat will flow only along the rod and not across it (be aware, though, it is possible to consider heat loss/gain along the rod without going two dimensional).

If this rod has finite length, heat could flow in and out of the uninsulated ends. A 1D rod can have at most two ends (it can also have one or zero: the rod could be modeled as "very long"), and the boundary conditions could specify what happens at these ends. For example, the temperature could be specified at a boundary, or maybe the flow of heat, or maybe some combination of the two.

The equation for heat flow is usually given as:

Which is the same as the equation for parallel plate flow, only with ν replaced with α and y replaced with x.

Fixed Temperatures at Boundaries edit

Let's consider a rod of length 1, with temperatures specified (fixed) at the boundaries. The IBVP is:

φ(x) is the temperature at t = 0. Look at what the BCs say: For all time, the temperature at x = 0 is u0 and at x = 1 is u1. Note that this could be just as well a parallel plate problem: u0 and u1 would represent wall velocities.

The PDE is easily separable, in basically the same way as in previous chapters:

Now, substitute the BCs:

We can't proceed. Among other things, the presence of t in the exponential factor (previously divided out) prevents anything from coming out of this.

This is another example of the fact that the assumption that u(x, t) = X(x)T(t) was wrong. The only thing that prevents us from getting a solution would be the non-zero BCs. This is where changing variables will help: a new variable v(x, t) will be defined in terms of u which will be separable.

Think of how v(x, t) could be defined to make its BCs zero ("homogeneous"). One way would be:

This form is inspired from the appearance of the BCs, and it can be readily seen:

If h(0) = u0 and h(1) = u1, v(x, t) would indeed have zero BCs. Pretty much any choice of h(x) satisfying these conditions would do it, but only one is the best choice. Making the substitution into the PDE:

So now the PDE has been messed up by the new term involving h. This will thwart separation...

...unless that last term happens to be zero. Rather then hoping it's zero, we can demand it (the best choice hinted above), and put the other requirements on h(x) next to that:

Note that the partial derivative became an ordinary derivative since h is a function of x only. The above constitutes a pretty simple boundary value problem, with unique solution:

It's just a straight line. Note that this is what would arise if the steady state (time independent) problem were solved for u(x). In other words, h could've been pulled out of one's ass readily just looking at the physics of the situation.

The problem now reduces to finding v(x, t). The IBVP for this would be:

Note that the IC changed under the transformation. The solution to this IBVP was found in a past chapter through separation of variables and superposition to be:

u(x, t) may now be found simply by adding h(x), according to how the variable change was defined:

This solution looks like the sum of a steady state portion (that's h(x)) and a transient portion (that's v(x)):

Visualization of the change of variables.

Time Varying Temperatures at Boundaries edit

Note that this wouldn't work so nicely with non-constant BCs. For example, if the IBVP were:

Then, transforming it would require h = h(x, t). Reusing u(x, t) = v(x, t) + h(x, t) introduced previously would eventually lead to:

Where, to simplify the PDE above:

Which doesn't really make anything simpler, despite freedom in the choice of IC.

But this isn't completely useless. Note that the PDE for h was chosen to simplify the PDE for v(x, t) (would lead to the terms involving h to cancel out), which may lead to the question: Was this necessary?

The answer is no. If that were the case, the PDE we picked for h would not be satisfied, and that would result in extra terms in the PDE for v(x, t). The no-longer-separable IBVP for v(x, t) could, however, be solved via an eigenfunction expansion, whose full story will be told later. It's worth noting though, that an eigenfunction expansion would require homogenous BCs, so the transformation was necessary.

So this problem has to be put aside without any conclusion for now. I told you that BCs can mess everything up.

Pressure Driven Transient Parallel Plate Flow edit

Now back to fluid mechanics. Previously, we dealt with flow that was initially moving but slowing down because of resistance and the absence of a driving force. Maybe, it'd be more interesting if we had a fluid initially at rest (ie, zero IC) but set into motion by some constant pressure difference. The IBVP for such a case would be:

This PDE with the pressure term was described previously. That pressure term is what drives the flow; it is assumed constant.

The intent of the change of variables would be to remove the pressure term from the PDE (which prevents separation) while keeping the BCs homogeneous.

One path to take would be to add something to u(x, t), either a function of t or a function of y, so that differentiation would leave behind a constant that could cancel the pressure term out. Adding a function of t would be very unfavorable since it'd result in time dependent BCs, so let's try a function of y:

Substituting this into the PDE:

This procedure will simplify the PDE and preserve the BCs only if the following conditions hold:

The first condition, an ODE, is required to simplify the PDE for v(y, t), it will result in cancellation of the last two terms. The other two conditions are chosen to preserve the homogeneous BCs of the problem (note that if the BCs of u(y, t) weren't homogeneous, the BCs on f(y) would need to be picked to amend that).

The solution to the BVP above is simply:

So f(y) was successfully determined. Note that this function is symmetric about y = 1/2. The IBVP for v(y, t) becomes:

This is the same IBVP we've been beating to death for some time now. The solution for v(y, t) is:

And the solution for u(y, t) follows from how the variable change was defined:

This solution fits what we expect: it starts flat and approaches the parabolic profile quickly. This is the same parabola derived as the steady state flow in the realistic IC chapter; the integral was evaluated for integer n, simplifying it.

A careful look at the solution reveals something interesting: this is just decaying parallel plate flow "in reverse". Instead of the flow starting parabolic and gradually approaching u = 0, it starts with u = 0 and gradually approaches a parabola.

Time Dependent Diffusivity edit

In this example we'll change time, an independent variable, instead of changing the dependent variable. Consider the following IBVP:

Note that this is a separable; a transformation isn't really necessary, however it'll be easier since we can reuse past solutions if it can be transformed into something familiar.

Let's not get involved with the physics of this and just call it a diffusion problem. It could be diffusion of momentum (as in fluid mechanics), diffusion of heat (heat transfer), diffusion of a chemical (chemistry), or simply a mathematician's toy. In other words, a confession: it was purposely made up to serve as an example.

The (time dependent) factor in front of the second derivative is called the diffusivity. Previously, it was a constant α (called "thermal diffusivity") or constant ν ("kinematic viscosity"). Now, it decays with time.

To simplify the PDE via a transformation, we look for ways in which the factor could cancel out. One way would be to define a new time variable, call it τ and leave it's relation to t arbitrary. The chain rule yields:

Substituting this into the PDE:

Note now that the variable t will completely disappear (divide out in this case) from the equation if:

C is completely arbitrary. However, the best choice of C is the one that makes τ = 0 when t = 0, since this wouldn't change the IC which is defined at t = 0; so, take C = 0. Note that the BCs wouldn't change either way, unless they were time dependent, in which they would change no matter what C is chosen. The IBVP is turned into:

Digging up the solution and restoring the original variable:

Note that, unlike any of the previous examples, the physics of the problem (if there were any) couldn't have helped us. It's also worth mentioning that the solution doesn't limit to u = 0 for long time.

Concluding Remarks edit

Changing variables works a little differently for PDEs in the sense that you have a lot of freedom thanks to partial differentiation. In this chapter, we picked what seemed to be a good general form for the transformation (inspired by whatever prevented easy solution), wrote down a bunch of requirements, and defined the transformation to uniquely satisfy the requirements. Doing the same for ODEs can often degrade to a monkey with typewriter situation.

Many simple little changes go without saying. For example, we've so far worked with rods of length "1" or plates separated by a distance of "1". What if the rod was 5 m long? Then space would have to be nondimensionalized using the following transformation:

Simple nondimensionalization is, well, simple; however for PDEs with more terms it can lead to scale analysis which can lead to perturbation theory which will all have to be explained in a later chapter.

It's worth noting that the physics of the IBVP very often suggest what kind of transformation needs to be done. Even some nonlinear problems can be solved this way.

This topic isn't nearly over, changes of variables will be dealt with again in future chapters.


The Laplacian and Laplace's Equation

The Laplacian and Laplace's Equation edit

By now, you've most likely grown sick of the one dimensional transient diffusion PDE we've been playing with:

Make no mistake: we're not nearly done with this stupid thing; but for the sake of variety let's introduce a fresh new equation and, even though it's not strictly a separation of variables concept, a really cool quantity called the Laplacian. You'll like this chapter; it has many pretty pictures in it.

Graph of .

The Laplacian edit

The Laplacian is a linear operator in Euclidean n-space. There are other spaces with properties different from Euclidean space. Note also that operator here has a very specific meaning. As a function is sort of an operator on real numbers, our operator is an operator on functions, not on the real numbers. See here for a longer explanation.

We'll start with the 3D Cartesian "version". Let . The Laplacian of the function is defined and notated as:

So the operator is taking the sum of the nonmixed second derivatives of with respect to the Cartesian space variables . The "del squared" notation is preferred since the capital delta can be confused with increments and differences, and is too long and doesn't involve pretty math symbols. The Laplacian is also known as the Laplace operator or Laplace's operator, not to be confused with the Laplace transform. Also, note that if we had only taken the first partial derivatives of the function , and put them into a vector, that would have been the gradient of the function . The Laplacian takes the second unmixed derivatives and adds them up.

In one dimension, recall that the second derivative measures concavity. Suppose ; if is positive, is concave up, and if is negative, is concave down, see the graph below with the straight up or down arrows at various points of the curve. The Laplacian may be thought of as a generalization of the concavity concept to multivariate functions.

This idea is demonstrated at the right, in one dimension: . To the left of , the Laplacian (simply the second derivative here) is negative, and the graph is concave down. At , the curve inflects and the Laplacian is . To the right of , the Laplacian is positive and the graph is concave up.

Concavity may or may not do it for you. Thankfully, there's another very important view of the Laplacian, with deep implications for any equation it shows itself in: the Laplacian compares the value of at some point in space to the average of the values of in the neighborhood of the same point. The three cases are:

  • If is greater at some point than the average of its neighbors, .
  • If is at some point equal to the average of its neighbors, .
  • If is smaller at some point than the average of its neighbors, .

So the laplacian may be thought of as, at some point :

The neighborhood of .

The neighborhood of some point is defined as the open set that lies within some Euclidean distance δ (delta) from the point. Referring to the picture at right (a 3D example), the neighborhood of the point is the shaded region which satisfies:

Note that our one dimensional transient diffusion equation, our parallel plate flow, involves the Laplacian:

With this mentality, let's examine the behavior of this very important PDE. On the left is the time derivative and on the right is the Laplacian. This equation is saying that:

The rate of change of at some point is proportional to the difference between the average value of around that point and the value of at that point.

For example, if there's at some position a "hot spot" where is on average greater then its neighbors, the Laplacian will be negative and thus the time derivative will be negative, this will cause to decrease at that position, "cooling" it down. This is illustrated below. The arrows reflect upon the magnitude of the Laplacian and, by grace of the time derivative, the direction the curve will move.

Visualization of transient diffusion.

It's worth noting that in 3D, this equation fully describes the flow of heat in a homogeneous solid that's not generating it's own heat (like too much electricity through a narrow wire would).

Laplace's Equation edit

Laplace's equation describes a steady state condition, and this is what it looks like:

Solutions of this equation are called harmonic functions. Some things to note:

  • Time is absent. This equation describes a steady state condition.
  • The absence of time implies the absence of an IC, so we'll be dealing with BVPs rather then IBVPs.
  • In one dimension, this is the ODE of a straight line passing through the boundaries at their specified values.
  • All functions that satisfy this equation in some domain are analytic (informally, an analytic function is equal to its Taylor expansion) in that domain.
  • Despite appearances, solutions of Laplace's equation are generally not minimal surfaces.
  • Laplace's equation is linear.

Laplace's equation is separable in the Cartesian (and almost any other) coordinate system. So, we shouldn't have too much problem solving it if the BCs involved aren't too convoluted.

Laplace's Equation on a Square: Cartesian Coordinates edit

Steady state conditions on a square.

Imagine a 1 x 1 square plate that's insulated top and bottom and has constant temperatures applied at its uninsulated edges, visualized to the right. Heat is flowing in and out of this thing steadily through the edges only, and since it's "thin" and "insulated", the temperature may be given as . This is the first time we venture into two spatial coordinates, note the absence of time.

Let's make up a BVP, referring to the picture:

So we have one nonhomogeneous BC. Assume that :

As with before, calling the separation constant in favor of just (or something) happens to make the problem easier to solve. Note that the negative sign was kept for the equation: again, these choices happen to make things simpler. Solving each equation and combining them back into :

At edge D:

Note that the constants can be merged, but we won't do it so that a point can be made in a moment. At edge A:

Taking as would satisfy this particular BC, however this would yield a plane solution of , which can't satisfy the temperature at edge C. This is why the constants weren't merged a few steps ago, to make it obvious that may not be . So, we instead take to satisfy the above, and then combine the three constants into one, call it :

Now look at edge B:

It should go without saying by now that can't be zero, since this would yield which couldn't satisfy the nonzero BC. Instead, we can take :

As of now, this solution will satisfy 3 of the 4 BCs. All that is left is edge C, the nonhomogeneous BC.

Neither nor can be contorted to fit this BC.

Since Laplace's equation is linear, a linear combination of solutions to the PDE is also a solution to the PDE. Another thing to note: since the BCs (so far) are homogeneous, we can add the solutions without worrying about nonzero boundaries adding up.

Though as shown above will not solve this problem, we can try summing (based on ) solutions to form a linear combination which might solve the BVP as a whole:

Assuming this form is correct (review Parallel Plate Flow: Realistic IC for motivation), let's again try applying the last BC:

It looks like it needs Fourier series methodology. Finding via orthogonality should solve this problem:

25 term partial sum of the series solution.

was changed to in the last step. Also, for integer , . Note that a Fourier sine expansion has been done. The solution to the BVP can finally be assembled:

That solves it!

It's finally time to mention that the BCs are discontinuous at the points and . As a result, the series should converge slowly at those points. This is clear from the plot at right: it's a 25 term partial sum (note that half of the terms are ), and it looks perfect except at , especially near the discontinuities at and .

Laplace's Equation on a Circle: Polar Coordinates edit

Now, we'll specify the value of on a circular boundary. A circle can be represented in Cartesian coordinates without too much trouble; however, it would result in nonlinear BCs which would render the approach useless. Instead, polar coordinates should be used, since in such a system the equation of a circle is very simple. In order for this to be realized, a polar representation of the Laplacian is necessary. Without going in to the details just yet, the Laplacian is given in (2D) polar coordinates:

This result may be derived using differentials and the chain rule; it's not difficult but it's a little long. In these coordinates Laplace's equation reads:

Note that in going from Cartesian to polar coordinates, a price was paid: though still linear, Laplace's equation now has variable coefficients. This implies that after separation at least one of the ODEs will have variable coefficients as well.

Let's make up the following BVP, letting :

This could represent a physical problem analogous to the previous one: replace the square plate with a disc. Note the apparent absence of sufficient BC to obtain a unique solution. The funny looking statement that u is bounded inside the domain of interest turns out to be the key to getting a unique solution, and it often shows itself in polar coordinates. It "makes up" for the "lack" of BCs. To separate, we as usual incorrectly assume that :

Once again, the way the negative sign and the separation constant are arranged makes the solution easier later on. These decisions are made mostly by trial and error.

The equation is probably one you've never seen before, it's a special case of the Euler differential equation (not to be confused with the Euler-Lagrange differential equation). There are a couple of ways to solve it, the most general method would be to change the variables so that an equation with constant coefficients is obtained. An easier way would be to note the pattern in the order of the coefficients and the order of the derivatives, and from there guess a power solution. Either way, the general solution to this simple case of Euler's ODE is given as:

This is a very good example problem since it goes to show that PDE problems very often turn into obscure ODE problems; we got lucky this time since the solution for was rather simple though its ODE looked pretty bad at first sight. The solution to the equation is:

Combining:

Now, this is where the English sentence condition stating that u must be bounded in the domain of interest may be invoked. As , the term involving is unbounded. The only way to fix this is to take . Note that if this problem were solved between two concentric circles, this term would be nonzero and very important. With that term gone, constants can be merged:

Only one condition remains: on , yet there are 3 constants. Let's say for now that:

Then, it's a simple matter of equating coefficients to obtain:

Now, let's make the frequencies differ:

Equating coefficients won't work. However, if the IC were broken up into individual terms, the sum of the solution to the terms just happens to solve the BVP as a whole:

Verify that the solution above is really equal to the BC at :

And, since Laplace's equation is linear, this must solve the PDE as well. What all of this implies is that, if some generic function may be expressed as a sum of sinusoids with angular frequencies given by , all that is needed is a linear combination of the appropriate sum. Notated:

To identify the coefficients, substitute the BC:

The coefficients and may be determined by a (full) Fourier expansion on . Note that it's implied that must have period since we are solving this in a domain (a circle specifically) where .

You probably don't like infinite series solutions. Well, it happens that through a variety of manipulations it's possible to express the full solution of this particular problem as:

This is called Poisson's integral formula.

Derivation of the Laplacian in Polar Coordinates edit

Though not necessarily a PDEs concept, it is very important for anyone studying this kind of math to be comfortable with going from one coordinate system to the next. What follows is a long derivation of the Laplacian in 2D polar coordinates using the multivariable chain rule and the concept of differentials. Know, however, that there are really many ways to do this.

Three definitions are all we need to begin:

If it's known that , then the chain rule may be used to express derivatives in terms of and alone. Two applications will be necessary to obtain the second derivatives. Manipulating operators as if they meant something on their own:

Applying this to itself, treating the underlined bit as a unit dependent on and :

The above mess may be quickly simplified a little by manipulating the funny looking derivatives:

This may be made slightly easier to work with if a few changes are made to the way some of the derivatives are written. Also, the variable follows analogously:

Now we need to obtain expressions for some of the derivatives appearing above. The most direct path would use the concept of differentials. If:

Then:

Solving by substitution for and gives:

If , then the total differential is given as:

Note that the two previous equations are of this form (recall that and , just like above), which means that:

Equating coefficients quickly yields a bunch of derivatives:

There's an easier but more abstract way to obtain the derivatives above that may be overkill but is worth mentioning anyway. The Jacobian of the functions and is:

Note that the Jacobian is a compact representation of the coefficients of the total derivative; using as an example (bold indicating vectors):

So, it follows then that the derivatives that we're interested in may be obtained by inverting the Jacobian matrix:

Though somewhat obscure, this is very convenient and it's just one of the many utilities of the Jacobian matrix. An interesting bit of insight is gained: coordinate changes are senseless unless the Jacobian is invertible everywhere except at isolated points, stated another way the determinant of the Jacobian matrix must be nonzero, otherwise the coordinate change is not one-to-one (note that the determinant will be zero at in this example. An isolated point such as this is not problematic.).

Either path you take, there should now be enough information to evaluate the Cartesian second derivatives. Working on :

Proceeding similarly for :

Now, add these tirelessly hand crafted differential operators and watch the result collapse into just 3 nontrigonometric terms:

That was a lot of work. To save trouble, here is the Laplacian in other two other popular coordinate systems:

Cylindrical:

Spherical:

Derivatives have been combined wherever possible (not done previously).

Concluding Remarks edit

This was a long, involved chapter. It should be clear that the solutions derived work only for very simple geometries, other geometries may be worked with by grace of conformal mappings.

The Laplacian (and variations of it) is a very important quantity and its behaviour is worth knowing like the back of your hand. A sampling of important equations that involve the Laplacian:

  • The Navier Stokes equations.
  • The diffusion equation.
  • Laplace's equation.
  • Poisson's equation.
  • The Helmholtz equation.
  • The Schrödinger equation.
  • The wave equation.

There's a couple of other operators that are similar to (though less important than) the Laplacian, which deserve mention:

  • Biharmonic operator, in three Cartesian dimensions:

The biharmonic equation is useful in linear elastic theory, for example it can describe "creeping" fluid flow:

  • d'Alembertian:

The wave equation may be expressed using the d'Alembertian:

Though expressed with the Laplacian is more popular:


Introduction and Classifications

Introduction and Classifications edit

The intent of the prior chapters was to provide a shallow introduction to PDEs and their solution without scaring anyone away. A lot of fundamentals and very important details were left out. After this point, we are going to proceed with a little more rigor; however, knowledge past one undergraduate ODE class alongside some set theory and countless hours on Wikipedia should be enough.

Some Definitions and Results edit

An equation of the form

is called a partial differential equation if is unknown and the function involves partial differentiation. More concisely, is an operator or a map which results in (among other things) the partial differentiation of . is called the dependent variable, the choice of this letter is common in this context. Examples of partial differential equations (referring to the definition above):

Note that what exactly is made of is unspecified, it could be a function, several functions bundled into a vector, or something else; but if satisfies the partial differential equation, it is called a solution.

Another thing to observe is seeming redundancy of , its utility draws from the study of linear equations. If , the equation is called homogeneous, otherwise it's nonhomogeneous or inhomogeneous.

It's worth mentioning now that the terms "function", "operator", and "map" are loosely interchangeable, and that functions can involve differentiation, or any operation. This text will favor, not exclusively, the term function.

The order of a PDE is the order of the highest derivative appearing, but often distinction is made between variables. For example the equation

is second order in and fourth order in (fourth derivatives will result regardless of the form of ).

Linear Partial Differential Equations edit

Suppose that , and that satisfies the following properties:

for any scalar . The first property is called additivity, and the second one is called homogeneity. If is additive and homogeneous, it is called a linear function, additionally if it involves partial differentiation and

then the equation above is a linear partial differential equation. This is where the importance of shows up. Consider the equation

where is not a function of . Now, if we represent the equation through

then fails both additivity and homogeneity and so is nonlinear (Note: the equation defining the condition is 'homogeneous', but in a distinct usage of the term). If instead

then is now linear. Note then that the choice of and is generally not unique, but if an equation could be written in a linear form it is called a linear equation.

Linear equations are very popular. One of the reasons for this popularity is a little piece of magic called the superposition principle. Suppose that both and are solutions of a linear, homogeneous equation (here onwards, will denote a linear function), ie

for the same . We can feed a combination of and into the PDE and, recalling the definition of a linear function, see that

for some constants and . As stated previously, both and are solutions, which means that

What all this means is that if both and solve the linear and homogeneous equation , then the quantity is also a solution of the partial differential equation. The quantity is called a linear combination of and . The result would hold for more combinations, and generally,

    The Superposition Principle

Suppose that in the equation

the function is linear. If some sequence satisfies the equation, that is if

then any linear combination of the sequence also satisfies the equation:

where is a sequence of constants and the sum is arbitrary.

Note that there is no mention of partial differentiation. Indeed, it's true for any linear equation, algebraic or integro-partial differential-whatever. Concerning nonhomogeneous equations, the rule can be extended easily. Consider the nonhomogeneous equation

Let's say that this equation is solved by and that a sequence solves the "associated homogeneous problem",

where is the same between the two. An extension of superposition is observed by, say, the specific combination :

More generally,

    The Extended Superposition Principle

Suppose that in the nonhomogeneous equation

the function is linear. Suppose that this equation is solved by some , and that the associated homogeneous problem

is solved by a sequence . That is,

Then plus any linear combination of the sequence satisfies the original (nonhomogeneous) equation:

where is a sequence of constants and the sum is arbitrary.

The possibility of combining solutions in an arbitrary linear combination is precious, as it allows the solutions of complicated problems be expressed in terms of solutions of much simpler problems.

This part of is why even modestly nonlinear equations pose such difficulties: in almost no case is there anything like a superposition principle.

Classification of Linear Equations edit

A linear second order PDE in two variables has the general form

If the capital letter coefficients are constants, the equation is called linear with constant coefficients, otherwise linear with variable coefficients, and again, if = 0 the equation is homogeneous. The letters and are used as generic independent variables, they need not represent space. Equations are further classified by their coefficients; the quantity

is called the discriminant. Equations are classified as follows:

Note that if coefficients vary, an equation can belong to one classification in one domain and another classification in another domain. Note also that all first order equations are parabolic.

Smoothness of solutions is interestingly affected by equation type: elliptic equations produce solutions that are smooth (up to the smoothness of coefficients) even if boundary values aren't, parabolic equations will cause the smoothness of solutions to increase along the low order variable, and hyperbolic equations preserve lack of smoothness.

Generalizing classifications to more variables, especially when one is always treated temporally (ie associated with ICs, but we haven't discussed such conditions yet), is not too obvious and the definitions can vary from context to context and source to source. A common way to classify is with what's called an elliptic operator.

    Definition: Elliptic Operator

A second order operator of the form

is called elliptic if , an array of coefficients for the highest order derivatives, is a positive definite symmetric matrix. is the imaginary unit. More generally, an order elliptic operator is

if the dimensional array of coefficients of the highest () derivatives is analogous to a positive definite symmetric matrix.

Not commonly, the definition is extended to include negative definite matrices.

The negative of the Laplacian, , is elliptic with . The definition for the second order case is separately provided because second order operators are by a large margin the most common.

Classifications for the equations are then given as

for some constant k. The most classic examples of these equations are obtained when the elliptic operator is the Laplacian: Laplace's equation, linear diffusion, and the wave equation are respectively elliptic, parabolic, and hyperbolic and are all defined in an arbitrary number of spatial dimensions.

Other classifications edit

Quasilinear edit

The linear form

was considered previously with the possibility of the capital letter coefficients being functions of the independent variables. If these coefficients are additionally functions of which do not produce or otherwise involve derivatives, the equation is called quasilinear. It must be emphasized that quasilinear equations are not linear, no superposition or other such blessing; however these equations receive special attention. They are better understood and are easier to examine analytically, qualitatively, and numerically than general nonlinear equations.

A common quasilinear equation that'll probably be studied for eternity is the advection equation

which describes the conservative transport (advection) of the quantity in a velocity field . The equation is quasilinear when the velocity field depends on , as it usually does. A specific example would be a traffic flow formulation which would result in

Despite resemblance, this equation is not parabolic since it is not linear. Unlike its parabolic counterparts, this equation can produce discontinuities even with continuous initial conditions.

General Nonlinear edit

Some equations defy classification because they're too abnormal. A good example of an equation is the one that defines a minimal surface expressible as :

where is the height of the surface.



Vector Spaces: Mathematic Playgrounds

Vector Spaces: Mathematic Playgrounds edit

The study of partial differential equations requires a clear definition of what kind of numbers are being dealt with and in what way. PDEs are normally studied in certain kinds of vector spaces, which have a number of properties and rules associated with them which make possible the analysis and unifies many notions.

The Real Field edit

A field is a set that is bundled with two operations on the set called addition and multiplication which obey certain rules, called axioms. The letter will be used to represent the field, and from definition a field requires the following ( and are in ):

  • Closure under addition and multiplication: the addition and multiplication of field members produces members of the same field.
  • Addition and multiplication are associative: and .
  • Addition and multiplication are commutative: and .
  • Addition and multiplication are distributive: and .
  • Existence of additive identity: there is an element in notated 0, sometimes called the sum of no numbers, such that .
  • Existence of multiplicative identity: there is an element in notated 1 different from 0, sometimes called the product of no numbers, such that .
  • Existence of additive inverse: there is an element in associated with notated such that .
  • Existence of multiplicative inverse: there is an element in associated with (if is nonzero), notated such that .

These are called the field axioms. The field that we deal with, by far the most common one, is the real field. The set associated with the real field is the set of real numbers, and addition and multiplication are the familiar operations that everyone knows about.

Another example of a set that can form a field is the set of rational numbers, numbers which are expressible as the ratio of two integers. An example of a common set that doesn't form a field is the set of integers: there generally is no multiplicative inverse since the reciprocal of an integer generally is not an integer.

Note that when we say that an object is in , what is meant is that the object is a member of the set associated in the field and that it complies with the field axioms.

The Vector edit

Most non-mathematics students are taught that vectors are ordered groups ("tuples") of quantities. This is not complete, vectors are a lot more general than that. Informally, a vector is defined as an object that can be scaled and added with other vectors. This will be made more specific soon.

Examples of vectors:

  • The real numbers.
  • Pairs, triples, etc of real numbers.
  • Polynomials.
  • Most functions.

Examples of objects that are not vectors:

  • Members of the extended real numbers. Specifically, the infinity and negative infinity elements neither scale nor add.
  • The integers, at least when scaled by real numbers (since the result will not necessarily be an integer).

An interesting (read: confusing) fact to note is that, by the definition above, matrices and even tensors qualify as vectors since they can be scaled or added, even though these objects are considered generalizations of more "conventional" vectors, and calling a tensor a vector will lead to confusion.

The Vector Space edit

A vector space can be thought of as a generalization of a field.

Letting represent some field, a vector space over is a set of vectors bundled with two operations called vector addition and scalar multiplication, notated:

  • Vector addition: , where .
  • Scalar multiplication: , where and .

The members of are called vectors, and the members of the field associated with are called scalars. Note that these operations imply closure (see the first field axiom), so that it does not have to be explicitly stated. Note also that this is essentially where a vector is defined: objects that can be added and scaled. The vector space must comply with the following axioms ( and are in ; and are in ):

  • Addition is associative: .
  • Addition is commutative: .
  • Scalar multiplication is distributive over vector addition: .
  • Scalar multiplication is distributive over field addition: .
  • Scalar and field multiplication are compatible: .
  • Existence of additive identity: there is an element in notated 0 such that .
  • Existence of additive inverse: there is an element in associated with notated such that .
  • Existence of multiplicative identity: there is an element in notated 1 different from 0 such that .

An example of a vector space is one where polynomials are vectors over the real field. An example of a space that is not a vector space is one where vectors are rational numbers over the real field, since scalar multiplication can lead to vectors that are not rational (implied closure under scalar multiplication is violated).

By analogy with linear functions, vectors are linear by nature, hence a vector space is also called a linear space. The name "linear vector space" is also used, but this is somewhat redundant since there is no such thing as a nonlinear vector space. It's now worth mentioning an important quantity called a linear combination (not part of the definition of a vector space, but important):

where is a sequence of field members and is a sequence of vectors. The fact that a vector can be formed by a linear combination of other vectors is much of the essence of the vector field.

Note that a field over itself qualifies as a vector space. Fields of real numbers and other familiar objects are sometimes called spaces, since distance and other useful concepts apply.

The definition of a vector space is quite general. Note that, for example, there is no mention of any kind of product between vectors, nor is there a notion of the "length" of a vector. The vector space as defined above is very primitive, but it's a starting point: through various extensions, specific vector spaces can have a lot of nice properties and other features that make our playgrounds fun and comfortable. We'll discuss bases (plural of basis) and then take on some specific vector spaces.

The Basis edit

A nonempty subset of is called a linear subspace of if is itself a vector space. The requirement that be a vector space can be safely made specific by saying that is closed under vector addition and scalar multiplication, since the rest of the vector space properties are inherited.

The linear span of a set of vectors in may then be defined as:

Where . The span is the intersection over all choices of . This concept may be extended so that is not necessarily finite. The span of is the intersection of all of the linear subspaces of .

Now, think of what happens if a vector is removed from the set . Does the span change? Not necessarily, it may be possible that the remaining vectors in the span are sufficient to "fill in" for the missing vector through linear combination of the remaining vectors.

Let be a subset of . If the span of is the same as the span of , and if removing a vector from necessarily changes its span, then the set of vectors is called a basis of , and the vectors of are called linearly independent. It can be proven that a basis can be constructed for every vector space.

Note that the basis is not unique. This obscure definition of a basis is convenient because it is very broad, it is worth understanding fully. An important property of a vector space is that it necessarily has a basis, and that any vector in the space may be written in terms of a linear combination of the members of the basis.

A more understandable (though less elemental) explanation is provided: for a vector space over the field , the vectors form a basis of (where are in and the following are in ), satisfying the following properties:

  • The basis vectors are linearly independent: if
then
without exception.
  • The basis vectors span : for some given in , it is possible to choose so that

The basis vectors of a vector space are usually notated with as .

Euclidean n-Space edit

As most students are familiar with Euclidean n-space, this section serves more of an example than anything else.

Let be the field of real numbers, then the vector space over is defined to be the space of n-tuples of members of . In other words, more clearly:

If , then , where are the vectors of the vector space .

These vectors are called n-dimensional coordinates, and the vector space is called the real coordinate space; note that coordinates (unlike more general vectors) are often notated in boldface, or else with an arrow over the letter. They are also called spatial vectors, geometric vectors, just "vectors" if the context allows, and sometimes "points" as well, though some authors refuse to consider points as vectors, attributing a "fixed" sense to points so that points can't be added, scaled, or otherwise messed with. Part of the reason for this is that it allows one to say that some vector space is bound to a point, the point being called the origin.

The Euclidean n-space is the special real coordinate n-space with some additional structure defined which (finally) gives rise to the geometric notion of (specifically these) vectors.

To begin with, an inner product is first defined, notated with either angle braces or a dot:

This quantity, which turns two vectors into a scalar (a member of ) doesn't have a great deal of geometric meaning until some more structure is defined. In a coordinate space, the dot notation is favored, and this product is often called the "dot product", especially when or . The definition of this inner product qualifies as an inner product space.

Next comes the norm, in terms of the inner product:

The notation involving single pipes around the letter x is common, again, when or , due to analogy with absolute value, for real and especially complex numbers. For a coordinate space, the norm is often called the length of . This quickly leads to the notion of the distance between two vectors:

Which is simply the length of the vector "from" to .

Finally, the angle between and is defined through, for ,

The motivation for this definition of angle, valid for any , comes from the fact that one can prove that the literal measurable angle between two vectors in satisfies the above (the norm is motivated similarly). Discussing these 2D angles and distances of course mandates making precise the notion of a vector as an "arrow" (ie, correlating vectors to things you can draw on a sheet of paper), but that would get involved and most are already subconsciously familiar with this and it's not the point of this introduction.

This completes the definition of . A thorough introduction to Euclidean space isn't very fitting in a text on Partial Differential equations, it is included so that one can see how a familiar vector space can be constructed ground-up through extensions called "structure".

Banach Spaces edit

Banach spaces are more general than Euclidean space, and they begin our departure from vectors as geometric objects into vectors as toys in the crazy world of functional analysis.

To be terse, a Banach space is defined as any complete normed vector space. The details follow.

The Inner Product edit

The inner product is a vector operation which results in a scalar. The vectors are members of a vector space , and the scalar is a member of the field associated with . A vector space on which an inner product is defined is said to be "equipped" with an inner product, and the space is an inner product space. The inner product of and is usually notated .

A truly general definition of the inner product would be long. Normally, if the vectors are real or complex in nature (eg, complex coordinates or real valued functions), the inner product must satisfy the following axioms:

  • Distributive in the first variable: .
  • Associative in the first variable: .
  • Nondegeneracy and nonnegativity: , equality will hold only when .
  • Conjugate symmetry: .

Note that if the space is real, the last requirement (the overbar indicates complex conjugation) simplifies to , and then the first two axioms extend to the second variable.

A desirable property of an inner product is some kind of orthogonality. Two nonzero vectors are said to be orthogonal only if their inner product is zero. Remember that we're talking about vectors in general, not specifically Euclidean.

Inner products are by no means unique, good definitions are what add quality to specific spaces. The Euclidean inner product, for example, defines the Euclidean distance and angle, quantities which form the foundation of Euclidean geometry.

The Norm edit

The norm is usually, though not universally, defined in terms of the inner product, which is why the inner product was discussed first (to be technically correct, a Banach space doesn't necessarily need to have an inner product). The norm is an operation (notated with double pipes) which takes one vector and generates one scalar, necessarily satisfying the following axioms:

  • Scalability: .
  • The triangle inequality: .
  • Nonnegativity: , equality only when .

The fact that can be proven from the first two statements above.

Definition requires that only when (compare this to the inner product, which can be zero even if fed nonzero vectors); if this condition is relaxed so that is possible for nonzero vectors, the resulting operation is called a seminorm.

The distance between two vectors and is a useful quantity which is defined in terms of the norm:

The distance is often called the metric, and a vector space equipped with a distance is called a metric space.

Completeness edit

A Cauchy sequence shown in blue.
A sequence that is not Cauchy. The elements of the sequence fail to get close to each other as the sequence progresses.

As stated before, a Banach space is defined as a complete normed vector space. The norm was described above, so that all that is left to establish the definition of a Banach space is completeness.

Consider a sequence of vectors in a vector space . This sequence of vectors is called a Cauchy sequence if these vectors "tend" toward some "destination" vector, as shown in the pictures at right. Stated precisely, a sequence is a Cauchy sequence if it is always possible to make the distance arbitrarily small by picking larger values of and .

The limit of a Cauchy sequence is:

A vector space is called complete if every Cauchy sequence has a limit that is also in . A Banach space is, finally, a vector space equipped with a norm that is complete. Note that completeness implies existence of distance, which means that every Banach space is a metric space.

An example of a vector space that is complete is Euclidean n-space. An example of a vector space that isn't complete is the space of rational numbers over rational numbers: it is possible to form a sequence of rational numbers which limit to an irrational number.

Hilbert spaces edit

Note that the inner product was defined above but not subsequently used in the definition of a Banach space. Indeed, a Banach space must have a norm but doesn't necessarily need to have an inner product. However, if the norm in a Banach space is defined through the inner product by

then the resulting special Banach space is called a Hilbert space. Hilbert spaces are important in the study of partial differential equations (some relevance finally!) because many theorems and important results are valid only in Hilbert spaces.



Nondimensionalization

Nondimensionalization edit

Introduction edit

You may have noticed something possibly peculiar about all of the problems so far dealt with: "simple" numbers like or keep appearing in BCs and elsewhere. For example, we've so far dealt with BCs such as:

Is this meant to simplify this book because the author is lazy? No. Well, actually, the author is lazy, but it's really the result of what's known as nondimensionalization.

On a basic level, nondimensionalization does two things:

  • Gets all units out of the problem.
  • Makes relevant variables range from to or so.

The second point has very serious implications which will have to wait for later. We'll talk about getting units out of the problem for now: important because most natural functions don't have any meaning when fed a unit. For example, is a goofy expression which doesn't mean anything at all (consider its Taylor expansion if you're not convinced).

Do not misunderstand: you can solve any problem you like keeping units in place. That's why angular velocity has units of Hz (), so then has meaning if is in seconds (or can be made to be).

A motivation for nondimensionalization can be seen by noting that ratios of variables to dimensions ("dimension" includes both a size and a unit) have a tendency to show up again and again. Examine what happens if steady state parallel plate flow (an ODE) with the walls separated by , not , is solved:

Let's keep the dimensions of and unspecified for now. Solving this BVP:

Note that we have showing up. Not a coincidence; this implies that the dimensionless problem (or at least halfway dimensionless. We haven't discussed the dimensions of ) could be setup by altering the variable:

is the normalized version of , it ranges from to where varies from to . It is said that is scaled by .

This new variable may be substituted into the problem:

Since the new variable contains no unit, it should be obvious that the rational coefficient must have units of velocity if is to have units of velocity. With this in mind, the coefficient may be divided:

We may define another new variable, a nondimensional velocity:

Substituting this into the equation:

It's finally time to ask an important question: Why?

There are many benefits. The original problem involved 4 parameters: viscosity, density, pressure gradient, and wall separation distance. In this fully nondimensionalized solution, there happens to be none such parameters. The shortened equation above completely describes the behavior of the solution, it contains all of the relevant information.

The solution of one nondimensional problem is far more useful then the solution of a specific dimensional problem. This is especially true if the problem only yields a numeric solution: solving the nondimensional problem greatly reduces the number of charts and graphs that need to be made since you've reduced the number of parameters that could affect the solution.

This leads to another important question, and the culmination of this chapter: here, we first solved a generic dimensional problem and then nondimensionalized it. This luxury wouldn't be available with a more complicated problem. Could it have been nondimensionalized beforehand? Yep. Recalling the BVP:

Note that varies from to in the domain we're interested in. For this reason, it is natural to scale with :

Note that we could have just as well scaled with numbers like or e10.0687 D and still ended up with nondimensionalized and everything mathematically sound. However, alone was the best choice since the resulting variable would vary from to . With this choice of scale, the variable is called normalized in addition to being nondimensional; being normalized is a desirable attribute for mathematic simplicity, accurate numeric evaluation, sense of scale, and other reasons.

What about ? The character of was known, the same can't be said for (why are we solving problems in the first place?). Let's come up with a name for the unknown scale of , say , and normalize using this unknown constant:

Using the chain rule, the new variables may be put into the ODE:

So we have our derivative now. It may be substituted into the ODE:

Remember that was some constant pulled out of thin air. Hence, it can be anything we want it to be. To nondimensionalize the equation and simplify it as much as possible, we may pick:

So that the ODE will become:

The BCs are homogeneous, so they simplify easily. Noting that when :

This may now be quickly solved:

So this isn't quite the same as the nondimensional solution developed from the dimensional solution: there's a factor of on the right side. Consequently, is missing the . It's not a problem, both developments solve the problem and nondimensionalize it. Note that in doing this, we got the following result before even solving the BVP:

This tells much about the size of the velocity.

Before closing this chapter, it's worth mentioning that, generally, if and , where , , , and are all constants,

Leibniz notation is definitely a good thing.


Details and Applications of Fourier Series

Details and Applications of Fourier Series edit

In the study of PDEs (and much elsewhere), a Fourier series (or more generally, trigonometric expansion) often needs to be constructed.

Preliminaries edit

Suppose that a function f(x) may be expressed in the following way:

It can be shown (not too difficult, but beyond this text) that the above expansion will converge to f(x), except at discontinuities, if the following conditions hold:

  • f(x) = f(x + 2L), i.e. f(x) has period 2L .
  • f(x), f'(x), and f''(x) are piecewise continuous on the interval -LxL.
  • The pieces that make up f(x), f'(x), and f''(x) are continuous over closed subintervals.

The first requirement is most significant; the last two requirements can, to an extent, be partly eased off in most cases without any trouble. An interesting thing happens at discontinuities. Suppose that f(x) is discontinuous at x = a; the expansion will converge to the following value:

So the expansion converges to the average of the values to the left and the right of the discontinuity. This, and the fact that it converges in the first place, is very convenient. The Fourier series looks unfriendly but it's honestly working for you.

The information needed to express f(x) as a Fourier series are the sequences An and Bn. This is done using orthogonality, which for the sinusoids may be derived easily using a few identities. The following are some useful orthogonality relations, with m and n restricted to integers:

δm,n is called the Kronecker delta, defined by:

The Kronecker delta may be thought of as a discrete version of the Dirac delta "function". Relevant to this topic is its sifting property:

Derivation of the Fourier Series edit

We're now ready to find An and Bn.

This is supposed to hold for an arbitrary integer m. If m = 0, note that the sum doesn't allow n = 0 and so the sum would be zero since in no case does m = n. This leads to:

This secures A0. Now suppose that m > 0. Since m and n are now in the same domain, the Kronecker delta will do its sifting:

In the second to the last step, sin(mπ) = 0 for integer m. In the last step, m was replaced with n. This defines An for n > 0. For the case n = 0,

Which happens to match the previous development (now you know why it's A0/2 and not just A0). So the sequence An is now completely defined for any value of n of interest:

To get Bn, nearly the same routine is used.

The Fourier series expansion of f(x) is now complete. To have it all in one place:

f(x): a square wave.

It's finally time for an example. Let's derive the Fourier series representation of a square wave, pictured at the right:

This "wave" may be quantified as f(x):

f(x) has period 2. Since 2L = P, L = 1. Now, finding the Fourier coefficients:

Successive approximations (partial sums) of f(x).

In the last bit we used the fact that all of the even terms happened to be absent, and the odd numbers are given by 2n - 1 for integer n. The sum will indeed converge to the square wave, except at the discontinuities where it'll converge to zero (the average of 1 and -1).

Graphs of partial sums are shown at right. Note that this particular expansion doesn't converge too quickly, and that as an approximation of the square wave it's poorest near the discontinuities.

There's another interesting thing to note: all of the cosine terms are absent. It's no coincidence, and this may be a good time to introduce the Fourier sine and cosine expansions for, respectively, odd and even functions.

Periodic Extension and Expansions for Even and Odd functions edit

Two important expansions may be derived from the Fourier expansion: the Fourier sine series and the Fourier cosine series, the first one was used in the previous section. Before diving in, we must talk about even and odd functions.

Suppose that feven(x) is an even function and fodd(x) is an odd function. That is:

Some interesting identities hold for such functions. Relevant ones include:

This is all very relevant to Fourier series. Suppose that an even function is expanded. Recall that sine is odd and cosine is even. Then:

(whole integrand is even)
(whole integrand is odd)

So the Fourier cosine series (note that all sine terms disappear) is just the Fourier series for an even function, given as:

A Fourier expansion may be similarly built for an odd function:

(whole integrand is odd)
(whole integrand is even)

And the Fourier sine series is:

At this point, the periodic extension may be considered. In the previous chapter, the problem mandated a sine expansion of a parabola. A parabola is by no means a periodic function, and yet a Fourier sine expansion was done on it. What actually happened was that the function was expanded as expected within its domain of interest: the interval 0 ≤ x ≤ 1. Inside this interval, the expansion truly is a parabola. Outside this interval, the expansion is periodic, and as a whole is odd (just like the sine functions it's built on).

The parabola could've been expanded just as well using cosines (resulting in an even expansion) or a full Fourier expansion on, say, -1 ≤ x ≤ 1.

Note that we weren't able to pick which expansion to use, however. While the parabola could be expanded any way we want on any interval we want, only the sine expansion on 0 ≤ x ≤ 1 would solve the problem. The ODE and BCs together picked the expansion and the interval. In fact, before the expansion was even constructed we had:

Which is a Fourier sine series only at t = 0. That the IC was defined at t = 0 allowed the expansion. For t > 0, the solution has nothing in common with a Fourier series.

What's trying to be emphasized is flexibility. Knowledge of Fourier series makes it much easier to solve problems. In the parallel plate problem, knowing what a Fourier sine series is motivates the construction of the sum of un. In the end it's the problem that dictates what needs to be done. For the separable IBVPs, expansions will be a recurring nightmare theme and it is most important to be familiar and comfortable with orthogonality and its application to making sense out of infinite sums. Many functions have orthogonality properties, including Bessel functions, Legendre polynomials, and others.

The keyword is orthogonality. If an orthogonality relation exists for a given situation, then a series solution is easily possible. As an example, the diffusion equation used in the previous chapter can, with sufficiently ugly BCs, require a trigonometric series solution that is not a Fourier series (non-integer, not evenly spaced frequencies of the sinusoids). Sturm-Liouville theory rescues us in such cases, providing the right orthogonality relation.


Finite Difference Method

Finite Difference Method edit

The finite difference method is a basic numeric method which is based on the approximation of a derivative as a difference quotient. We all know that, by definition:

The basic idea is that if is "small", then

Similarly,

It's a step backwards from calculus. Instead of taking the limit and getting the exact rate of change, we approximate the derivative as a difference quotient. Generally, the "difference" showing up in the difference quotient (ie, the quantity in the numeriator) is called a finite difference which is a discrete analog of the derivative and approximates the derivative when divided by .

Replacing all of the derivatives in a differential equation ditches differentiation and results in algebraic equations, which may be coupled depending on how the discretization is applied.

For example, the equation

may be discretized into:

This discretization is nice because the "next" value (temporally) may be expressed in terms of "older" values at different positions.


Method of Lines

Method of Lines edit

Introduction edit

The method of lines is an interesting numeric method for solving partial differential equations. The idea is to semi-discretize the PDE into a huge system of (continuous and interdependent) ODEs, which may in turn be solved using the methods you know and love, such as forward stepping Runge-Kutta. This is where the name "method of lines" comes from: the solution is composed of a juxtaposition of lines (curves, more properly).

The method of lines is applicable to IBVPs only. The variables (or variable) that have boundary constraints are discretized, and the (one) variable that is associated with the initial value(s) becomes the independent variable for the ODE solution. Put more formally, the method of lines is obtained when all of the variables are discretized except for one of them, resulting in a coupled system of ODEs.

Pure BVPs, such as Laplace's equation on some terrible boundary, can be solved in a manner similar to Jacobi iteration known as the method of false transients.

Example Problem edit

Consider the following nonlinear IBVP, where :

This intimidating nondimensional system describes the flow of liquid in an interior corner ("groove"), where is the height of the liquid and is the distance along the axis of the corner. The IBVP states that the corner is initially empty, and that at one end () the height is fixed to 1. Naturally, fluid will flow along , increasing until the other boundary (), where is always zero, is reached. It's worth noting that though this IBVP can't be solved analytically, similarity solutions to the nonlinear PDE exist for some other situations, such as fixed height at z = 0 and no constraint to the right of that.

The boundary values are associated with the spatial variable, . Suppose that is discretized in space but not time to points on a uniform grid. Then is approximated as follows:

So is a sequence (or vector) that has an index instead of a continuous dependence on ; note however that the whole sequence still has a continuous dependence on time. is an index that runs from 0 to , note that zero based indexing is being used, so that corresponds to and corresponds to . Looking at the boundaries, we know that:

What about the points of the sequence inside the boundaries? Initially (at ),

Suppose now that the right side of the PDE is discretized however you like, so that:

If, for example, central differences were used, one would eventually arrive at:

To construct the method of lines approximation for this problem, first is replaced with , and then the right side of the equation is replaced with its discretization (the exact equality is a slight abuse of notation):

Since depends continuously on time and nothing else, this differential equation becomes:

Putting everything together in one place, the solution to is the solution to the following IVP:

Solving this problem gives the approximation for . Note that if a second order forward stepping method is used, such as second order Runge Kutta, this solution method will have approximately the same accuracy as the Crank-Nicolson method, but without simultaneous solution of a mess of algebraic equations, which would make a nonlinear problem prohibitively difficult since a tidy matrix solution wouldn't work.

The method of lines is especially popular in electromagnetics, for example, using the Helmholtz equation to simulate the passage of light through a lens for better lens design; the reason for the popularity is that the system of ODEs can (in this case) be solved analytically, so that the accuracy is limited only by the spatial discretization.

A word of caution: an explicit forward stepping method of lines solution bears similarity to a forward stepping finite difference method; so there is no reason to believe that the method of lines doesn't suffer the same stability issues.

A Third Order TVD RK Scheme in C edit

What follows is an efficient TVD RK scheme implemented in C. Memory is automatically allocated per need (but it will just assume that there is no problem in memory allocation) and freed when n = 0. Note that the solution will not be TVD (total variation diminishing) unless the discretization of is TVD also.

This isn't strictly a method of lines solver, of course; it may find use whenever some large, interdependent vector ODE must be stepped forward.

/*
Third order TVD autonomous RK solution. Steps x'(t) = f(x) forward, where x is some vector of size n.
 
x is the address of the first element. x must be an array of size n. f is a function as shown above.
It's fed the address of the first element of x and the index of the element being worked on, n.
Use n = 0 to free memory.
*/
void RK3(double (*f)(double *x, unsigned int i, unsigned int n), double *x, unsigned int n, double delta_t){
 	static double *x1 = NULL, *x2 = NULL;
	static unsigned int N = 0;
	if(n == 0){
		free(x1);
		free(x2);
		x1 = NULL;
		x2 = NULL;
		return;
	}
	if(n > N){
		N = n;
		x1 = realloc(x1, n*sizeof(double));
		x2 = realloc(x2, n*sizeof(double));
	}

	unsigned int i;

	for(i = 0; i != n; i++)
		x1[i] = x[i] + delta_t*f(x, i, n);

	for(i = 0; i != n; i++)
		x2[i] = 3.0/4.0*x[i] + 1.0/4.0*(x1[i] + delta_t*f(x1, i, n));

	for(i = 0; i != n; i++)
		x[i] = 1.0/3.0*x[i] + 2.0/3.0*(x2[i] + delta_t*f(x2, i, n));

}

Fortran 90 equivalent:

subroutine RK3(f, x, n, delta_t, pass)
   double precision, external:: f
   integer         :: n
   double precision:: x(n)
   double precision:: delta_t
   double precision:: pass(*) !-- Parameters for the routine f()
   ! Locals
   double precision:: x1(n), x2(n)

   if (n .eq. 0) return
   x1 = x + delta_t *  f(x, n, pass)
   x2 = 3d0/4d0 * x + 1d0/4d0 * (x1 + delta_t * f(x1, n, pass)) 
   x  = 1d0/3d0 * x + 2d0/3d0 * (x2 + delta_t * f(x2, n, pass)) 
end subroutine RK3

f() should be defined as

function f(x, n, pass) result (r)
integer:: n
double precision:: f(n)
double precision:: pass(*)
double precision:: r(n)
! START OF CODE


Scale Analysis

Scale Analysis edit

In the chapter on nondimensionalization, variables (both independent and dependent) were nondimensionalized and at the same time scaled so that they ranged from something like to . "Something like to " is the mentality.

Scale analysis is a tool that uses nondimensionalization to:

  • Understand what's important in an equation and, more importantly, what's not.
  • Gain insight into the size of unknown variables, such as velocity or temperature, before (even without) actually solving.
  • Simplify solution process (nondimensional variables ranging for to are very amiable).
  • Reduce dependence of the solution on physical parameters.
  • Allow higher quality numeric solution since variables that are of the same range maintain accuracy better on a computer.

Scale analysis is very common sense driven and not too systematic. For this reason, and since it is somewhat unnatural and hard to describe without endless example, it may be difficult to learn.

Before going into the concept, we must discuss orders of magnitude.

Orders of Magnitude and Big O Notation edit

Suppose that there are two functions and . It is said (and notated) that:

Visualization of the limit superior and limit inferior of f(x) as x increases without bound.

It's worth understanding fully this possibly obscure definition. , short for limit superior, is similar to the "regular" limit, only it is the limit of the upper bound. This concept, alongside limit inferior, is illustrated at right. This intuitive analysis will have to suffice here, as the precise definition and details of these special limits are rather complicated.

As a further example, the limits of the cosine function as increases without bound are:

With this somewhat off topic technicality hopefully understood, the statement that:

Is saying that near , the order (or size, or magnitude) of is bounded by . It's saying that isn't crazily bigger then near , and this is precisely notated by saying that the limit superior is bounded (the "regular" limit wouldn't work since oscillations would ruin everything). The notation involving the big O is rather surprisingly called "big O notation", it's also known as Landau notation.

Take, for example, at different points:

In the first case, the term will easily dominate for large . Even if the coefficient on that term is very near zero, for large enough x that term will dominate. Hence, the function is of order for large .

In the second case, near the first two terms are limiting to zero while the constant term, , isn't changing at all. It is said to be of order , notated as order above. Why O(1) and not O(2)? Both are correct, but O(1) is preferred since it is simpler and more similar to .

This may put forth an interesting question: what would happen if the constant term was dropped? Both of the remaining terms would limit to zero. Since we are looking at x near zero and not at zero,

This is because as approaches zero, the quadratic term gets smaller much faster then the linear term. It would also be correct, though kind of useless, to call the quantity O(1). It would be incorrect to state that the quantity is of order zero since the limit would not exist, not under any circumstance.

As implied above, is by no means a unique function. All of the following statements are true, simply because the limit superior is bounded:

While technically correct, these are very misleading statements. Normally, the simplest, smallest magnitude function g(x) is selected.

Before ending the monotony, it should also be mentioned that it's not necessary for to be smaller then near , only the limit superior must exist. The following two statements are also true:

But again, these are misleading and it's most proper to state that:

A relatively simple concept has been beaten to death, to the point of being confusing. It'll be more clear in context, and it'll be used more in later chapters for different purposes.

Scale Analysis on a Two Term ODE edit

Previously, the following BVP was considered:

Wipe away any memory of solving this simple problem, the concepts of this chapter do not look at the actual solution. The variables are nondimensionalized by defining new variables:

So that is scaled by , and is scaled by an unknown scale . Now note that, thanks to the scaling:

These are both true near zero. will be O(1) (this is read "of order one") when its scale is properly chosen. Using the chain rule, the ODE was turned into the following:

Now, if both and are of order one, then it is reasonable to assume that, at least at some point in the domain of interest:

This is by no means guaranteed to be true, however it is reasonable.

To identify the velocity scale, we can set the derivative equal to one and solve. There is nothing "illegal" about purposely setting the derivative equal to one since all we need is some equation to specify an unknown constant, . There is much freedom in defining this scale, because what this constant is and how it's found has no effect on the validity of the solution of the BVP (as long as it's not something stupid like ).

Since:

It follows that:

This velocity scale may be thought of as a characteristic velocity. It's a number that shows us what to expect the velocity to be like. The velocity could actually be larger or smaller, but this gives a general idea. Furthermore, this scale tells us how changing various physical parameters will affect the velocity; there are four of them summarized into one constant.

Compare this result to the coefficient (underlined) on the complete solution, with u dimensional and nondimensional:

They differ by a factor of , but they are of the same order of magnitude. So, indeed, characterizes the velocity.

Words like "reasonable" and "assume" were used a few times, words that would normally lead to the uglier word "approximate". Relax: the BVP itself hasn't been approximated or otherwise violated in any way. We just used scale analysis to pick a velocity scale that:

  • Turned the ODE into something very easy to look at:
  • Gained good insight into what kind of velocity the solution will produce without finding the actual solution.

Note that a zero pressure gradient can no longer show itself in the ODE. This is by no means a restriction, since a zero pressure gradient would result in a zero velocity scale which would unconditionally result in zero velocity.

Scale Analysis on a Three Term PDE edit

The last section was still more of nondimensionalization then it was scale analysis. To just begin getting deeper into the subject, we'll consider the pressure driven transient parallel plate IBVP, identical to the above only with a time component:

See the change of variables chapter to recall the origins of this problem. Scales are defined as follows:

Again, the scale on is picked to make it an order one quantity (based on the BCs), and the scales on and are just letters representing unknown quantities.

The chain rule has been used to define derivatives in terms of the new variables. Instead of taking this path, recall that, given variables and (for the sake of example) and their respective scales and :

So that makes things much easier. Performing the change of variables:

In the previous section, there was one unknown scale and one equation, so the unknown scale could be easily and uniquely isolated. Now, there are two unknown scales but only one equation (no, the BCs/IC will not help). What to do?

The physical meaning of scales may be taken into consideration. Ask: "What should the scales represent?"

There is no unique answer, but good answers for this problem are:

  • characterizes the steady state velocity.
  • characterizes the response time: the time to establish steady state.

Once again, these are picked (however, for this problem there really aren't any other choices). In order to determine the scales, the physics of each situation is considered. There may not be unique choices, but there are best choices, and these are the "correct" choices. An understanding of what each term in the PDE represents is vital to identifying these "correct" choices, and this is notated below:

For the velocity scale, a steady state condition is required. In that case, the time derivate (acceleration) must small. We could obtain the characteristic velocity associated with a steady state condition by requiring that the acceleration be something small (read: zero), stating that the second derivative is , and solving:

This is the same as the velocity scale found in the previous section. This is expected since both situations are describing the same steady state condition. The neglect of acceleration equates to what's called a balance between driving force and viscosity since driving force and viscosity are all that remain.

Getting the time scale may be a little more elusive. The time associated with achieving steady state is dictated by the acceleration and the viscosity, so it follows that the time scale may be obtained by considering a balance between acceleration and viscosity. Note that this statement has nothing to do with pressure, so it should apply to a variety of disturbances. To balance the terms, pretend that the derivatives are quantities and disregard the pressure:

This is a statement that:

  • The smaller the viscosity, the longer you wait for steady state to be achieved.
  • The smaller the separation distance, the less you wait for steady state to be achieved.

Hence, the scale describes what will affect the transient time and how. The results may seem counterintuitive, but they are verified by experiment if the pressure is truly a constant capable of combating possibly huge viscosity forces for a high viscosity fluid.

Compare these scales to constants seen in the full, dimensional solution:

The velocity scales match in order of magnitude, nothing new there. But examine the time constant (extracted from the exponential factor) and compare to the time scale:

They are of the same order with respect to the physical parameters, though they'll differ by nearly a factor of 10 when . This result is more useful then it looks. Note that after determining the velocity scale, all three terms of the equation may have been considered to isolate a time scale. This would've been a poor choice that wouldn't have agreed with the time constant above since it wouldn't be describing the required settling between viscosity and acceleration.

Suppose that, for some problem, a time dependent PDE is too hard to solve, but the steady state version is easier and it is what you're interested in. A natural question would be: "How long do I wait until steady state is achieved?"

The time scale provided by a proper scale analysis will at least give an idea. In this case, assuming that the first term of the sum in the solution is dominant, the time scale will overestimate the response time by nearly a factor of 10, which is priceless information if you're otherwise clueless. This overestimate is actually a good (safe) overestimate, it's always better to wait longer and be certain of the steady state condition. Scales in general have a tendency to overestimate.

Before closing this section, consider the actual nondimensionalization of the PDE. During the scale analysis, the coefficients of the last two terms were equated and later the coefficients of the first two terms were equated. This implies that the nondimensionalized PDE will be:

And this may be verified by substituting the expressions found for the scales into the PDE. This dimensionless PDE, too, turned out to be completely independent of the physical parameters involved, which is very convenient.

Heat Flow Across a Thin Wall edit

Now, an important utility of scale analysis will be introduced: determining what's important in an equation and, better yet, what's not.

As mentioned in the introduction to the Laplacian, steady state heat flow in a homogeneous solid may be described by, in three dimensions:

Now, suppose we're interested in the heat transfer inside a large, relatively thin wall, with differing temperatures (not necessarily uniform) on different sides of the wall. The word 'thin' is crucial, write it down on your palm right now. You should suspect that if the wall is indeed thin, the analysis could be simplified somehow, and that's what we'll do.

Not caring about what happens at the edges of the wall, a BVP may be written:

is the thickness of the wall (implication: is the coordinate across the wall). Suppose that the wall is a boxy object with dimensions . Using the box dimensions as scales,

Only the scale of is unknown. Substituting into the PDE,

Note that the scale on divided out — so a logical choice must be made for it's scale; in this case it'd be an extreme boundary value (ie, the maximum value of ), let's say it's chosen and taken care of. Thanks to this scaling and the rearrangement that followed, we may get a good idea of the magnitude of each term in the equation:

Each derivative is approximately . But what about the squared ratio of dimensions? This is called a dimensionless parameter. Look at your palm now (the one you don't write with), recall the word "thin". "Thin" in this case means exactly the same thing as:

And if the ratio above is much smaller then , then the square of this ratio is even smaller. Our dimensionless parameter is called a small parameter. When a parameter is small, there are many opportunities to simplify analysis; the simplest would be to state that it's too small to matter, so that:

What was just done couldn't have been justified without scaling variables so that their derivatives are (likely) , since you have no idea what order they are otherwise. We know that each derivative is hopefully , but some of these derivatives carry a very small factor. Only then can terms be righteously dropped. The dimensionless BVP becomes:

Note that it's still a partial differential equation (the and varialbes haven't been made irrelevant – look at the BCs). Also note that scaling on is undone since it cancels out anyway (the scale could've still been picked as, say, a maximum boundary value). This problem may be solved very simply by integrating the PDE twice with respect to , and then considering the BCs:

and are integration "constants". The first BC yields:

And the second:

The solution is:

It's just saying that the temperature varies linearly from one wall face to the other. It's worth noting that in practice, once scaling is complete, the hats on variables are "dropped" for neatness and to prevent carpal tunnel syndrome.

Words of Caution edit

Failure of one dimensional flow approximation.

"Extreme caution" is more fitting.

In the wall heat transfer problem, we took the partial derivatives in and to be , and this was justified by the scaling: , and are , so the derivatives must be so as well. Right?

Not necessarily. That they're is a linear approximation, however if the function is significantly nonlinear with respect to a variable of interest, then the derivatives may not be as as thought. In this problem, one way that this can happen is if the temperature at each wall face (the functions and ) have large and differing Laplacians. This will result in three dimensional heat conduction.

Examine carefully the image at right. Suppose that side length is ten times the wall thickness; and have zero Laplacians everywhere except along circles where temperatures suddenly change. At these locations, the Laplacian can be huge (unbounded if the sudden changes are discontinuities). This will suggest that the derivatives in question are not O(1) but much greater, so that these terms become important even though in this case:

Which is as required by the scale analysis: the wall is clearly thin. But apparently, the small thinness ratio multiplied by the large derivatives leads to significant quantities.

Both the exact solution and the solution to the problem approximated through scaling are shown at the location of a cutting plane. The exact solution shows at least two dimensional heat transfer, while the solution of the simplified solution shows only one dimensional heat transfer and is substantially different.

It's easy to see why the 1D approximation fails even without knowing what a Laplacian is: this is a heat transfer problem involving the diffusion of temperature, and the temperature will clearly need to diffuse along near the sudden changes within the wall (can't say the same about the BCs since they're fixed).

The caption of the figure starts with the word "failure". Is it really a failure? That depends on what you're looking for, it may or may not be. Note that if the wall were even thinner and the sudden jumps not discontinuities, the exact and 1D solutions could again eventually become indistinguishable.



Eigenfunction Expansion

Non-homogeneous partial differential equations can be solved via the method of eigenfunction expansion.