Last modified on 22 October 2014, at 09:06

Numerical Methods/Equation Solving

Solution of Algebraic and Transcendental EquationsEdit

An equation of the type f(x)=0 is either algebraic or transcendental.

E.g, these equations are algebraic.

2x=5 \quad x^2+x=1 \quad x^7=x(1+2x)

and these are transcendental

 \sin x = 0 \quad \quad e^\sqrt{x} = \pi \quad \tan x = x

While roots can be found directly for algebraic equations of fourth order or lower, and for a few special transcendental equations, in practice we need to solve equations of higher order and also arbitrary transcendental equations.

As analytic solutions are often either too cumbersome or simply do not exist, we need to find an approximate method of solution. This is where numerical analysis comes into the picture.

Some Useful ObservationsEdit

  • The total number of roots an algebraic equation can have is the same as its degree.
  • An algebraic equation can have at most as many positive roots as the number of changes of sign in f(x).
  • An algebraic equation can have at most as many negative roots as the number of changes of sign in f(-x).
  • In an algebraic equation with real coefficients, complex roots occur in conjugate pairs
  • If f(x)=a_0x^n+a_1x^{n-1}+a_2x^{n-2}+...+a_{n-1}x+a_n with roots \alpha_1, \alpha_2, ..., \alpha_n then the following hold good:
    • \sum_i\alpha_i=\frac{-a_1}{a_0}
    • \sum_{i<j}{\alpha_i\alpha_j}=\frac{a_2}{a_0}
    • \prod_i\alpha_i=(-1)^n \frac{a_n}{a_0}
  • If f(x) is continuous in the interval [a,b] and f(a)f(b)<0 then a root must exist in the interval (a,b)

Initial ApproximationEdit

The last point about the interval is one of the most useful properties numerical methods use to find the roots. All of them have in common the requirement that we need to make an initial guess for the root. Practically, this is easy to do graphically. Simply plot the equation and make a rough estimate of the solution. Analytically, we can usually choose any point in an interval where a change of sign takes place. However, this is subject to certain conditions that vary from method to method.

ConvergenceEdit

A numerical method to solve equations will be a long process. We would like to know, if the method will lead to a solution (close to the exact solution) or will lead us away from the solution. If the method, leads to the solution, then we say that the method is convergent. Otherwise, the method is said to be divergent.i.e, in case of linear and non linear interpolation convergence means tends to 0.

Rate of ConvergenceEdit

Various methods converge to the root at different rates. That is, some methods are slow to converge and it takes a long time to arrive at the root, while other methods can lead us to the root faster. This is in general a compromise between ease of calculation and time.

For a computer program however, it is generally better to look at methods which converge quickly. The rate of convergence could be linear or of some higher order. The higher the order, the faster the method converges.

If e_i is the magnitude of the error in the ith iteration, ignoring sign, then the order is n if \frac{e_{i+1}}{e_i^n} is approximately constant.

It is also important to note that the chosen method will converge only if e_{i+1}<e_i.

Bisection MethodEdit

This is one of the simplest methods and is strongly based on the property of intervals. To find a root using this method, the first thing to do is to find an interval [a,b] such that f(a) \cdot f(b) < 0 . Bisect this interval to get a point (c,f(c)). Choose one of a or b so that the sign of f(c) is opposite to the ordinate at that point. Use this as the new interval and proceed until you get the root within desired accuracy.

ExampleEdit

Solve xe^x-3=0 correct up to 2 decimal places.

f(x)=4x^4+3x^3+2x-7

\Rightarrow f(2) \cdot f(3) < 0 \Rightarrow \epsilon = \frac{1}{2}10^{-2} \Rightarrow i = \frac{2.3010}{0.3010} = 8 \Rightarrow x_1=2.5

f(x_1)=5.625 > 0

\Rightarrow x_2 = 2.25 \Rightarrow ... x_8 = 2.09

Error AnalysisEdit

The maximum error after the ith iteration using this process will be given as

\epsilon_i=\frac{|b-a|}{2^i}

\Rightarrow i\ge\frac{[\log(b-a)-\log\epsilon_i]}{\log 2}

As the interval at each iteration is halved, we have \frac{\epsilon_{i+1}}{\epsilon_i} = \frac{1}{2}. Thus this method converges linearly.

If we are interested in the number of iterations the Bisection Method needs to converge to a root within a certain tolerance than we can use the formula for the maximum error.

ExampleEdit

How many iterations do you need to get the root if you start with a = 1 and b = 2 and the tolerance is 10−4?

The error \epsilon_i needs to be smaller than 10−4. Use the formula for the maximum error:

\epsilon_i = 2^{-i} \cdot (2-1) < 10^{-4}
\epsilon_i = 2^{-i} < 10^{-4}

Solve for i using log rules

\displaystyle\log_{10}{2^{-i}} < \log_{10}{10^{-4}}
\displaystyle -i \cdot \log_{10}{2} < -4
\displaystyle i > \frac{4}{\log_{10}{2}} = 13.29 = 14

Hence 14 iterations will ensure an approximation to the root accurate to \displaystyle 10^{-4}. Note: the error analysis only gives a bound approximation to the error; the actual error may be much smaller.

False Position MethodEdit

The false position method (sometimes called the regula falsi method) is essentially same as the bisection method -- except that instead of bisecting the interval, we find where the chord joining the two points meets the X axis. The roots are calculated using the equation of the chord, i.e. putting y = 0 in

y-f(x_0)=\frac{f(x_1)-f(x_0)}{x_1-x_0}(x-x_0).

The rate of convergence is still linear but faster than that of the bisection method. Both these methods will fail if f has a double root.

ExampleEdit

Consider f(x)=x2-1. We already know the roots of this equation, so we can easily check how fast the regula falsi method converges.

For our initial guess, we'll use the interval [0,2].

Since f is concave upwards and increasing, a quick sketch of the geometry shows that the chord will always intersect the x-axis to the left of the solution. This can be confirmed by a little algebra.

We'll call our nth iteration of the interval [an, 2]

The chord intersects the x-axis when

-(a_n^2-1)=\frac{(2^2-1)-(a_n^2-1)}{2-a_n} (x-a_n)

Rearranging and simplifying gives

x=\frac{1+2a_n}{2+a_n}

Since this is always less than the root, it is also an+1

The difference between an and the root is en=an-1, but

e_{n+1}=\frac{1+2a_n}{2+a_n}-1 = \frac{a_n-1}{a_n+2} =
\frac{1}{a_n+2}e_n

This is always smaller than en when an is positive. When an approaches 1, each extra iteration reduces the error by two-thirds, rather than one-half as the bisection method would.

The order of convergence of this method is 2/3 and is linear.

In this case, the lower end of the interval tends to the root, and the minimum error tends to zero, but the upper limit and maximum error remain fixed. This is not uncommon.

Fixed Point Iteration (or Staircase method)Edit

If we can write f(x)=0 in the form x=g(x), then the point x would be a fixed point of the function g (that is, the input of g is also the output). Then an obvious sequence to consider is

x_{n+1}=g(x_n) \,

If we look at this on a graph we can see how this could converge to the intersection.

Any function can be written in this form if we define g(x)=f(x)+x, though in some cases other rearrangements may prove more useful.

Error analysisEdit

We define the error at the nth step to be

e_n=x_n-x \mbox{ where } x=g(x) \,

Then we have

\begin{matrix}
e_{n+1} & = & x_{n+1}-x & \\
& = & g(x_n)-x & \\
& = & g(x+e_n) -x & \\
& = & -x + g(x)+ e_n g^\prime(x) + \cdots & \\
& = & e_n g^\prime(x) + \cdots & \mbox{since } g(x)=x
\end{matrix}

So, when |g′(x)|<l, this sequence converges to a root, and the error will be approximately proportional to n.

Because the relationship between en+1 and en is linear, we say that this method converges linearly, if it converges at all.

When g(x)=f(x)+x this means that if

x_{n+1}=x_n+f(x_n) \,.

converges to a root, x, of f then

 -2 < f^\prime(x) < 0 \,

Note that this convergence will only happen for a certain range of x. If the first estimate is outside that range then no solution will be found.

Also note that although this is a necessary condition for convergence, it does not guarantee convergence. In the error analysis we neglected higher powers of en, but we can only do this if en is small. If our initial error is large, the higher powers may prevent convergence, even when the condition is satisfied.

If |g′(x)|<1 is true at the root, the iteration sequence will converge in some interval around the root, which may be smaller than the interval where |g′(x)|<1. If |g′(x)| isn't smaller than one at the root, the iteration will not converge to that root.

ExampleEdit

Lets consider f(x)=x^3+x-2, which we can see has a single root at x=1. There are several ways f(x)=0 can be written in the desired form, x=g(x).

The simplest is

1):x_{n+1}=x_n+f(x_n)=x_n^3 +2x_n -2

In this case, g'(x)=3x^2+2, and the convergence condition is

1>|g'(x)|=3x^2+2, \qquad -1>3x^2

Since this is never true, this doesn't converge to the root.

2)An alternate rearrangement is

x_{n+1}=2-x_n^3

This converges when

1 > |g'(x)|= |-3x^2|, \qquad x^2 < \frac{1}{3},
\qquad |x| < \frac{1}{\sqrt{3}}

Since this range does not include the root, this method won't converge either.

3)Another obvious rearrangement is

x_{n+1}=\sqrt[3]{2-x_n}

In this case the convergence condition becomes

\frac{1}{3}\left| (2-x_n)^{-\frac{2}{3}} \right| <1, \qquad
\left(2-x_n \right)^{-2} < 3^3, \qquad
|x_n-2|>\sqrt{27}

Again, this region excludes the root.

4)Another possibility is obtained by dividing by x2+1

x_{n+1}=\frac{2}{x_n^2+1}

In this case the convergence condition becomes

\frac{4|x|}{(1+x^2)^2}<1, \qquad 4|x| < (1+x^2)^2

Consideration of this inequality shows it is satisfied if x>1, so if we start with such an x, this will converge to the root.

Clearly, finding a method of this type which converges is not always straightforwards.

Newton-RaphsonEdit

In numerical analysis, Newton's method (also known as the Newton–Raphson method or the Newton–Fourier method) is an efficient algorithm for finding approximations to the zeros (or roots) of a real-valued function. As such, it is an example of a root-finding algorithm.

Any zero-finding method (Bisection Method, False Position Method, Newton-Raphson, etc.) can also be used to find a minimum or maximum of such a function, by finding a zero in the function's first derivative, see Newton's method as an optimization algorithm.

Description of the methodEdit

The idea of the Newton-Raphson method is as follows: one starts with an initial guess which is reasonably close to the true root, then the function is approximated by its tangent line (which can be computed using the tools of calculus), and one computes the x-intercept of this tangent line (which is easily done with elementary algebra). This x-intercept will typically be a better approximation to the function's root than the original guess, and the method can be iterated. Suppose f : [a, b] → R is a differentiable function defined on the interval [a, b] with values in the real numbers R. The formula for converging on the root can be easily derived. Suppose we have some current approximation xn. Then we can derive the formula for a better approximation, xn+1 by referring to the diagram on the right. We know from the definition of the derivative at a given point that it is the slope of a tangent at that point.

We can get better convergence if we know about the function's derivatives. Consider the tangent to the function:

Tangent is in red; the function itself in blue.

Near any point, the tangent at that point is approximately the same as f('x) itself, so we can use the tangent to approximate the function.

The tangent through the point (xn, 'f(xn)) is

\frac{y-f(x_n)}{x-x_n}=f'(x_n)

The next approximation, xn+1, is where the tangent line intersects the axis, so where y=0. Rearranging, we find

x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}

Error analysisEdit

Again, we define the root to be x, and the error at the nth step to be

e_n=x_n-x \,

Then the error at the next step is

e_{n+1} =  e_n - \frac{f(x)+e_n f'(x)+ \frac{1}{2}e_n^2 f''(x)+ \cdots}
{f'(x)+e_n f''(x)+ \cdots } (1)

where we've written f as a Taylor series round its root, x. Rearranging this, and using f(x)=0, we get

\begin{matrix}
e_{n+1} & = & e_n & - & e_n \left( f'(x)+\frac{1}{2}e_n f''(x) \right) /f'(x) + \cdots \\
& = & -\frac{f''(x)}{f'(x)}e_n^2 & + & \cdots  
\end{matrix}

where we've neglected cubic and higher powers of the error, since they will be much smaller than the squared term, when the error itself is small.

Notice that the error is squared at each step. This means that the number of correct decimal places doubles with each step, much faster than linear convergence.

This sequence will converge if

 \left| \frac{f''(x)}{f'(x)}e_n^2 \right| < |e_n|, \qquad
|e_n|< \frac{f'(x)}{f''(x)}

If f′ isn't zero at the root, then there will always be a range round the root where this method converges.

If f′ is zero at the root, then on looking again at (1) we see that we get

e_{n+1} =  e_n /2 \,

and the convergence becomes merely linear.

Overall, this method works well, provided f does not have a minimum near its root, but it can only be used if the derivative is known.

ExampleEdit

Let's consider f(x)=x2-a. Here, we know the roots exactly, so we can see better just how well the method converges.

We have

x_{n+1} = x_n-\frac{f(x_n)}{f'(x_n)} = x_n-\frac{x_n^2-a}{2x_n} 
= \frac{x_n^2+a}{2x_n} = \frac{1}{2}\left( x_n + \frac{a}{x_n} \right)

This method is easily implemented, even with just pen and paper, and has been used to rapidly estimate square roots since long before Newton.

The nth error is en=xn-√a, so we have

\begin{matrix}
e_{n+1} & = & \frac{(e_n+\sqrt{a})^2+a}{2(\sqrt{a}+e_n)} - \sqrt{a} \\
& = & \frac{2a+2e_n\sqrt{a}+e_n^2}{2(\sqrt{a}+e_n)} - \sqrt{a} \\
& = & \frac{2(\sqrt{a}+e_n)\sqrt{a}+e_n^2}{2(\sqrt{a}+e_n)} - \sqrt{a} \\
& = & \frac{e_n^2}{2(\sqrt{a}+e_n)}
\end{matrix}

If a=0, this simplifies to en/2, as expected.

If a>0, en+1 will be positive, provided en is greater than -√a, i.e provided xn is positive. Thus, starting from any positive number, all the errors, except perhaps the first will be positive.

The method converges when,

 |e_{n+1}| = \left| \frac{e_n^2}{2(\sqrt{a}+e_n)} \right| < |e_n|

so, assuming en is positive, it converges when

 e_n < 2 \left( e_n + \sqrt{a} \right)

which is always true.

This method converges to the square root, starting from any positive number, and it does so quadratically.

Higher order methodsEdit

There are methods that converge even faster than Newton-Raphson. e.g

x_{n+1} = x_n-\frac{f(x_n)}{f'(x_n)}
+ \frac{1}{2} \frac{f''(x_n)f^2(x_n)}{{f'}^3(x_n)}

which converge cubicly, tripling the number of correct digits at each iteration, which is 50% faster than Newton-Raphson.

However, if iterating each step takes 50% longer, due to the more complex formula, there is no net gain in speed. For this reason, methods such as this are seldom used.


Main Page - Mathematics bookshelf - Numerical Methods