# Numerical Methods/Solution of Linear Equation Systems

## Definitions and Basics

A linear equation system is a set of linear equations to be solved simultaneously. A linear equation takes the form

$a_{1}*x_{1}+a_{2}*x_{2}+\ldots +a_{n}*x_{n}=b\quad ,\,$

where the $n+1\,$  coefficients $a_{1}\ldots a_{n}\,$  and $b\,$  are constants and $x_{1}\ldots x_{n}\,$  are the $n$ unknowns. Following the notation above, a system of linear equations is denoted as

${\begin{matrix}a_{11}*x_{1}+a_{12}*x_{2}+\ldots +a_{1n}*x_{n}&=&b_{1}&\\a_{21}*x_{1}+a_{22}*x_{2}+\ldots +a_{2n}*x_{n}&=&b_{2}&\\&\vdots &&\\a_{m1}*x_{1}+a_{m2}*x_{2}+\ldots +a_{mn}*x_{n}&=&b_{m}&\quad .\end{matrix}}\,$

This system consists of $m\,$  linear equations, each with $n+1\,$  coefficients, and has $n\,$  unknowns which have to fulfill the set of equations simultaneously. To simplify notation, it is possible to rewrite the above equations in matrix notation:

$\mathbf {A} \cdot \mathbf {x} =\mathbf {b} \quad .\,$

The elements of the $m\times n\,$  matrix $\mathbf {A} \,$  are the coefficients of the equations, $a_{ij}\,$ , and the vectors $\mathbf {x} \,$  and $\mathbf {b} \,$  have the elements $x_{i}\,$  and $b_{i}\,$  respectively. In this notation, each row forms a linear equation.

## Over- and Under-Determined Systems

In order for a solution $\mathbf {x} \,$  to be unique, there must be at least as many equations as unknowns. In terms of matrix notation this means that $m\geq n\,$ . However, if a system contains more equations than unknowns ($m>n$ ), it is very likely (not to say the rule) that no solution exists at all. Such systems are called over-determined since they have more equations than unknowns. They require special mathematical methods to be solved approximately. The most common one is the least-squares method, which aims at minimizing the sum of the squared errors made in each equation when trying to solve the system. Such problems commonly occur in measurement or data-fitting processes.

**Example:** Assume an arbitrary triangle: suppose one measures all three inner angles $\alpha ,\beta ,\gamma$  with a precision of $\pm 0.2^{\circ }\,$ , and furthermore assume that the lengths of the sides a, b and c are known exactly. From trigonometry it is known that, using the law of cosines, one can compute an angle or the length of a side if all the other sides and angles are known. But as is known from geometry, the inner angles of a planar triangle must always add up to $180^{\circ }\,$ . So we have three laws of cosines plus the rule for the sum of the angles: a total of four equations for three unknowns, which gives an over-determined problem.

On the other hand, if $m<n\,$  the problem arises that the solution is not unique, as at least one unknown can be chosen freely. Such systems are called under-determined. Again, mathematical methods exist to treat such problems; however, they will not be covered in this text.

This chapter will mainly concentrate on the case where $m=n\,$  and assumes so unless mentioned otherwise.

## Exact Solution of Linear Systems

Solving a system $\mathbf {Ax} =\mathbf {b} \,$  in terms of linear algebra is easy: just multiply the system with $\mathbf {A} ^{-1}$  from the left, resulting in $\mathbf {x} =\mathbf {A} ^{-1}\mathbf {b}$ . However, finding $\mathbf {A} ^{-1}$  is (except for trivial cases) very hard. The following sections describe methods to find an exact (up to rounding-errors) solution to the problem.

### Diagonal and Triangular Systems

A diagonal matrix has only entries in the main diagonal:

$a_{ij}\equiv 0\;\forall \;i\neq j$

The inverse of $\mathbf {A}$  in such a case is simply a diagonal matrix with inverse entries, meaning

$\mathbf {A} ^{-1}={\mbox{diag}}(1/a_{ii})\quad .$

It follows that a diagonal system has the solution $x_{i}=b_{i}/a_{ii}$ , which is very easy to compute.

An upper triangular system is defined as

$a_{ij}=0\quad \forall j<i\quad ,$

and a lower triangular system as

$a_{ij}=0\quad \forall j>i\quad .$

Backward-substitution is the process of solving an upper triangular system

$x_{i}={\begin{cases}b_{i}/a_{ii}\quad &{\mbox{if}}\;i=N\\{\frac {1}{a_{ii}}}\left(b_{i}-\sum _{j=i+1}^{N}{a_{ij}x_{j}}\right)\quad &{\mbox{else}}\quad .\end{cases}}$

Forward-substitution, on the other hand, is the same procedure for lower triangular systems

$x_{i}={\begin{cases}b_{i}/a_{ii}\quad &{\mbox{if}}\;i=1\\{\frac {1}{a_{ii}}}\left(b_{i}-\sum _{j=1}^{i-1}{a_{ij}x_{j}}\right)\quad &{\mbox{else}}\quad .\end{cases}}$
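The two substitution schemes above can be sketched in Python as follows; this is a minimal version without pivoting or singularity checks, so the diagonal entries of `A` are assumed to be nonzero:

```python
import numpy as np

def backward_substitution(A, b):
    """Solve an upper triangular system A x = b, starting from the last row."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the already-known contributions a_ij * x_j for j > i
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

def forward_substitution(A, b):
    """Solve a lower triangular system A x = b, starting from the first row."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # subtract the already-known contributions a_ij * x_j for j < i
        x[i] = (b[i] - A[i, :i] @ x[:i]) / A[i, i]
    return x
```

Note that the diagonal case is covered as well: for a diagonal matrix both loops degenerate to $x_{i}=b_{i}/a_{ii}$ .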

### Gauss-Jordan Elimination

Instead of finding $\mathbf {A} ^{-1}$  this method relies on row operations. According to the laws of linear algebra, the rows of an equation system can be multiplied by a nonzero constant without changing the solution. Additionally, rows can be added to and subtracted from one another. This leads to the idea of transforming the system in such a way that $\mathbf {A}$  attains a structure which allows for easily solving for $\mathbf {x}$ . One such structure is the diagonal matrix mentioned above.

Gauss-Jordan Elimination brings the matrix $\mathbf {A}$  into diagonal form. To simplify the procedure, one often uses an adapted scheme. First, the matrix $\mathbf {A}$  and the right-hand vector $\mathbf {b}$  are combined into the augmented matrix

$\left[\mathbf {A} ,\mathbf {b} \right]={\begin{bmatrix}a_{11}&a_{12}&\cdots &a_{1N}&b_{1}\\\vdots &\vdots &\ddots &\vdots &\vdots \\a_{N1}&a_{N2}&\cdots &a_{NN}&b_{N}\end{bmatrix}}$

To illustrate, an easy-to-understand yet efficient algorithm can be built from four basic components:

• gelim: the main function iterates through a stack of reduced equations building up the complete solution one variable at a time, through a series of partial solutions.
• stack: calls reduce repeatedly, producing a stack of reduced equations, ordered from smallest (2 elements, such as <ax = b>) to largest.
• solve: solves for one variable, given a reduced equation and a partial solution. For example, given the reduced equation <aw bx cy = d> and the partial solution <x y>, w = (d - bx - cy)/a. Now the partial solution <w x y> is available for the next round, e.g. <au bv cw dx = e>.
• reduce: takes the first equation off the top and pushes it onto the stack; then produces a residual, a reduced matrix, by normalizing each remaining equation and the first equation by their leading elements and subtracting, e.g. b[j][k]/b[j][0] - a[k]/a[0], where a is the first equation and b[j] is the j-th remaining one. This eliminates the first element in each of the lower equations by subtracting one from one, so only the remaining elements need be kept: the residual is an output matrix with one fewer row and one fewer column than the input matrix. It is then used as the input for the next iteration.

It should be noted that multiplication could also be used in place of division; however, on larger matrices (e.g. n=10) this has a cascading effect, eventually producing overflows (infinities) and NaNs. The division, looked at statistically, has the effect of normalizing the reduced matrices, producing numbers with a mean closer to zero and a smaller standard deviation; for randomly generated data this produces reduced matrices with entries in the vicinity of ±1.
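The four components described above might be sketched in plain Python as follows. This is a minimal version that assumes the leading element of every equation encountered is nonzero (i.e. no pivoting); function names follow the text:

```python
def reduce(M):
    """Pop the first equation; eliminate the first unknown from the rest."""
    first = M[0]
    residual = []
    for row in M[1:]:
        # normalize both rows by their leading elements and subtract;
        # the leading element cancels, so only columns 1..n are kept
        residual.append([row[k] / row[0] - first[k] / first[0]
                         for k in range(1, len(row))])
    return first, residual

def stack(M):
    """Call reduce repeatedly, collecting the popped equations."""
    eqs = []
    while len(M[0]) > 2:
        first, M = reduce(M)
        eqs.append(first)
    eqs.append(M[0])          # the smallest equation, <ax = b>
    return eqs

def solve(eq, partial):
    """Solve eq for its first unknown, given the later unknowns in partial."""
    *coeffs, b = eq
    a, rest = coeffs[0], coeffs[1:]
    return (b - sum(c * v for c, v in zip(rest, partial))) / a

def gelim(M):
    """Build the stack of reduced equations, then back-substitute."""
    partial = []
    for eq in reversed(stack(M)):
        partial = [solve(eq, partial)] + partial
    return partial
```

Here `M` is a list of rows `[a_1, ..., a_n, b]`, and `gelim` returns the solution as a list `[x_1, ..., x_n]`.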

Continuation still to be written

As it stands, the algorithm shows that it is not necessary to bring a system into full diagonal form: it is sufficient to bring it into triangular (either upper or lower) form, since the system can then be solved by backward or forward substitution, respectively.

### LU-Factorization

This section needs to be written

## Approximate Solution of Linear Systems

This section needs to be written

### Jacobi Method

The Jacobi method is an iterative scheme. Starting from an initial guess $\mathbf {x} ^{(0)}$ , each component is updated using only the values of the previous iterate:

$x_{i}^{(k+1)}={\frac {1}{a_{ii}}}\left(b_{i}-\sum _{j\neq i}a_{ij}x_{j}^{(k)}\right)\quad .$

The iteration converges, for example, if $\mathbf {A}$  is strictly diagonally dominant.
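A minimal sketch of the Jacobi iteration in Python, assuming a matrix for which the method converges (e.g. a strictly diagonally dominant one); tolerance and iteration limit are arbitrary choices:

```python
import numpy as np

def jacobi(A, b, tol=1e-10, max_iter=500):
    """Jacobi iteration: every component is updated from the previous iterate only."""
    n = len(b)
    x = np.zeros(n)                       # initial guess x^(0) = 0
    for _ in range(max_iter):
        x_new = np.empty(n)
        for i in range(n):
            # sum over all j != i, using only values from the previous iterate
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x_new[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x_new - x, ord=np.inf) < tol:
            return x_new
        x = x_new
    return x
```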

### Gauss-Seidel Method

The Gauss-Seidel method is closely related to the Jacobi method. The only difference is that each newly computed component $x_{i}^{(k+1)}$  is used immediately in the computation of the remaining components of the same iteration, which typically speeds up convergence. As an example, consider the (strictly diagonally dominant) system

${\begin{matrix}10v+w+x-2y+z&=&-1\\v-20w-2x+y+z&=&20\\v+w+10x-y-z&=&-1\\-v-2w+x+50y+z&=&2\\v+w+x+y+100z&=&-1\quad .\end{matrix}}\,$
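A Gauss-Seidel sketch in Python, applied to the system above; function and parameter names are illustrative, and the tolerance and iteration limit are arbitrary choices:

```python
import numpy as np

def gauss_seidel(A, b, tol=1e-10, max_iter=500):
    """Gauss-Seidel iteration: new components are used as soon as they are known."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # x[:i] already holds the new values, x_old[i+1:] the old ones
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x

# the example system from above (unknowns ordered v, w, x, y, z)
A = np.array([[10.0,   1.0,  1.0, -2.0,   1.0],
              [ 1.0, -20.0, -2.0,  1.0,   1.0],
              [ 1.0,   1.0, 10.0, -1.0,  -1.0],
              [-1.0,  -2.0,  1.0, 50.0,   1.0],
              [ 1.0,   1.0,  1.0,  1.0, 100.0]])
b = np.array([-1.0, 20.0, -1.0, 2.0, -1.0])
solution = gauss_seidel(A, b)
```

Since the matrix is strictly diagonally dominant, the iteration converges and `A @ solution` reproduces `b` up to the chosen tolerance.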


### SOR Algorithm

SOR is an abbreviation for Successive Over-Relaxation. It is an iterative scheme that uses a relaxation parameter $\omega$  and generalizes the Gauss-Seidel method, which is recovered in the special case $\omega =1$ .

Given a square system of n linear equations with unknown x:

$A\mathbf {x} =\mathbf {b}$

where:

$A={\begin{bmatrix}a_{11}&a_{12}&\cdots &a_{1n}\\a_{21}&a_{22}&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{n1}&a_{n2}&\cdots &a_{nn}\end{bmatrix}},\qquad \mathbf {x} ={\begin{bmatrix}x_{1}\\x_{2}\\\vdots \\x_{n}\end{bmatrix}},\qquad \mathbf {b} ={\begin{bmatrix}b_{1}\\b_{2}\\\vdots \\b_{n}\end{bmatrix}}.$

Then A can be decomposed into a diagonal component D, and strictly lower and upper triangular components L and U:

$A=D+L+U,$

where

$D={\begin{bmatrix}a_{11}&0&\cdots &0\\0&a_{22}&\cdots &0\\\vdots &\vdots &\ddots &\vdots \\0&0&\cdots &a_{nn}\end{bmatrix}},\quad L={\begin{bmatrix}0&0&\cdots &0\\a_{21}&0&\cdots &0\\\vdots &\vdots &\ddots &\vdots \\a_{n1}&a_{n2}&\cdots &0\end{bmatrix}},\quad U={\begin{bmatrix}0&a_{12}&\cdots &a_{1n}\\0&0&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\0&0&\cdots &0\end{bmatrix}}.$
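Assuming NumPy is available, this splitting can be computed directly; the matrix below is an arbitrary example:

```python
import numpy as np

# an arbitrary example matrix
A = np.array([[4.0, -1.0,  1.0],
              [2.0,  6.0,  0.0],
              [1.0,  1.0,  5.0]])

D = np.diag(np.diag(A))   # diagonal part
L = np.tril(A, k=-1)      # strictly lower triangular part
U = np.triu(A, k=1)       # strictly upper triangular part
```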

The system of linear equations may be rewritten as:

$(D+\omega L)\mathbf {x} =\omega \mathbf {b} -[\omega U+(\omega -1)D]\mathbf {x}$

for a constant ω > 1.

The method of successive over-relaxation is an iterative technique that solves the left hand side of this expression for x, using the previous value of x on the right hand side. Analytically, this may be written as:

$\mathbf {x} ^{(k+1)}=(D+\omega L)^{-1}{\big (}\omega \mathbf {b} -[\omega U+(\omega -1)D]\mathbf {x} ^{(k)}{\big )}.$

However, by taking advantage of the triangular form of $(D+\omega L)$ , the elements of $\mathbf {x} ^{(k+1)}$  can be computed sequentially using forward substitution:

$x_{i}^{(k+1)}=(1-\omega )x_{i}^{(k)}+{\frac {\omega }{a_{ii}}}\left(b_{i}-\sum _{j>i}a_{ij}x_{j}^{(k)}-\sum _{j<i}a_{ij}x_{j}^{(k+1)}\right),\quad i=1,2,\ldots ,n.$

The choice of relaxation factor is not necessarily easy, and depends upon the properties of the coefficient matrix. For symmetric, positive-definite matrices it can be proven that 0 < ω < 2 will lead to convergence, but we are generally interested in faster convergence rather than just convergence.
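The forward-substitution form of the update can be sketched in Python; this is a minimal version, and the default `omega`, tolerance, and iteration limit are arbitrary choices:

```python
import numpy as np

def sor(A, b, omega=1.1, tol=1e-10, max_iter=1000):
    """SOR: forward substitution on (D + omega*L), blended with the old iterate."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # x[:i] holds the (k+1)-values, x_old[i+1:] the k-values
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (1 - omega) * x_old[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x
```

With `omega=1` this reduces exactly to the Gauss-Seidel update.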