Advanced Mathematics for Engineers and Scientists/Vector Spaces: Mathematic Playgrounds

Vector Spaces: Mathematic Playgrounds

The study of partial differential equations requires a clear definition of what kind of numbers are being dealt with and in what way. PDEs are normally studied in certain kinds of vector spaces, which come with a number of properties and rules that make the analysis possible and unify many notions.

The Real Field

A field is a set that is bundled with two operations on the set called addition and multiplication which obey certain rules, called axioms. The letter $F$ will be used to represent the field, and from definition a field requires the following ( $a,b,$ and $c$ are in $F$ ):

Closure under addition and multiplication: the addition and multiplication of field members produces members of the same field.
Addition and multiplication are associative: $a+(b+c)=(a+b)+c$ and $a(bc)=(ab)c$ .
Addition and multiplication are commutative: $a+b=b+a$ and $ab=ba$ .
Multiplication is distributive over addition: $a(b+c)=ab+ac$ and $(a+b)c=ac+bc$ .
Existence of additive identity: there is an element in $F$ notated 0, sometimes called the sum of no numbers, such that $a+0=a$ .
Existence of multiplicative identity: there is an element in $F$ notated 1 different from 0, sometimes called the product of no numbers, such that $a\cdot 1=a$ .
Existence of additive inverse: there is an element in $F$ associated with $a$ notated $-a$ such that $a+(-a)=0$ .
Existence of multiplicative inverse: there is an element in $F$ associated with $a$ (if $a$ is nonzero), notated $1/a$ such that $a\cdot (1/a)=1$ .

These are called the field axioms. The field that we deal with, by far the most common one, is the real field. The set associated with the real field is the set of real numbers, and addition and multiplication are the familiar operations that everyone knows about.

Another example of a set that can form a field is the set of rational numbers, numbers which are expressible as the ratio of two integers. An example of a common set that doesn't form a field is the set of integers: there generally is no multiplicative inverse since the reciprocal of an integer generally is not an integer.
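The contrast between the rationals and the integers can be checked mechanically. The following is a minimal sketch using Python's exact `Fraction` type; the helper names `rational_inverse` and `has_inverse_in_integers` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def rational_inverse(q: Fraction) -> Fraction:
    """Return 1/q for a nonzero rational q; the result is again rational."""
    if q == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse in any field")
    return 1 / q


def has_inverse_in_integers(n: int) -> bool:
    """Return True if the reciprocal of the integer n is itself an integer."""
    # The reciprocal 1/n is an integer exactly when its reduced denominator is 1.
    return n != 0 and Fraction(1, n).denominator == 1


q = Fraction(3, 4)
q_inv = rational_inverse(q)   # the inverse stays inside the rationals
product = q * q_inv           # multiplicative inverse axiom: q * (1/q) = 1

# Only 1 and -1 have integer reciprocals, so the integers fail the axiom.
integer_checks = {n: has_inverse_in_integers(n) for n in (1, -1, 2, 5)}
```

Every nonzero rational passes the inverse check, while among the integers only 1 and -1 do, which is why the integers do not form a field.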

Note that when we say that an object is in $F$ , what is meant is that the object is a member of the set associated with the field and that it complies with the field axioms.

The Vector

Most non-mathematics students are taught that vectors are ordered groups ("tuples") of quantities. This is not the complete picture: vectors are a lot more general than that. Informally, a vector is defined as an object that can be scaled and added with other vectors. This will be made more specific soon.

Examples of vectors:

The real numbers.
Pairs, triples, etc. of real numbers.
Polynomials.
Most functions.

Examples of objects that are not vectors:

Members of the extended real numbers. Specifically, the infinity and negative infinity elements neither scale nor add.
The integers, at least when scaled by real numbers (since the result will not necessarily be an integer).

An interesting (read: confusing) fact to note is that, by the definition above, matrices and even tensors qualify as vectors since they can be scaled or added, even though these objects are considered generalizations of more "conventional" vectors, and calling a tensor a vector will lead to confusion.

The Vector Space

A vector space can be thought of as a generalization of a field.

Letting $F$ represent some field, a vector space $V$ over $F$ is a set of vectors bundled with two operations called vector addition and scalar multiplication, notated:

Vector addition: $u+v=w$ , where $u,v,w\in V$ .
Scalar multiplication: $au=v$ , where $u,v\in V$ and $a\in F$ .

The members of $V$ are called vectors, and the members of the field $F$ associated with $V$ are called scalars. Note that these operations imply closure (see the first field axiom), so that it does not have to be explicitly stated. Note also that this is essentially where a vector is defined: objects that can be added and scaled. The vector space must comply with the following axioms ( $u,v,$ and $w$ are in $V$ ; $a$ and $b$ are in $F$ ):

Addition is associative: $u+(v+w)=(u+v)+w$ .
Addition is commutative: $u+v=v+u$ .
Scalar multiplication is distributive over vector addition: $a(u+v)=au+av$ .
Scalar multiplication is distributive over field addition: $(a+b)u=au+bu$ .
Scalar and field multiplication are compatible: $a(bu)=(ab)u$ .
Existence of additive identity: there is an element in $V$ notated 0 such that $u+0=u$ .
Existence of additive inverse: there is an element in $V$ associated with $u$ notated $-u$ such that $u+(-u)=0$ .
Existence of multiplicative identity: there is an element in $F$ notated 1 different from 0 such that $1u=u$ .

An example of a vector space is one where polynomials are vectors over the real field. An example of a space that is not a vector space is one where vectors are rational numbers over the real field, since scalar multiplication can lead to vectors that are not rational (implied closure under scalar multiplication is violated).
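As a rough illustration of polynomials acting as vectors over the real field, the sketch below stores a polynomial as a list of coefficients (lowest degree first) and spot-checks one axiom, distributivity of scalar multiplication over vector addition. The representation and the function names are assumptions made for this example, and floating-point numbers only stand in for real numbers.

```python
from itertools import zip_longest


def poly_add(p, q):
    """Vector addition: add coefficient lists term by term."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0.0)]


def poly_scale(a, p):
    """Scalar multiplication: multiply every coefficient by the scalar a."""
    return [a * c for c in p]


def poly_eval(p, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(p):
        result = result * x + c
    return result


p = [1.0, 2.0]         # 1 + 2x
q = [0.0, -1.0, 3.0]   # -x + 3x^2
a = 2.0

lhs = poly_scale(a, poly_add(p, q))                  # a(p + q)
rhs = poly_add(poly_scale(a, p), poly_scale(a, q))   # ap + aq
```

Both operations return another coefficient list, which is the closure the definition asks for. A single numerical check like `lhs == rhs` is of course only a sanity test of the axiom, not a proof.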

By analogy with linear functions, vectors are linear by nature, hence a vector space is also called a linear space. The name "linear vector space" is also used, but this is somewhat redundant since there is no such thing as a nonlinear vector space. It's now worth mentioning an important quantity called a linear combination (not part of the definition of a vector space, but important):

a_{1}u_{1}+a_{2}u_{2}+\cdots =\sum a_{i}u_{i}\,

where $a_{i}$ is a sequence of field members and $u_{i}$ is a sequence of vectors. The fact that a vector can be formed by a linear combination of other vectors is much of the essence of the vector space.
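For coordinate vectors, a finite linear combination can be computed directly from the formula above. This is a minimal sketch; the function name `linear_combination` and the use of tuples of floats are assumptions for illustration.

```python
def linear_combination(scalars, vectors):
    """Return a_1*u_1 + a_2*u_2 + ... for tuples u_i of equal length."""
    if len(scalars) != len(vectors):
        raise ValueError("need exactly one scalar per vector")
    dim = len(vectors[0])
    result = [0.0] * dim
    for a, u in zip(scalars, vectors):
        if len(u) != dim:
            raise ValueError("all vectors must have the same length")
        for k in range(dim):
            result[k] += a * u[k]   # accumulate a_i * u_i component by component
    return tuple(result)


u1 = (1.0, 0.0, 2.0)
u2 = (0.0, 1.0, -1.0)
w = linear_combination([3.0, 2.0], [u1, u2])   # 3*u1 + 2*u2
```

Here `w` is again a triple of real numbers, that is, a vector in the same space as `u1` and `u2`.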

Note that a field over itself qualifies as a vector space. Fields of real numbers and other familiar objects are sometimes called spaces, since distance and other useful concepts apply.

The definition of a vector space is quite general. Note that, for example, there is no mention of any kind of product between vectors, nor is there a notion of the "length" of a vector. The vector space as defined above is very primitive, but it's a starting point: through various extensions, specific vector spaces can have a lot of nice properties and other features that make our playgrounds fun and comfortable. We'll discuss bases (plural of basis) and then take on some specific vector spaces.

The Basis

A nonempty subset $W$ of $V$ is called a linear subspace of $V$ if $W$ is itself a vector space. The requirement that $W$ be a vector space can be safely made specific by saying that $W$ is closed under vector addition and scalar multiplication, since the rest of the vector space properties are inherited.

The linear span of a set of $n$ vectors $u_{1},u_{2},\dots ,u_{n}$ in $V$ may then be defined as:

\mathrm {span} (u_{1},u_{2},\dots ,u_{n})=\{a_{1}u_{1}+a_{2}u_{2}+\cdots +a_{n}u_{n}:a_{1},a_{2},\dots ,a_{n}\in F\}

That is, the span is the set of all linear combinations of the vectors, taken over every choice of the scalars $a_{i}$ in $F$ . This concept may be extended to infinite sets of vectors, in which case only finite linear combinations are taken. Equivalently, the span of a set of vectors is the intersection of all of the linear subspaces of $V$ that contain the set; in particular, the span of $V$ is $V$ itself.

Now, think of what happens if a vector is removed from the set $u_{1},u_{2},\dots ,u_{n}$ . Does the span change? Not necessarily: the remaining vectors in the set may be sufficient to "fill in" for the missing vector through linear combination.

Let $B$ be a subset of $V$ . If the span of $B$ is the same as the span of $V$ , and if removing a vector from $B$ necessarily changes its span, then the set of vectors $B$ is called a basis of $V$ , and the vectors of $B$ are called linearly independent. It can be proven that a basis can be constructed for every vector space.

Note that the basis is not unique. This obscure definition of a basis is convenient because it is very broad, and it is worth understanding fully. An important property of a vector space is that it necessarily has a basis, and that any vector in the space may be written as a linear combination of the members of the basis.

A more understandable (though less elemental) explanation is provided: for a vector space $V$ over the field $F$ , the vectors $u_{1},u_{2},\dots ,u_{n}$ form a basis of $V$ (where $u_{i}$ are in $V$ and the following $a_{i}$ are in $F$ ) if they satisfy the following properties:

The basis vectors are linearly independent: if

v=a_{1}u_{1}+a_{2}u_{2}+\cdots +a_{n}u_{n}=0

then

a_{1}=a_{2}=\cdots =a_{n}=0

without exception.

The basis vectors span $V$ : for some given $v$ in $V$ , it is possible to choose $a_{i}$ so that

a_{1}u_{1}+a_{2}u_{2}+\cdots +a_{n}u_{n}=v\,

The basis vectors of a vector space are usually notated as $e_{i}$ .
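The two basis properties can be tried out in the plane. The sketch below assumes vectors are pairs of integers and uses the 2×2 determinant as the independence test together with Cramer's rule to find the coefficients; these are standard techniques chosen for this example rather than anything prescribed by the text, and the function names are hypothetical.

```python
from fractions import Fraction


def det2(u, v):
    """Determinant of the 2x2 matrix whose columns are u and v."""
    return u[0] * v[1] - u[1] * v[0]


def is_basis_of_plane(u1, u2):
    """Two vectors are linearly independent in the plane iff det != 0."""
    return det2(u1, u2) != 0


def coordinates(v, u1, u2):
    """Solve a1*u1 + a2*u2 = v for (a1, a2) by Cramer's rule."""
    d = det2(u1, u2)
    if d == 0:
        raise ValueError("u1 and u2 are linearly dependent, so not a basis")
    a1 = Fraction(det2(v, u2), d)   # replace the first column by v
    a2 = Fraction(det2(u1, v), d)   # replace the second column by v
    return a1, a2


u1, u2 = (1, 1), (1, -1)
v = (3, 1)
a1, a2 = coordinates(v, u1, u2)

# Rebuild v from its coefficients to confirm the spanning property.
rebuilt = (a1 * u1[0] + a2 * u2[0], a1 * u1[1] + a2 * u2[1])
```

For this choice `a1` is 2 and `a2` is 1, and the only coefficients that produce the zero vector are both zero. A dependent pair such as `(1, 2)` and `(2, 4)` fails the determinant test and therefore cannot serve as a basis.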

Euclidean n-Space

As most students are familiar with Euclidean n-space, this section serves more as an example than anything else.

Let $\mathbb {R}$ be the field of real numbers. The vector space $\mathbb {R} ^{n}$ over $\mathbb {R}$ is then defined to be the space of n-tuples of members of $\mathbb {R}$ . In other words:

If $x_{1},x_{2},\dots ,x_{n}\in \mathbb {R}$ , then $\mathbf {x} =(x_{1},x_{2},\dots ,x_{n})\in \mathbb {R} ^{n}$ , where the $\mathbf {x}$ are the vectors of the vector space $\mathbb {R} ^{n}$ .

These vectors are called n-dimensional coordinates, and the vector space $\mathbb {R} ^{n}$ is called the real coordinate space; note that coordinates (unlike more general vectors) are often notated in boldface, or else with an arrow over the letter. They are also called spatial vectors, geometric vectors, just "vectors" if the context allows, and sometimes "points" as well, though some authors refuse to consider points as vectors, attributing a "fixed" sense to points so that points can't be added, scaled, or otherwise messed with. Part of the reason for this is that it allows one to say that some vector space is bound to a point, the point being called the origin.

The Euclidean n-space $E^{n}$ is the special real coordinate n-space $\mathbb {R} ^{n}$ with some additional structure defined which (finally) gives rise to the geometric notion of (specifically these) vectors.

To begin with, an inner product is first defined, notated with either angle braces or a dot:

\langle \mathbf {x} ,\mathbf {y} \rangle =\mathbf {x} \cdot \mathbf {y} =\sum _{i=1}^{n}x_{i}y_{i}=x_{1}y_{1}+x_{2}y_{2}+\cdots +x_{n}y_{n}\,

This quantity, which turns two vectors into a scalar (a member of $\mathbb {R}$ ) doesn't have a great deal of geometric meaning until some more structure is defined. In a coordinate space, the dot notation is favored, and this product is often called the "dot product", especially when $n=2$ or $n=3$ . The definition of this inner product qualifies $E^{n}$ as an inner product space.

Next comes the norm, in terms of the inner product:

\|\mathbf {x} \|=|\mathbf {x} |={\sqrt {\mathbf {x} \cdot \mathbf {x} }}\,

The notation involving single pipes around the letter x is common, again, when $n=2$ or $n=3$ , due to analogy with absolute value, for real and especially complex numbers. For a coordinate space, the norm is often called the length of $\mathbf {x}$ . This quickly leads to the notion of the distance between two vectors:

d(\mathbf {x} ,\mathbf {y} )=\|\mathbf {x-y} \|\,

Which is simply the length of the vector "from" $\mathbf {y}$ to $\mathbf {x}$ .

Finally, the angle $\theta$ between $\mathbf {x}$ and $\mathbf {y}$ is defined through, for $0\leq \theta \leq \pi$ ,

\mathbf {x} \cdot \mathbf {y} =\|\mathbf {x} \|\|\mathbf {y} \|\cos(\theta )\quad \Rightarrow \quad \theta =\cos ^{-1}\left({\frac {\mathbf {x} \cdot \mathbf {y} }{\|\mathbf {x} \|\|\mathbf {y} \|}}\right)\,

The motivation for this definition of angle, valid for any $n$ , comes from the fact that one can prove that the literal measurable angle between two vectors in $\mathbb {R} ^{2}$ satisfies the above (the norm is motivated similarly). Discussing these 2D angles and distances of course mandates making precise the notion of a vector as an "arrow" (i.e., correlating vectors to things you can draw on a sheet of paper). That would get involved, however; most readers are already subconsciously familiar with it, and it's not the point of this introduction.

This completes the definition of $E^{n}$ . A thorough introduction to Euclidean space isn't very fitting in a text on partial differential equations; it is included so that one can see how a familiar vector space can be constructed from the ground up through extensions called "structure".
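The layers of structure on $E^{n}$ (inner product, then norm, then distance and angle) translate almost line for line into code. This is a small sketch assuming vectors are tuples of floats; the function names are illustrative.

```python
import math


def dot(x, y):
    """Euclidean inner product: sum of x_i * y_i."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Length of x, defined through the inner product."""
    return math.sqrt(dot(x, x))


def distance(x, y):
    """Length of the vector from y to x."""
    return norm([a - b for a, b in zip(x, y)])


def angle(x, y):
    """Angle in [0, pi] between two nonzero vectors."""
    c = dot(x, y) / (norm(x) * norm(y))
    c = max(-1.0, min(1.0, c))   # guard against rounding slightly outside [-1, 1]
    return math.acos(c)


x = (3.0, 4.0)
y = (4.0, -3.0)
```

With these two vectors, `norm(x)` is 5, `dot(x, y)` is 0, and `angle(x, y)` comes out as a right angle, matching what a drawing on paper would show. Each function is built only from the one before it, mirroring how the structure was added in the text.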

Banach Spaces

Banach spaces are more general than Euclidean space, and they begin our departure from vectors as geometric objects into vectors as toys in the crazy world of functional analysis.

To be terse, a Banach space is defined as any complete normed vector space. The details follow.

The Inner Product

The inner product is a vector operation which results in a scalar. The vectors are members of a vector space $V$ , and the scalar is a member of the field $F$ associated with $V$ . A vector space on which an inner product is defined is said to be "equipped" with an inner product, and the space is an inner product space. The inner product of $u$ and $v$ is usually notated $\langle u,v\rangle$ .

A truly general definition of the inner product would be long. Normally, if the vectors are real or complex in nature (e.g., complex coordinates or real valued functions), the inner product must satisfy the following axioms:

Distributive in the first variable: $\langle u+v,w\rangle =\langle u,w\rangle +\langle v,w\rangle$ .
Associative in the first variable: $\langle au,v\rangle =a\langle u,v\rangle$ .
Nondegeneracy and nonnegativity: $\langle u,u\rangle \geq 0$ , equality will hold only when $u=0$ .
Conjugate symmetry: $\langle u,v\rangle ={\overline {\langle v,u\rangle }}$ .

Note that if the space is real, the last requirement (the overbar indicates complex conjugation) simplifies to $\langle u,v\rangle =\langle v,u\rangle$ , and then the first two axioms extend to the second variable.
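To see the axioms at work in a complex space, the sketch below uses the usual inner product on complex coordinate tuples, $\langle u,v\rangle =\sum u_{i}{\overline {v_{i}}}$ , which is linear in the first variable as the axioms above require. The choice of this particular inner product and the helper names `inner`, `add`, and `scale` are assumptions made for the example.

```python
def inner(u, v):
    """Inner product on complex tuples: sum of u_i times conjugate(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))


def scale(a, u):
    """Multiply every component of u by the scalar a."""
    return tuple(a * c for c in u)


u = (1 + 2j, 3 - 1j)
v = (2 - 1j, 1j)
w = (1j, 1 + 1j)
a = 2 + 3j

symmetry_gap = abs(inner(u, v) - inner(v, u).conjugate())       # conjugate symmetry
homogeneity_gap = abs(inner(scale(a, u), v) - a * inner(u, v))  # scalar in first slot
additivity_gap = abs(inner(add(u, v), w) - (inner(u, w) + inner(v, w)))
# In the second slot the scalar comes out conjugated.
second_slot_gap = abs(inner(u, scale(a, v)) - a.conjugate() * inner(u, v))
```

All four gaps are zero up to rounding. The last one shows why, in a complex space, the first two axioms do not carry over unchanged to the second variable: a scalar pulled out of the second slot picks up a complex conjugate.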

A desirable property of an inner product is some kind of orthogonality. Two nonzero vectors are said to be orthogonal if and only if their inner product is zero. Remember that we're talking about vectors in general, not specifically Euclidean ones.

Inner products are by no means unique; good definitions are what add quality to specific spaces. The Euclidean inner product, for example, defines the Euclidean distance and angle, quantities which form the foundation of Euclidean geometry.

The Norm

The norm is usually, though not universally, defined in terms of the inner product, which is why the inner product was discussed first (to be technically correct, a Banach space doesn't necessarily need to have an inner product). The norm is an operation (notated with double pipes) which takes one vector and generates one scalar, necessarily satisfying the following axioms:

Scalability: $\|au\|=|a|\|u\|$ .
The triangle inequality: $\|u+v\|\leq \|u\|+\|v\|$ .
Nonnegativity: $\|u\|\geq 0$ , equality only when $u=0$ .

The fact that $\|0\|=0$ can be proven from the first two statements above.

The definition requires that $\|u\|=0$ only when $u=0$ (compare this to the inner product, which can be zero even if fed nonzero vectors); if this condition is relaxed so that $\|u\|=0$ is possible for nonzero vectors, the resulting operation is called a seminorm.
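The norm axioms, and the way a seminorm relaxes them, can be spot-checked numerically. The sketch below uses the Euclidean norm on pairs of floats and, as an assumed example of a seminorm, the map that returns the absolute value of the first coordinate only. Note that scaling by a negative number multiplies the norm by the absolute value of the scalar, $\|au\|=|a|\|u\|$ , since a norm can never be negative.

```python
import math


def norm(u):
    """Euclidean norm of a tuple of floats."""
    return math.sqrt(sum(c * c for c in u))


def seminorm(u):
    """Absolute value of the first coordinate; vanishes on some nonzero vectors."""
    return abs(u[0])


u = (3.0, 4.0)
v = (-1.0, 2.0)
a = -2.0

scaled = tuple(a * c for c in u)             # the vector a*u
total = tuple(p + q for p, q in zip(u, v))   # the vector u + v

scalability_holds = math.isclose(norm(scaled), abs(a) * norm(u))
triangle_holds = norm(total) <= norm(u) + norm(v)
seminorm_vanishes = seminorm((0.0, 1.0)) == 0.0   # nonzero vector, zero seminorm
```

The vector `(0.0, 1.0)` is not the zero vector, yet the seminorm assigns it zero, which is exactly the behaviour a true norm forbids.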

The distance between two vectors $u$ and $v$ is a useful quantity which is defined in terms of the norm:

d(u,v)=\|u-v\|\,

The distance is often called the metric, and a vector space equipped with a distance is called a metric space.

Completeness

A Cauchy sequence shown in blue.

A sequence that is not Cauchy. The elements of the sequence fail to get close to each other as the sequence progresses.

As stated before, a Banach space is defined as a complete normed vector space. The norm was described above, so that all that is left to establish the definition of a Banach space is completeness.

Consider a sequence of vectors $u_{i}$ in a normed vector space $V$ . Informally, this sequence is called a Cauchy sequence if its members get arbitrarily close to one another as the sequence progresses, as if they "tend" toward some "destination" vector (compare the two figures captioned above). Stated precisely, a sequence is a Cauchy sequence if, for every $\varepsilon >0$ , there is an index $N$ such that $d(u_{m},u_{n})<\varepsilon$ whenever $m$ and $n$ are both larger than $N$ .

The limit $u$ of a Cauchy sequence, when it exists, is:

u=\lim _{i\to \infty }u_{i}

A vector space $V$ is called complete if every Cauchy sequence has a limit that is also in $V$ . A Banach space is, finally, a normed vector space that is complete. Note that the norm supplies a distance, which means that every Banach space is a metric space.

An example of a vector space that is complete is Euclidean n-space. An example of a vector space that isn't complete is the space of rational numbers over rational numbers: it is possible to form a Cauchy sequence of rational numbers which converges to an irrational number.
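One concrete sequence of this kind comes from Newton's iteration for the square root of 2, $x_{k+1}=(x_{k}+2/x_{k})/2$ , started at 1. The iteration is a choice made for this sketch, not something taken from the text. Every term is an exact rational number, the terms crowd together, yet no rational number squares to 2, so the sequence has no limit inside the rationals.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Return the first steps+1 Newton iterates for sqrt(2) as exact rationals."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2   # stays rational: only +, / on rationals
        terms.append(x)
    return terms


seq = newton_sqrt2(5)

# Distances between consecutive terms shrink rapidly.
gaps = [abs(seq[i + 1] - seq[i]) for i in range(len(seq) - 1)]

# How far each term's square is from 2; never exactly zero.
residuals = [abs(x * x - 2) for x in seq]
```

The shrinking `gaps` only illustrate the Cauchy property rather than prove it, but they show the terms approaching one another while `residuals` tends to zero without ever reaching it.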

Hilbert spaces

Note that the inner product was defined above but not subsequently used in the definition of a Banach space. Indeed, a Banach space must have a norm but doesn't necessarily need to have an inner product. However, if the norm in a Banach space is defined through the inner product by

\|u\|={\sqrt {\langle u,u\rangle }}

then the resulting special Banach space is called a Hilbert space. Hilbert spaces are important in the study of partial differential equations (some relevance finally!) because many theorems and important results are valid only in Hilbert spaces.
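To give a flavor of a norm induced by an inner product when the vectors are functions, the sketch below approximates the integral inner product $\langle f,g\rangle =\int _{0}^{1}f(x)g(x)\,dx$ for real valued functions with a midpoint rule. This particular inner product, the interval, the quadrature rule, and the two sine functions are all assumptions chosen for illustration; the text above does not single them out.

```python
import math


def inner(f, g, a=0.0, b=1.0, n=10_000):
    """Midpoint-rule approximation of the integral of f(x)*g(x) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h   # midpoint of the k-th subinterval
        total += f(x) * g(x)
    return h * total


def norm(f):
    """Norm induced by the inner product: sqrt(<f, f>)."""
    return math.sqrt(inner(f, f))


def f(x):
    return math.sin(math.pi * x)


def g(x):
    return math.sin(2.0 * math.pi * x)


fg = inner(f, g)   # approximately 0: f and g are orthogonal on [0, 1]
f_norm = norm(f)   # approximately sqrt(1/2)
```

The two functions play the role of orthogonal vectors, and `norm` is obtained from `inner` by exactly the formula above.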