
Chapter 3

Matrices and Linear Maps

3.1 Matrices

Proposition 2.13 shows that given two vector spaces $E$ and $F$ and a basis $(u_j)_{j \in J}$ of $E$, every linear map $f \colon E \to F$ is uniquely determined by the family $(f(u_j))_{j \in J}$ of the images under $f$ of the vectors in the basis $(u_j)_{j \in J}$. Thus, in particular, taking $F = K^{(J)}$, we get an isomorphism between any vector space $E$ of dimension $|J|$ and $K^{(J)}$. If $J = \{1, \ldots, n\}$, a vector space $E$ of dimension $n$ is isomorphic to the vector space $K^n$. If we also have a basis $(v_i)_{i \in I}$ of $F$, then every vector $f(u_j)$ can be written in a unique way as
$$f(u_j) = \sum_{i \in I} a_{ij} v_i,$$
where $j \in J$, for a family of scalars $(a_{ij})_{i \in I}$. Thus, with respect to the two bases $(u_j)_{j \in J}$ of $E$ and $(v_i)_{i \in I}$ of $F$, the linear map $f$ is completely determined by a possibly infinite "$I \times J$-matrix" $M(f) = (a_{ij})_{i \in I,\, j \in J}$.

Remark: Note that we intentionally assigned the index set $J$ to the basis $(u_j)_{j \in J}$ of $E$, and the index set $I$ to the basis $(v_i)_{i \in I}$ of $F$, so that the rows of the matrix $M(f)$ associated with $f \colon E \to F$ are indexed by $I$, and the columns of the matrix $M(f)$ are indexed by $J$. Obviously, this causes a mildly unpleasant reversal. If we had considered the bases $(u_i)_{i \in I}$ of $E$ and $(v_j)_{j \in J}$ of $F$, we would obtain a $J \times I$-matrix $M(f) = (a_{ji})_{j \in J,\, i \in I}$. No matter what we do, there will be a reversal! We decided to stick to the bases $(u_j)_{j \in J}$ of $E$ and $(v_i)_{i \in I}$ of $F$, so that we get an $I \times J$-matrix $M(f)$, knowing that we may occasionally suffer from this decision!

When $I$ and $J$ are finite, say $|I| = m$ and $|J| = n$, the linear map $f$ is determined by the matrix $M(f)$ whose entries in the $j$-th column are the components of the vector $f(u_j)$ over the basis $(v_1, \ldots, v_m)$, that is, the matrix
$$M(f) = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}$$
whose entry on row $i$ and column $j$ is $a_{ij}$ ($1 \le i \le m$, $1 \le j \le n$).

We will now show that when $E$ and $F$ have finite dimension, linear maps can be very conveniently represented by matrices, and that composition of linear maps corresponds to matrix multiplication. We will follow rather closely an elegant presentation method due to Emil Artin.

Let $E$ and $F$ be two vector spaces, and assume that $E$ has a finite basis $(u_1, \ldots, u_n)$ and that $F$ has a finite basis $(v_1, \ldots, v_m)$. Recall that we have shown that every vector $x \in E$ can be written in a unique way as
$$x = x_1 u_1 + \cdots + x_n u_n,$$
and similarly every vector $y \in F$ can be written in a unique way as
$$y = y_1 v_1 + \cdots + y_m v_m.$$
Let $f \colon E \to F$ be a linear map between $E$ and $F$. Then, for every $x = x_1 u_1 + \cdots + x_n u_n$ in $E$, by linearity, we have
$$f(x) = x_1 f(u_1) + \cdots + x_n f(u_n).$$

Let
$$f(u_j) = a_{1j} v_1 + \cdots + a_{mj} v_m,$$
or more concisely,
$$f(u_j) = \sum_{i=1}^m a_{ij} v_i,$$
for every $j$, $1 \le j \le n$. This can be expressed by writing the coefficients $a_{1j}, a_{2j}, \ldots, a_{mj}$ of $f(u_j)$ over the basis $(v_1, \ldots, v_m)$ as the $j$-th column of a matrix, as shown below:

$$\bordermatrix{
 & f(u_1) & f(u_2) & \cdots & f(u_n) \cr
v_1 & a_{11} & a_{12} & \cdots & a_{1n} \cr
v_2 & a_{21} & a_{22} & \cdots & a_{2n} \cr
\vdots & \vdots & \vdots & \ddots & \vdots \cr
v_m & a_{m1} & a_{m2} & \cdots & a_{mn} \cr}$$

Then, substituting the right-hand side of each $f(u_j)$ into the expression for $f(x)$, we get
$$f(x) = x_1 \Big( \sum_{i=1}^m a_{i1} v_i \Big) + \cdots + x_n \Big( \sum_{i=1}^m a_{in} v_i \Big),$$
which, by regrouping terms to obtain a linear combination of the $v_i$, yields
$$f(x) = \Big( \sum_{j=1}^n a_{1j} x_j \Big) v_1 + \cdots + \Big( \sum_{j=1}^n a_{mj} x_j \Big) v_m.$$

Thus, letting $f(x) = y = y_1 v_1 + \cdots + y_m v_m$, we have
$$y_i = \sum_{j=1}^n a_{ij} x_j \tag{1}$$
for all $i$, $1 \le i \le m$.

To make things more concrete, let us treat the case where $n = 3$ and $m = 2$. In this case,
$$\begin{aligned}
f(u_1) &= a_{11} v_1 + a_{21} v_2 \\
f(u_2) &= a_{12} v_1 + a_{22} v_2 \\
f(u_3) &= a_{13} v_1 + a_{23} v_2,
\end{aligned}$$

which in matrix form is expressed by
$$\bordermatrix{
 & f(u_1) & f(u_2) & f(u_3) \cr
v_1 & a_{11} & a_{12} & a_{13} \cr
v_2 & a_{21} & a_{22} & a_{23} \cr},$$

and for any $x = x_1 u_1 + x_2 u_2 + x_3 u_3$, we have
$$\begin{aligned}
f(x) &= f(x_1 u_1 + x_2 u_2 + x_3 u_3) \\
&= x_1 f(u_1) + x_2 f(u_2) + x_3 f(u_3) \\
&= x_1 (a_{11} v_1 + a_{21} v_2) + x_2 (a_{12} v_1 + a_{22} v_2) + x_3 (a_{13} v_1 + a_{23} v_2) \\
&= (a_{11} x_1 + a_{12} x_2 + a_{13} x_3) v_1 + (a_{21} x_1 + a_{22} x_2 + a_{23} x_3) v_2.
\end{aligned}$$
Consequently, since
$$y = y_1 v_1 + y_2 v_2,$$
we have
$$\begin{aligned}
y_1 &= a_{11} x_1 + a_{12} x_2 + a_{13} x_3 \\
y_2 &= a_{21} x_1 + a_{22} x_2 + a_{23} x_3.
\end{aligned}$$

This agrees with the matrix equation
$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
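For readers who want to experiment, here is a minimal NumPy sketch of the $n = 3$, $m = 2$ computation above; the entries of $M(f)$ and the coordinates $x_1, x_2, x_3$ are arbitrary sample values chosen for illustration.

```python
import numpy as np

# The 2 x 3 matrix M(f) = (a_ij); arbitrary sample entries.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Coordinates of x over the basis (u1, u2, u3), also arbitrary.
x = np.array([7.0, 8.0, 9.0])

# Equation (1): y_i = sum_j a_ij x_j, computed entry by entry.
y_manual = np.array([sum(A[i, j] * x[j] for j in range(3)) for i in range(2)])

# The same computation as the matrix equation M(y) = M(f) M(x).
y_matrix = A @ x

assert np.allclose(y_manual, y_matrix)
print(y_matrix)  # [ 50. 122.]
```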

Let us now consider how the composition of linear maps is expressed in terms of bases.

Let $E$, $F$, and $G$ be three vector spaces with respective bases $(u_1, \ldots, u_p)$ for $E$, $(v_1, \ldots, v_n)$ for $F$, and $(w_1, \ldots, w_m)$ for $G$. Let $g \colon E \to F$ and $f \colon F \to G$ be linear maps. As explained earlier, $g \colon E \to F$ is determined by the images of the basis vectors $u_j$, and $f \colon F \to G$ is determined by the images of the basis vectors $v_k$. We would like to understand how $f \circ g \colon E \to G$ is determined by the images of the basis vectors $u_j$.

Remark: Note that we are considering linear maps $g \colon E \to F$ and $f \colon F \to G$, instead of $f \colon E \to F$ and $g \colon F \to G$, which yields the composition $f \circ g \colon E \to G$ instead of $g \circ f \colon E \to G$. Our perhaps unusual choice is motivated by the fact that if $f$ is represented by a matrix $M(f) = (a_{ik})$ and $g$ is represented by a matrix $M(g) = (b_{kj})$, then $f \circ g \colon E \to G$ is represented by the product $AB$ of the matrices $A$ and $B$. If we had adopted the other choice, where $f \colon E \to F$ and $g \colon F \to G$, then $g \circ f \colon E \to G$ would be represented by the product $BA$. Personally, we find it easier to remember the formula for the entry in row $i$ and column $j$ of the product of two matrices when this product is written as $AB$, rather than $BA$. Obviously, this is a matter of taste! We will have to live with our perhaps unorthodox choice.

Thus, let
$$f(v_k) = \sum_{i=1}^m a_{ik} w_i,$$
for every $k$, $1 \le k \le n$, and let
$$g(u_j) = \sum_{k=1}^n b_{kj} v_k,$$
for every $j$, $1 \le j \le p$; in matrix form, we have

$$\bordermatrix{
 & f(v_1) & f(v_2) & \cdots & f(v_n) \cr
w_1 & a_{11} & a_{12} & \cdots & a_{1n} \cr
w_2 & a_{21} & a_{22} & \cdots & a_{2n} \cr
\vdots & \vdots & \vdots & \ddots & \vdots \cr
w_m & a_{m1} & a_{m2} & \cdots & a_{mn} \cr}$$
and
$$\bordermatrix{
 & g(u_1) & g(u_2) & \cdots & g(u_p) \cr
v_1 & b_{11} & b_{12} & \cdots & b_{1p} \cr
v_2 & b_{21} & b_{22} & \cdots & b_{2p} \cr
\vdots & \vdots & \vdots & \ddots & \vdots \cr
v_n & b_{n1} & b_{n2} & \cdots & b_{np} \cr}$$

By previous considerations, for every
$$x = x_1 u_1 + \cdots + x_p u_p,$$
letting $g(x) = y = y_1 v_1 + \cdots + y_n v_n$, we have
$$y_k = \sum_{j=1}^p b_{kj} x_j \tag{2}$$
for all $k$, $1 \le k \le n$, and for every
$$y = y_1 v_1 + \cdots + y_n v_n,$$
letting $f(y) = z = z_1 w_1 + \cdots + z_m w_m$, we have
$$z_i = \sum_{k=1}^n a_{ik} y_k \tag{3}$$
for all $i$, $1 \le i \le m$. Then, if $y = g(x)$ and $z = f(y)$, we have $z = f(g(x))$, and in view of (2) and (3), we have

$$\begin{aligned}
z_i &= \sum_{k=1}^n a_{ik} \Big( \sum_{j=1}^p b_{kj} x_j \Big) \\
&= \sum_{k=1}^n \sum_{j=1}^p a_{ik} b_{kj} x_j \\
&= \sum_{j=1}^p \sum_{k=1}^n a_{ik} b_{kj} x_j \\
&= \sum_{j=1}^p \Big( \sum_{k=1}^n a_{ik} b_{kj} \Big) x_j.
\end{aligned}$$

Thus, defining $c_{ij}$ such that
$$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj},$$
for $1 \le i \le m$ and $1 \le j \le p$, we have
$$z_i = \sum_{j=1}^p c_{ij} x_j. \tag{4}$$

Identity (4) suggests defining a multiplication operation on matrices, and we proceed to do so. We have the following definitions.
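Before the formal definitions, it may help to see the conclusion of the derivation numerically. The following minimal NumPy sketch (with arbitrary randomly generated matrices) checks that applying $g$ and then $f$ to the coordinates of $x$ gives the same result as applying the single matrix $C = AB$ of identity (4).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 2, 3, 4
A = rng.integers(-3, 4, size=(m, n))   # matrix of f : F -> G over the bases (v_k), (w_i)
B = rng.integers(-3, 4, size=(n, p))   # matrix of g : E -> F over the bases (u_j), (v_k)
x = rng.integers(-3, 4, size=p)        # coordinates of x in E

# First apply g, then f, as in equations (2) and (3).
y = B @ x            # y_k = sum_j b_kj x_j
z_composed = A @ y   # z_i = sum_k a_ik y_k

# Identity (4): the matrix C with c_ij = sum_k a_ik b_kj represents f o g.
C = A @ B
z_product = C @ x

assert np.allclose(z_composed, z_product)
```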

Definition 3.1. Given a field $K$, an $m \times n$-matrix is a family $(a_{ij})_{1 \le i \le m,\, 1 \le j \le n}$ of scalars in $K$, represented as an array
$$\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.$$

In the special case where $m = 1$, we have a row vector, represented as
$$(a_{11} \;\cdots\; a_{1n}),$$
and in the special case where $n = 1$, we have a column vector, represented as
$$\begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}.$$
In these last two cases, we usually omit the constant index 1 (first index in case of a row, second index in case of a column). The set of all $m \times n$-matrices is denoted by $\mathrm{M}_{m,n}(K)$ or $\mathrm{M}_{m,n}$. An $n \times n$-matrix is called a square matrix of dimension $n$. The set of all square matrices of dimension $n$ is denoted by $\mathrm{M}_n(K)$, or $\mathrm{M}_n$.

Remark: As defined, a matrix $A = (a_{ij})_{1 \le i \le m,\, 1 \le j \le n}$ is a family, that is, a function from $\{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\}$ to $K$. As such, there is no reason to assume an ordering on the indices. Thus, the matrix $A$ can be represented in many different ways as an array, by adopting different orders for the rows or the columns. However, it is customary (and usually convenient) to assume the natural ordering on the sets $\{1, 2, \ldots, m\}$ and $\{1, 2, \ldots, n\}$, and to represent $A$ as an array according to this ordering of the rows and columns.

We also define some operations on matrices as follows.

Definition 3.2. Given two $m \times n$ matrices $A = (a_{ij})$ and $B = (b_{ij})$, we define their sum $A + B$ as the matrix $C = (c_{ij})$ such that $c_{ij} = a_{ij} + b_{ij}$; that is,
$$\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix} +
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{m1} & b_{m2} & \cdots & b_{mn}
\end{pmatrix} =
\begin{pmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{pmatrix}.$$

For any matrix $A = (a_{ij})$, we let $-A$ be the matrix $(-a_{ij})$. Given a scalar $\lambda \in K$, we define the matrix $\lambda A$ as the matrix $C = (c_{ij})$ such that $c_{ij} = \lambda a_{ij}$; that is,
$$\lambda
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix} =
\begin{pmatrix}
\lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1n} \\
\lambda a_{21} & \lambda a_{22} & \cdots & \lambda a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\lambda a_{m1} & \lambda a_{m2} & \cdots & \lambda a_{mn}
\end{pmatrix}.$$
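Both operations are componentwise, which is exactly how NumPy treats `+` and scalar `*` on arrays; a minimal sketch with arbitrary sample entries:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[10, 20, 30],
              [40, 50, 60]])
lam = 2

# Matrix sum: (A + B)_ij = a_ij + b_ij, computed componentwise.
S = A + B

# Scalar multiple: (lam * A)_ij = lam * a_ij.
L = lam * A

assert S[0, 1] == A[0, 1] + B[0, 1]   # 22
assert L[1, 2] == lam * A[1, 2]       # 12
```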

Given an $m \times n$ matrix $A = (a_{ik})$ and an $n \times p$ matrix $B = (b_{kj})$, we define their product $AB$ as the $m \times p$ matrix $C = (c_{ij})$ such that
$$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj},$$
for $1 \le i \le m$ and $1 \le j \le p$. In the product $AB = C$ shown below,

$$\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{np}
\end{pmatrix} =
\begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{m1} & c_{m2} & \cdots & c_{mp}
\end{pmatrix}$$

note that the entry of index $i$ and $j$ of the matrix $AB$ obtained by multiplying the matrices $A$ and $B$ can be identified with the product of the row matrix corresponding to the $i$-th row of $A$ with the column matrix corresponding to the $j$-th column of $B$:
$$(a_{i1} \;\cdots\; a_{in})
\begin{pmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{pmatrix} =
\sum_{k=1}^n a_{ik} b_{kj}.$$
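A minimal NumPy sketch (arbitrary random matrices) checking this row-by-column description of each entry of the product:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 2, 3, 4
A = rng.integers(-5, 6, size=(m, n))
B = rng.integers(-5, 6, size=(n, p))

C = A @ B  # the matrix product

# Entry (i, j) of AB is the dot product of row i of A with column j of B.
for i in range(m):
    for j in range(p):
        assert C[i, j] == A[i, :] @ B[:, j]
```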

The square matrix $I_n$ of dimension $n$ containing 1 on the diagonal and 0 everywhere else is called the identity matrix. It is denoted as
$$I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}.$$

Given an $m \times n$ matrix $A = (a_{ij})$, its transpose $A^\top = (a^\top_{ji})$ is the $n \times m$-matrix such that $a^\top_{ji} = a_{ij}$, for all $i$, $1 \le i \le m$, and all $j$, $1 \le j \le n$.

The transpose of a matrix $A$ is sometimes denoted by $A^t$, or even by ${}^t\!A$. Note that the transpose $A^\top$ of a matrix $A$ has the property that the $j$-th row of $A^\top$ is the $j$-th column of $A$. In other words, transposition exchanges the rows and the columns of a matrix.
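A minimal NumPy sketch of this row/column exchange, using an arbitrary $2 \times 3$ sample matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2 x 3 matrix

At = A.T                    # its 3 x 2 transpose

# The j-th row of A^T is the j-th column of A.
for j in range(A.shape[1]):
    assert np.array_equal(At[j, :], A[:, j])
```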

The following observation will be useful later on when we discuss the SVD. Given any $m \times n$ matrix $A$ and any $n \times p$ matrix $B$, if we denote the columns of $A$ by $A^1, \ldots, A^n$ and the rows of $B$ by $B_1, \ldots, B_n$, then we have
$$AB = A^1 B_1 + \cdots + A^n B_n.$$
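A minimal NumPy sketch of this column-times-row (outer product) expansion, with arbitrary random matrices; `np.outer(a, b)` forms the product of a column vector with a row vector:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 3, 4, 2
A = rng.integers(-3, 4, size=(m, n))
B = rng.integers(-3, 4, size=(n, p))

# Sum of outer products: column k of A times row k of B.
outer_sum = sum(np.outer(A[:, k], B[k, :]) for k in range(n))

assert np.array_equal(outer_sum, A @ B)
```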

For every square matrix $A$ of dimension $n$, it is immediately verified that $A I_n = I_n A = A$. If a matrix $B$ such that $AB = BA = I_n$ exists, then it is unique, and it is called the inverse of $A$. The matrix $B$ is also denoted by $A^{-1}$. An invertible matrix is also called a nonsingular matrix, and a matrix that is not invertible is called a singular matrix.

Proposition 2.16 shows that if a square matrix $A$ has a left inverse, that is, a matrix $B$ such that $BA = I$, or a right inverse, that is, a matrix $C$ such that $AC = I$, then $A$ is actually invertible; so $B = A^{-1}$ and $C = A^{-1}$. These facts also follow from Proposition 4.14.
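A minimal numerical illustration with an arbitrary invertible $2 \times 2$ sample matrix, using NumPy's general-purpose inverse routine:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # an invertible 2 x 2 matrix

A_inv = np.linalg.inv(A)

# A_inv is simultaneously a left and a right inverse of A.
assert np.allclose(A_inv @ A, np.eye(2))
assert np.allclose(A @ A_inv, np.eye(2))
```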

It is immediately verified that the set $\mathrm{M}_{m,n}(K)$ of $m \times n$ matrices is a vector space under addition of matrices and multiplication of a matrix by a scalar. Consider the $m \times n$-matrices $E_{i,j} = (e_{hk})$, defined such that $e_{ij} = 1$, and $e_{hk} = 0$ if $h \neq i$ or $k \neq j$. It is clear that every matrix $A = (a_{ij}) \in \mathrm{M}_{m,n}(K)$ can be written in a unique way as
$$A = \sum_{i=1}^m \sum_{j=1}^n a_{ij} E_{i,j}.$$
Thus, the family $(E_{i,j})_{1 \le i \le m,\, 1 \le j \le n}$ is a basis of the vector space $\mathrm{M}_{m,n}(K)$, which has dimension $mn$.
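A minimal NumPy sketch of the expansion over the basis $(E_{i,j})$, for an arbitrary $2 \times 3$ sample matrix (the helper `E` below is introduced just for this illustration):

```python
import numpy as np

m, n = 2, 3
A = np.array([[1, 2, 3],
              [4, 5, 6]])

def E(i, j):
    """The basis matrix E_{i,j}: 1 in position (i, j), 0 elsewhere."""
    M = np.zeros((m, n), dtype=int)
    M[i, j] = 1
    return M

# A = sum over i, j of a_ij * E_{i,j}.
expansion = sum(A[i, j] * E(i, j) for i in range(m) for j in range(n))
assert np.array_equal(expansion, A)
```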

Remark: Definition 3.1 and Definition 3.2 also make perfect sense when $K$ is a (commutative) ring rather than a field. In this more general setting, the framework of vector spaces is too narrow, but we can consider structures over a commutative ring $A$ satisfying all the axioms of Definition 2.9. Such structures are called modules. The theory of modules is (much) more complicated than that of vector spaces. For example, modules do not always have a basis, and other properties holding for vector spaces usually fail for modules. When a module has a basis, it is called a free module. For example, when $A$ is a commutative ring, the structure $A^n$ is a module such that the vectors $e_i$, with $(e_i)_i = 1$ and $(e_i)_j = 0$ for $j \neq i$, form a basis of $A^n$. Many properties of vector spaces still hold for $A^n$. Thus, $A^n$ is a free module. As another example, when $A$ is a commutative ring, $\mathrm{M}_{m,n}(A)$ is a free module with basis $(E_{i,j})_{1 \le i \le m,\, 1 \le j \le n}$. Polynomials over a commutative ring also form a free module of infinite dimension.

Square matrices provide a natural example of a noncommutative ring with zero divisors.

Example 3.1. For example, letting $A, B$ be the $2 \times 2$-matrices
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
then
$$AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} =
\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$
and
$$BA = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} =
\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$
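The same computation in NumPy, confirming both that $AB = 0$ with $A, B \neq 0$ and that $AB \neq BA$:

```python
import numpy as np

A = np.array([[1, 0],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

# AB is the zero matrix even though neither A nor B is zero...
assert np.array_equal(A @ B, np.zeros((2, 2)))

# ...and BA is not: matrix multiplication is noncommutative.
assert np.array_equal(B @ A, B)
```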

We now formalize the representation of linear maps by matrices.

Definition 3.3. Let $E$ and $F$ be two vector spaces, and let $(u_1, \ldots, u_n)$ be a basis for $E$, and $(v_1, \ldots, v_m)$ be a basis for $F$. Each vector $x \in E$ expressed in the basis $(u_1, \ldots, u_n)$ as $x = x_1 u_1 + \cdots + x_n u_n$ is represented by the column matrix
$$M(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$
and similarly for each vector $y \in F$ expressed in the basis $(v_1, \ldots, v_m)$.

Every linear map $f \colon E \to F$ is represented by the matrix $M(f) = (a_{ij})$, where $a_{ij}$ is the $i$-th component of the vector $f(u_j)$ over the basis $(v_1, \ldots, v_m)$, i.e., where
$$f(u_j) = \sum_{i=1}^m a_{ij} v_i, \quad \text{for every } j,\ 1 \le j \le n.$$

The coefficients $a_{1j}, a_{2j}, \ldots, a_{mj}$ of $f(u_j)$ over the basis $(v_1, \ldots, v_m)$ form the $j$-th column of the matrix $M(f)$ shown below:
$$\bordermatrix{
 & f(u_1) & f(u_2) & \cdots & f(u_n) \cr
v_1 & a_{11} & a_{12} & \cdots & a_{1n} \cr
v_2 & a_{21} & a_{22} & \cdots & a_{2n} \cr
\vdots & \vdots & \vdots & \ddots & \vdots \cr
v_m & a_{m1} & a_{m2} & \cdots & a_{mn} \cr}$$

The matrix $M(f)$ associated with the linear map $f \colon E \to F$ is called the matrix of $f$ with respect to the bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_m)$. When $E = F$ and the basis $(v_1, \ldots, v_m)$ is identical to the basis $(u_1, \ldots, u_n)$ of $E$, the matrix $M(f)$ associated with $f \colon E \to E$ (as above) is called the matrix of $f$ with respect to the basis $(u_1, \ldots, u_n)$.

Remark: As in the remark after Definition 3.1, there is no reason to assume that the vectors in the bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_m)$ are ordered in any particular way. However, it is often convenient to assume the natural ordering. When this is so, authors sometimes refer to the matrix $M(f)$ as the matrix of $f$ with respect to the ordered bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_m)$.

Then, given a linear map $f \colon E \to F$ represented by the matrix $M(f) = (a_{ij})$ w.r.t. the bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_m)$, by equations (1) and the definition of matrix multiplication, the equation $y = f(x)$ corresponds to the matrix equation $M(y) = M(f)\, M(x)$, that is,
$$\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} =
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$