$$P A = \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 1 & 2 & -3 & 4\\ 4 & 8 & 12 & -8\\ 2 & 3 & 2 & 1\\ -3 & -1 & 1 & -4 \end{pmatrix}
= \begin{pmatrix} 4 & 8 & 12 & -8\\ -3 & -1 & 1 & -4\\ 1 & 2 & -3 & 4\\ 2 & 3 & 2 & 1 \end{pmatrix},$$
and that

$$L U = \begin{pmatrix} 1 & 0 & 0 & 0\\ -3/4 & 1 & 0 & 0\\ 1/4 & 0 & 1 & 0\\ 1/2 & -1/5 & 1/3 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 8 & 12 & -8\\ 0 & 5 & 10 & -10\\ 0 & 0 & -6 & 6\\ 0 & 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 4 & 8 & 12 & -8\\ -3 & -1 & 1 & -4\\ 1 & 2 & -3 & 4\\ 2 & 3 & 2 & 1 \end{pmatrix} = P A.$$
Note that if one is willing to overwrite the lower triangular part of the evolving matrix A, one can store the evolving Λ there, since these entries will eventually be zero anyway! There is also no need to save explicitly the permutation matrix P. One could instead record the permutation steps in an extra column (record the vector (π(1), . . . , π(n)) corresponding to the permutation π applied to the rows). We let the reader write such a bold and space-efficient version of LU-decomposition!
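As a minimal sketch of what such a version might look like (in Python with NumPy; the name lu_in_place and its interface are ours, not the text's), the multipliers Λ are stored below the diagonal as they are computed, rows are swapped wholesale so that previously stored multipliers follow their rows, and the permutation is recorded as a vector:

```python
import numpy as np

def lu_in_place(A):
    """LU with partial pivoting: on return the strict lower triangle of
    the array holds the multipliers (the entries of L below the diagonal),
    the rest holds U, and perm records (pi(1), ..., pi(n)) as a vector,
    so that the rows A[perm] of the original matrix satisfy P A = L U."""
    A = A.astype(float)          # work on a float copy of the input
    n = A.shape[0]
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(A[k:, k]))   # partial pivoting
        if p != k:
            A[[k, p]] = A[[p, k]]             # swap whole rows: this also
            perm[[k, p]] = perm[[p, k]]       # moves the stored multipliers
        for i in range(k + 1, n):
            A[i, k] /= A[k, k]                # store the multiplier where a zero would go
            A[i, k+1:] -= A[i, k] * A[k, k+1:]
    return A, perm
```

On the 4 × 4 example above, the returned array holds the entries of L below the diagonal and U on and above it, and perm comes out as (1, 3, 0, 2) in 0-indexed form, i.e., rows 2, 4, 1, 3: exactly the P in the factorization P A = LU displayed earlier.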
As a corollary of Theorem 6.5(1), we can show the following result.
Proposition 6.6. If an invertible symmetric matrix A has an LU -decomposition, then A
has a factorization of the form
A = LDL^⊤,

where L is a lower-triangular matrix whose diagonal entries are equal to 1, and where D is a diagonal matrix consisting of the pivots. Furthermore, such a decomposition is unique.
Proof. If A has an LU -factorization, then it has an LDU factorization
A = LDU,
where L is lower-triangular, U is upper-triangular, and the diagonal entries of both L and
U are equal to 1. Since A is symmetric, we have
LDU = A = A^⊤ = U^⊤DL^⊤,

with U^⊤ lower-triangular and DL^⊤ upper-triangular. By the uniqueness of LU-factorization (part (1) of Theorem 6.5), we must have L = U^⊤ (and DU = DL^⊤), thus U = L^⊤, as
claimed.
Remark: It can be shown that Gaussian elimination + back-substitution requires n^3/3 + O(n^2) additions, n^3/3 + O(n^2) multiplications and n^2/2 + O(n) divisions.
Let us now briefly comment on the choice of a pivot. Although theoretically, any pivot
can be chosen, the possibility of roundoff errors implies that it is not a good idea to pick
very small pivots. The following example illustrates this point. Consider the linear system
$$\begin{aligned} 10^{-4}x + y &= 1\\ x + y &= 2. \end{aligned}$$

Since 10^{−4} is nonzero, it can be taken as pivot, and we get

$$\begin{aligned} 10^{-4}x + y &= 1\\ (1 - 10^{4})y &= 2 - 10^{4}. \end{aligned}$$
Thus, the exact solution is
$$x = \frac{10^4}{10^4 - 1}, \qquad y = \frac{10^4 - 2}{10^4 - 1}.$$
However, if roundoff takes place on the fourth digit, then 10^4 − 1 = 9999 and 10^4 − 2 = 9998 will both be rounded off to 9990, and then the computed solution is x = 0 and y = 1, very far from the exact solution, where x ≈ 1 and y ≈ 1. The problem is that we picked a very small pivot.
If instead we permute the equations, the pivot is 1, and after elimination, we get the system
$$\begin{aligned} x + y &= 2\\ (1 - 10^{-4})y &= 1 - 2 \times 10^{-4}. \end{aligned}$$
This time, 1 − 10^{−4} = 0.9999 and 1 − 2 × 10^{−4} = 0.9998 are both rounded off to 0.999, and the computed solution is x = 1, y = 1, much closer to the exact solution.
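The effect is easy to reproduce. The sketch below (our own illustration) chops every computed quantity to three significant digits, mimicking roundoff on the fourth digit, and solves the system with each choice of pivot:

```python
from math import floor, log10

def chop(x, digits=3):
    """Truncate x to `digits` significant decimal digits,
    mimicking roundoff on the fourth digit."""
    if x == 0.0:
        return 0.0
    e = floor(log10(abs(x)))
    f = 10.0 ** (digits - 1 - e)
    return int(x * f) / f            # truncation toward zero

# Small pivot 10^-4: eliminate x from the second equation.
y = chop(chop(2 - 1e4) / chop(1 - 1e4))   # -9990/-9990 = 1.0
x = chop(chop(1 - y) / 1e-4)              # 0.0, far from x ~ 1
print(x, y)                               # 0.0 1.0

# Pivot 1 after permuting the equations.
y = chop(chop(1 - 2e-4) / chop(1 - 1e-4)) # 0.999/0.999 = 1.0
x = chop(2 - y)                           # 1.0, close to the exact solution
print(x, y)                               # 1.0 1.0
```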
To remedy this problem, one may use the strategy of partial pivoting. This consists of choosing during step k (1 ≤ k ≤ n − 1) one of the entries a^{(k)}_{i k} such that

$$|a^{(k)}_{i\,k}| = \max_{k \le p \le n} |a^{(k)}_{p\,k}|.$$
By maximizing the value of the pivot, we avoid dividing by undesirably small pivots.
Remark: A matrix, A, is called strictly column diagonally dominant iff

$$|a_{j\,j}| > \sum_{i=1,\, i \neq j}^{n} |a_{i\,j}|, \quad \text{for } j = 1, \ldots, n$$

(resp. strictly row diagonally dominant iff

$$|a_{i\,i}| > \sum_{j=1,\, j \neq i}^{n} |a_{i\,j}|, \quad \text{for } i = 1, \ldots, n.)$$
It has been known for a long time (before 1900, say by Hadamard) that if a matrix, A,
is strictly column diagonally dominant (resp. strictly row diagonally dominant), then it is
invertible. (This is a good exercise, try it!) It can also be shown that if A is strictly column
diagonally dominant, then Gaussian elimination with partial pivoting does not actually re-
quire pivoting (See Problem 21.6 in Trefethen and Bau [106], or Question 2.19 in Demmel
[25]).
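If one wants to test the column-dominance hypothesis numerically before trusting unpivoted elimination, a short check suffices (a sketch; the function name is ours):

```python
import numpy as np

def strictly_column_diagonally_dominant(A):
    """Check |a_jj| > sum over i != j of |a_ij|, for every column j."""
    B = np.abs(np.asarray(A, dtype=float))
    d = np.diag(B)
    return bool(np.all(d > B.sum(axis=0) - d))
```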
Another strategy, called complete pivoting, consists in choosing some entry a^{(k)}_{i j}, where k ≤ i, j ≤ n, such that

$$|a^{(k)}_{i\,j}| = \max_{k \le p, q \le n} |a^{(k)}_{p\,q}|.$$
However, in this method, if the chosen pivot is not in column k, it is also necessary to
permute columns. This is achieved by multiplying on the right by a permutation matrix.
However, complete pivoting tends to be too expensive in practice, and partial pivoting is the
method of choice.
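For comparison with partial pivoting, the search step of complete pivoting is a max over the whole trailing submatrix rather than a single column. A sketch of just that search (hypothetical helper name):

```python
import numpy as np

def complete_pivot(A, k):
    """Return indices (p, q), both >= k, of the entry of largest
    absolute value in the trailing submatrix A[k:, k:]."""
    sub = np.abs(A[k:, k:])
    p, q = np.unravel_index(np.argmax(sub), sub.shape)
    return k + p, k + q
```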
A special case where the LU -factorization is particularly efficient is the case of tridiagonal
matrices, which we now consider.
6.3 Gaussian Elimination of Tridiagonal Matrices
Consider the tridiagonal matrix
$$A = \begin{pmatrix}
b_1 & c_1 & & & & \\
a_2 & b_2 & c_2 & & & \\
& a_3 & b_3 & c_3 & & \\
& & \ddots & \ddots & \ddots & \\
& & & a_{n-1} & b_{n-1} & c_{n-1}\\
& & & & a_n & b_n
\end{pmatrix}.$$
Define the sequence
$$\delta_0 = 1, \qquad \delta_1 = b_1, \qquad \delta_k = b_k \delta_{k-1} - a_k c_{k-1} \delta_{k-2}, \quad 2 \le k \le n.$$
Proposition 6.7. If A is the tridiagonal matrix above, then δk = det(A[1..k, 1..k]), for
k = 1, . . . , n.
Proof. By expanding det(A[1..k, 1..k]) with respect to its last row, the proposition follows
by induction on k.
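For a quick numerical sanity check of Proposition 6.7, one can compare the recurrence against determinants of leading submatrices. A minimal sketch (our own; arrays are 0-indexed, so b = [b1, ..., bn], a = [a2, ..., an], c = [c1, ..., cn−1]):

```python
def delta_sequence(a, b, c):
    """Return [delta_0, delta_1, ..., delta_n] for the tridiagonal matrix
    with diagonal b, subdiagonal a, superdiagonal c, via
    delta_k = b_k delta_{k-1} - a_k c_{k-1} delta_{k-2}."""
    n = len(b)
    delta = [1.0, float(b[0])]   # delta_0 = 1, delta_1 = b_1
    for k in range(2, n + 1):
        delta.append(b[k-1] * delta[k-1] - a[k-2] * c[k-2] * delta[k-2])
    return delta                 # delta[k] = det(A[1..k, 1..k])

# For b = [2, 2, 2], a = c = [1, 1], this returns [1.0, 2.0, 3.0, 4.0],
# matching the determinants of the leading principal submatrices.
```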
Theorem 6.8. If A is the tridiagonal matrix above and δk ≠ 0 for k = 1, . . . , n, then A has the following LU-factorization:

$$A = \begin{pmatrix}
1 & & & & & \\
a_2 \frac{\delta_0}{\delta_1} & 1 & & & & \\
& a_3 \frac{\delta_1}{\delta_2} & 1 & & & \\
& & \ddots & \ddots & & \\
& & & a_{n-1} \frac{\delta_{n-3}}{\delta_{n-2}} & 1 & \\
& & & & a_n \frac{\delta_{n-2}}{\delta_{n-1}} & 1
\end{pmatrix}
\begin{pmatrix}
\frac{\delta_1}{\delta_0} & c_1 & & & & \\
& \frac{\delta_2}{\delta_1} & c_2 & & & \\
& & \frac{\delta_3}{\delta_2} & c_3 & & \\
& & & \ddots & \ddots & \\
& & & & \frac{\delta_{n-1}}{\delta_{n-2}} & c_{n-1}\\
& & & & & \frac{\delta_n}{\delta_{n-1}}
\end{pmatrix}.$$
Proof. Since δk = det(A[1..k, 1..k]) ≠ 0 for k = 1, . . . , n, by Theorem 6.5 (and Proposition 6.2), we know that A has a unique LU-factorization. Therefore, it suffices to check that the proposed factorization works. We easily check that

$$\begin{aligned}
(LU)_{k\,k+1} &= c_k, && 1 \le k \le n-1\\
(LU)_{k\,k-1} &= a_k, && 2 \le k \le n\\
(LU)_{k\,l} &= 0, && |k - l| \ge 2\\
(LU)_{1\,1} &= \frac{\delta_1}{\delta_0} = b_1\\
(LU)_{k\,k} &= \frac{a_k c_{k-1} \delta_{k-2} + \delta_k}{\delta_{k-1}} = b_k, && 2 \le k \le n,
\end{aligned}$$

since δk = bkδk−1 − akck−1δk−2.
It follows that there is a simple method to solve a linear system, Ax = d, where A is tridiagonal (and δk ≠ 0 for k = 1, . . . , n). For this, it is convenient to "squeeze" the diagonal matrix, ∆, defined such that ∆k k = δk/δk−1, into the factorization so that A = (L∆)(∆^{−1}U), and if we let

$$z_1 = \frac{c_1}{b_1}, \qquad z_k = c_k\,\frac{\delta_{k-1}}{\delta_k}, \quad 2 \le k \le n-1, \qquad z_n = \frac{\delta_n}{\delta_{n-1}} = b_n - a_n z_{n-1},$$
A = (L∆)(∆^{−1}U) is written as

$$A = \begin{pmatrix}
\frac{c_1}{z_1} & & & & & \\
a_2 & \frac{c_2}{z_2} & & & & \\
& a_3 & \frac{c_3}{z_3} & & & \\
& & \ddots & \ddots & & \\
& & & a_{n-1} & \frac{c_{n-1}}{z_{n-1}} & \\
& & & & a_n & z_n
\end{pmatrix}
\begin{pmatrix}
1 & z_1 & & & & \\
& 1 & z_2 & & & \\
& & 1 & z_3 & & \\
& & & \ddots & \ddots & \\
& & & & 1 & z_{n-1}\\
& & & & & 1
\end{pmatrix}.$$
As a consequence, the system Ax = d can be solved by constructing three sequences: First,
the sequence
$$z_1 = \frac{c_1}{b_1}, \qquad z_k = \frac{c_k}{b_k - a_k z_{k-1}}, \quad k = 2, \ldots, n-1, \qquad z_n = b_n - a_n z_{n-1},$$
corresponding to the recurrence δk = bkδk−1 − akck−1δk−2 and obtained by dividing both
sides of this equation by δk−1, next
$$w_1 = \frac{d_1}{b_1}, \qquad w_k = \frac{d_k - a_k w_{k-1}}{b_k - a_k z_{k-1}}, \quad k = 2, \ldots, n,$$
corresponding to solving the system L∆w = d, and finally
$$x_n = w_n, \qquad x_k = w_k - z_k x_{k+1}, \quad k = n-1, n-2, \ldots, 1,$$
corresponding to solving the system ∆−1U x = w.
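Assembled into code, the three sequences give the following sketch of a solver (our own rendering; it assumes n ≥ 2 and all δk ≠ 0):

```python
def solve_tridiagonal(a, b, c, d):
    """Solve Ax = d for the tridiagonal A above.
    Conventions: b = [b_1..b_n], a = [a_2..a_n], c = [c_1..c_{n-1}]."""
    n = len(b)
    z = [0.0] * n
    w = [0.0] * n
    z[0] = c[0] / b[0]
    w[0] = d[0] / b[0]
    for k in range(1, n):
        denom = b[k] - a[k-1] * z[k-1]   # the pivot delta_{k+1}/delta_k
        if k < n - 1:
            z[k] = c[k] / denom
        w[k] = (d[k] - a[k-1] * w[k-1]) / denom
    x = [0.0] * n
    x[n-1] = w[n-1]
    for k in range(n - 2, -1, -1):       # back-substitution: x_k = w_k - z_k x_{k+1}
        x[k] = w[k] - z[k] * x[k+1]
    return x
```

For instance, with b = [2, 2, 2], a = c = [1, 1], and d = [1, 2, 3], this returns [0.5, 0.0, 1.5], which one can check satisfies Ax = d.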
Remark: It can be verified that this requires 3(n − 1) additions, 3(n − 1) multiplications, and 2n divisions, a total of 8n − 6 operations, which is much less than the O(2n^3/3) required by Gaussian elimination in general.
We now consider the special case of symmetric positive definite matrices (SPD matrices).
Recall that an n × n symmetric matrix, A, is positive definite iff

$$x^\top A x > 0 \quad \text{for all } x \in \mathbb{R}^n \text{ with } x \neq 0.$$

Equivalently, A is symmetric positive definite iff all its eigenvalues are strictly positive. The following facts about a symmetric positive definite matrix, A, are easily established (some left as an exercise):
(1) The matrix A is invertible. (Indeed, if Ax = 0, then x^⊤Ax = 0, which implies x = 0.)

(2) We have ai i > 0 for i = 1, . . . , n. (Just observe that for x = ei, the ith canonical basis vector of ℝ^n, we have ei^⊤Aei = ai i > 0.)

(3) For every n × n invertible matrix, Z, the matrix Z^⊤AZ is symmetric positive definite iff A is symmetric positive definite.
Next, we prove that a symmetric positive definite matrix has a special LU-factorization of the form A = BB^⊤, where B is a lower-triangular matrix whose diagonal elements are strictly positive. This is the Cholesky factorization.
6.4 SPD Matrices and the Cholesky Decomposition
First, we note that a symmetric positive definite matrix satisfies the condition of Proposition
6.2.
Proposition 6.9. If A is a symmetric positive definite matrix, then A[1..k, 1..k] is symmetric
positive definite, and thus, invertible, for k = 1, . . . , n.
Proof. Since A is symmetric, each A[1..k, 1..k] is also symmetric. If w ∈ ℝ^k, with 1 ≤ k ≤ n, we let x ∈ ℝ^n be the vector with xi = wi for i = 1, . . . , k and xi = 0 for i = k + 1, . . . , n. Now, since A is symmetric positive definite, we have x^⊤Ax > 0 for all x ∈ ℝ^n with x ≠ 0. This holds in particular for all vectors x obtained from nonzero vectors w ∈ ℝ^k as defined earlier, and clearly

$$x^\top A x = w^\top A[1..k, 1..k]\, w,$$

which implies that A[1..k, 1..k] is positive definite. Thus, A[1..k, 1..k] is also invertible.
Proposition 6.9 can be strengthened as follows: A symmetric matrix A is positive definite
iff det(A[1..k, 1..k]) > 0 for k = 1, . . . , n.
The above fact is known as Sylvester's criterion. We will prove it after establishing the Cholesky factorization.
Let A be a symmetric positive definite matrix and write
$$A = \begin{pmatrix} a_{11} & W^\top \\ W & C \end{pmatrix}.$$
Since A is symmetric positive definite, a1 1 > 0, and we can compute α = √(a1 1). The trick is that we can factor A uniquely as

$$A = \begin{pmatrix} a_{11} & W^\top \\ W & C \end{pmatrix}
= \begin{pmatrix} \alpha & 0 \\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & C - WW^\top/a_{11} \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha \\ 0 & I \end{pmatrix},$$

i.e., as A = B1A1B1^⊤, where B1 is lower-triangular with positive diagonal entries. Thus, B1 is invertible, and by fact (3) above, A1 is also symmetric positive definite.
Theorem 6.10. (Cholesky Factorization) Let A be a symmetric positive definite matrix.
Then, there is some lower-triangular matrix, B, so that A = BB^⊤. Furthermore, B can be chosen so that its diagonal elements are strictly positive, in which case B is unique.
Proof. We proceed by induction on k. For k = 1, we must have a1 1 > 0, and if we let
α = √(a1 1) and B = (α), the theorem holds trivially. If k ≥ 2, as we explained above, again
we must have a1 1 > 0, and we can write
$$A = \begin{pmatrix} a_{11} & W^\top \\ W & C \end{pmatrix}
= \begin{pmatrix} \alpha & 0 \\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & C - WW^\top/a_{11} \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha \\ 0 & I \end{pmatrix}
= B_1 A_1 B_1^\top,$$
where α = √(a1 1), the matrix B1 is invertible and

$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & C - WW^\top/a_{11} \end{pmatrix}$$

is symmetric positive definite. However, this implies that C − WW^⊤/a1 1 is also symmetric positive definite (consider x^⊤A1x for every x ∈ ℝ^n with x ≠ 0 and x1 = 0). Thus, we can apply the induction hypothesis to C − WW^⊤/a1 1, and we find a unique lower-triangular matrix, L, with positive diagonal entries, so that

$$C - WW^\top/a_{11} = LL^\top.$$
But then, we get
$$\begin{aligned}
A &= \begin{pmatrix} \alpha & 0\\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & C - WW^\top/a_{11} \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & I \end{pmatrix}\\[4pt]
&= \begin{pmatrix} \alpha & 0\\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & LL^\top \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & I \end{pmatrix}\\[4pt]
&= \begin{pmatrix} \alpha & 0\\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & L \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & L^\top \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & I \end{pmatrix}\\[4pt]
&= \begin{pmatrix} \alpha & 0\\ W/\alpha & L \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & L^\top \end{pmatrix}.
\end{aligned}$$
Therefore, if we let
$$B = \begin{pmatrix} \alpha & 0 \\ W/\alpha & L \end{pmatrix},$$

we have a unique lower-triangular matrix with positive diagonal entries and A = BB^⊤.
The proof of Theorem 6.10 immediately yields an algorithm to compute B from A. For
j = 1, . . . , n,
$$b_{jj} = \left( a_{jj} - \sum_{k=1}^{j-1} b_{jk}^2 \right)^{1/2},$$
and for i = j + 1, . . . , n,
$$b_{ij} = \left( a_{ij} - \sum_{k=1}^{j-1} b_{ik} b_{jk} \right) \Big/ b_{jj}.$$
The above formulae are used to compute the jth column of B from top-down, using the first
j − 1 columns of B previously computed, and the matrix A.
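These formulas translate almost line for line into code. A minimal sketch (our own, assuming NumPy; no pivoting, and the input is assumed SPD):

```python
import numpy as np

def cholesky(A):
    """Return the lower-triangular B with positive diagonal such that
    A = B B^T, computing B column by column as in the formulas above."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    B = np.zeros((n, n))
    for j in range(n):
        s = A[j, j] - np.dot(B[j, :j], B[j, :j])
        B[j, j] = np.sqrt(s)   # s > 0 whenever A is symmetric positive definite
        for i in range(j + 1, n):
            B[i, j] = (A[i, j] - np.dot(B[i, :j], B[j, :j])) / B[j, j]
    return B
```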
The Cholesky factorization can be used to solve linear systems, Ax = b, where A is
symmetric positive definite: Solve the two systems Bw = b and B^⊤x = w.
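Continuing the sketch above, the two triangular solves can be done with SciPy's solve_triangular (the 2 × 2 example matrix is ours):

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[4.0, 2.0], [2.0, 3.0]])     # a small SPD example
b = np.array([1.0, 2.0])
B = cholesky(A)                             # A = B B^T, from the sketch above
w = solve_triangular(B, b, lower=True)      # forward substitution: B w = b
x = solve_triangular(B.T, w, lower=False)   # back substitution:    B^T x = w
assert np.allclose(A @ x, b)
```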
Remark: It can be shown that this method requires n^3/6 + O(n^2) additions, n^3/6 + O(n^2) multiplications, n^2/2 + O(n) divisions, and O(n) square root extractions. Thus, the Cholesky method requires half of the number of operations required by Gaussian elimination (since Gaussian elimination requires n^3/3 + O(n^2) additions, n^3/3 + O(n^2) multiplications, and n^2/2 + O(n) divisions). It also requires half of the space (only B is needed, as opposed to both L and U). Furthermore, it can be shown that Cholesky's method is numerically stable.
Remark: If A = BB^⊤, where B is any invertible matrix, then A is symmetric positive definite.

Proof. Obviously, BB^⊤ is symmetric, and since B is invertible, B^⊤ is invertible, and from

$$x^\top A x = x^\top B B^\top x = (B^\top x)^\top B^\top x,$$

it is clear that x^⊤Ax > 0 if x ≠ 0.
We now give three more criteria for a symmetric matrix to be positive definite.
Proposition 6.11. Let A be any n × n symmetric matrix. The following conditions are
equivalent:
(a) A is positive definite.
(b) All the leading principal minors of A are positive; that is, det(A[1..k, 1..k]) > 0 for k = 1, . . . , n (Sylvester's criterion).
(c) A has an LU -factorization and all pivots are positive.
(d) A has an LDL^⊤-factorization and all pivots in D are positive.
Proof. By Proposition 6.9, if A is symmetric positive definite, then each matrix A[1..k, 1..k] is symmetric positive definite for k = 1, . . . , n. By the Cholesky decomposition, A[1..k, 1..k] = QQ^⊤ for some invertible matrix Q, so det(A[1..k, 1..k]) = det(Q)^2 > 0. This shows that (a) implies (b).
If det(A[1..k, 1..k]) > 0 for k = 1, . . . , n, then each A[1..k, 1..k] is invertible. By Proposition 6.2, the matrix A has an LU-factorization, and since the pivots πk are given by

$$\pi_k = \begin{cases} a_{11} = \det(A[1..1, 1..1]) & \text{if } k = 1,\\[6pt] \dfrac{\det(A[1..k, 1..k])}{\det(A[1..k-1, 1..k-1])} & \text{if } k = 2, \ldots, n, \end{cases}$$

we see that πk > 0 for k = 1, . . . , n. Thus (b) implies (c).
Assume A has an LU -factorization and that the pivots are all positive. Since A is
symmetric, this implies that A has a factorization of the form
A = LDL^⊤,
with L lower-triangular with 1’s on its diagonal, and where D is a diagonal matrix with
positive entries on the diagonal (the pivots). This