$$P A = \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 1 & 2 & -3 & 4\\ 4 & 8 & 12 & -8\\ 2 & 3 & 2 & 1\\ -3 & -1 & 1 & -4 \end{pmatrix}
= \begin{pmatrix} 4 & 8 & 12 & -8\\ -3 & -1 & 1 & -4\\ 1 & 2 & -3 & 4\\ 2 & 3 & 2 & 1 \end{pmatrix},$$
and that

$$L U = \begin{pmatrix} 1 & 0 & 0 & 0\\ -3/4 & 1 & 0 & 0\\ 1/4 & 0 & 1 & 0\\ 1/2 & -1/5 & 1/3 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 8 & 12 & -8\\ 0 & 5 & 10 & -10\\ 0 & 0 & -6 & 6\\ 0 & 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 4 & 8 & 12 & -8\\ -3 & -1 & 1 & -4\\ 1 & 2 & -3 & 4\\ 2 & 3 & 2 & 1 \end{pmatrix} = P A.$$
Note that if one is willing to overwrite the lower triangular part of the evolving matrix A, one can store the evolving Λ there, since these entries will eventually be zero anyway! There is also no need to save explicitly the permutation matrix P. One could instead record the permutation steps in an extra column (record the vector (π(1), . . . , π(n)) corresponding to the permutation π applied to the rows). We let the reader write such a bold and space-efficient version of LU-decomposition!
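As a minimal sketch of what such a version might look like (in Python with NumPy; the name lu_in_place and its interface are ours, not the text's), the multipliers Λ are stored below the diagonal as they are computed, rows are swapped wholesale so that previously stored multipliers follow their rows, and the permutation is recorded as a vector:

```python
import numpy as np

def lu_in_place(A):
    """LU with partial pivoting: on return the strict lower triangle of
    the array holds the multipliers (the entries of L below the diagonal),
    the rest holds U, and perm records (pi(1), ..., pi(n)) as a vector,
    so that the rows A[perm] of the original matrix satisfy P A = L U."""
    A = A.astype(float)          # work on a float copy of the input
    n = A.shape[0]
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(A[k:, k]))   # partial pivoting
        if p != k:
            A[[k, p]] = A[[p, k]]             # swap whole rows: this also
            perm[[k, p]] = perm[[p, k]]       # moves the stored multipliers
        for i in range(k + 1, n):
            A[i, k] /= A[k, k]                # store the multiplier where a zero would go
            A[i, k+1:] -= A[i, k] * A[k, k+1:]
    return A, perm
```

On the 4 × 4 example above, the returned array holds the entries of L below the diagonal and U on and above it, and perm comes out as (1, 3, 0, 2) in 0-indexed form, i.e., rows 2, 4, 1, 3: exactly the P in the factorization P A = LU displayed earlier.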
As a corollary of Theorem 6.5(1), we can show the following result.
Proposition 6.6. If an invertible symmetric matrix A has an LU -decomposition, then A
has a factorization of the form
A = LDL^⊤,

where L is a lower-triangular matrix whose diagonal entries are equal to 1, and where D is a diagonal matrix consisting of the pivots. Furthermore, such a decomposition is unique.
Proof. If A has an LU -factorization, then it has an LDU factorization
A = LDU,
where L is lower-triangular, U is upper-triangular, and the diagonal entries of both L and
U are equal to 1. Since A is symmetric, we have
LDU = A = A^⊤ = U^⊤DL^⊤,

with U^⊤ lower-triangular and DL^⊤ upper-triangular. By the uniqueness of LU-factorization (part (1) of Theorem 6.5), we must have L = U^⊤ (and DU = DL^⊤), thus U = L^⊤, as
claimed.
Remark: It can be shown that Gaussian elimination + back-substitution requires n^3/3 + O(n^2) additions, n^3/3 + O(n^2) multiplications and n^2/2 + O(n) divisions.
Let us now briefly comment on the choice of a pivot. Although theoretically, any pivot
can be chosen, the possibility of roundoff errors implies that it is not a good idea to pick
very small pivots. The following example illustrates this point. Consider the linear system
$$\begin{aligned} 10^{-4}x + y &= 1\\ x + y &= 2. \end{aligned}$$

Since 10^{−4} is nonzero, it can be taken as pivot, and we get

$$\begin{aligned} 10^{-4}x + y &= 1\\ (1 - 10^{4})y &= 2 - 10^{4}. \end{aligned}$$
Thus, the exact solution is
$$x = \frac{10^4}{10^4 - 1}, \qquad y = \frac{10^4 - 2}{10^4 - 1}.$$
However, if roundoff takes place on the fourth digit, then 10^4 − 1 = 9999 and 10^4 − 2 = 9998 will both be rounded off to 9990, and then the computed solution is x = 0 and y = 1, very far from the exact solution, where x ≈ 1 and y ≈ 1. The problem is that we picked a very small pivot.
If instead we permute the equations, the pivot is 1, and after elimination, we get the system
$$\begin{aligned} x + y &= 2\\ (1 - 10^{-4})y &= 1 - 2 \times 10^{-4}. \end{aligned}$$
This time, 1 − 10^{−4} = 0.9999 and 1 − 2 × 10^{−4} = 0.9998 are both rounded off to 0.999, and the computed solution is x = 1, y = 1, much closer to the exact solution.
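The effect is easy to reproduce. The sketch below (our own illustration) chops every computed quantity to three significant digits, mimicking roundoff on the fourth digit, and solves the system with each choice of pivot:

```python
from math import floor, log10

def chop(x, digits=3):
    """Truncate x to `digits` significant decimal digits,
    mimicking roundoff on the fourth digit."""
    if x == 0.0:
        return 0.0
    e = floor(log10(abs(x)))
    f = 10.0 ** (digits - 1 - e)
    return int(x * f) / f            # truncation toward zero

# Small pivot 10^-4: eliminate x from the second equation.
y = chop(chop(2 - 1e4) / chop(1 - 1e4))   # -9990/-9990 = 1.0
x = chop(chop(1 - y) / 1e-4)              # 0.0, far from x ~ 1
print(x, y)                               # 0.0 1.0

# Pivot 1 after permuting the equations.
y = chop(chop(1 - 2e-4) / chop(1 - 1e-4)) # 0.999/0.999 = 1.0
x = chop(2 - y)                           # 1.0, close to the exact solution
print(x, y)                               # 1.0 1.0
```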
To remedy this problem, one may use the strategy of partial pivoting. This consists of choosing during step k (1 ≤ k ≤ n − 1) one of the entries a^{(k)}_{i k} such that

$$|a^{(k)}_{i\,k}| = \max_{k \le p \le n} |a^{(k)}_{p\,k}|.$$
By maximizing the value of the pivot, we avoid dividing by undesirably small pivots.
Remark: A matrix, A, is called strictly column diagonally dominant iff

$$|a_{j\,j}| > \sum_{i=1,\, i \neq j}^{n} |a_{i\,j}|, \quad \text{for } j = 1, \ldots, n$$

(resp. strictly row diagonally dominant iff

$$|a_{i\,i}| > \sum_{j=1,\, j \neq i}^{n} |a_{i\,j}|, \quad \text{for } i = 1, \ldots, n.)$$
It has been known for a long time (before 1900, say by Hadamard) that if a matrix, A,
is strictly column diagonally dominant (resp. strictly row diagonally dominant), then it is
invertible. (This is a good exercise, try it!) It can also be shown that if A is strictly column
diagonally dominant, then Gaussian elimination with partial pivoting does not actually re-
quire pivoting (See Problem 21.6 in Trefethen and Bau [106], or Question 2.19 in Demmel
[25]).
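If one wants to test the column-dominance hypothesis numerically before trusting unpivoted elimination, a short check suffices (a sketch; the function name is ours):

```python
import numpy as np

def strictly_column_diagonally_dominant(A):
    """Check |a_jj| > sum over i != j of |a_ij|, for every column j."""
    B = np.abs(np.asarray(A, dtype=float))
    d = np.diag(B)
    return bool(np.all(d > B.sum(axis=0) - d))
```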
Another strategy, called complete pivoting, consists in choosing some entry a^{(k)}_{i j}, where k ≤ i, j ≤ n, such that

$$|a^{(k)}_{i\,j}| = \max_{k \le p, q \le n} |a^{(k)}_{p\,q}|.$$
However, in this method, if the chosen pivot is not in column k, it is also necessary to
permute columns. This is achieved by multiplying on the right by a permutation matrix.
However, complete pivoting tends to be too expensive in practice, and partial pivoting is the
method of choice.
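For comparison with partial pivoting, the search step of complete pivoting is a max over the whole trailing submatrix rather than a single column. A sketch of just that search (hypothetical helper name):

```python
import numpy as np

def complete_pivot(A, k):
    """Return indices (p, q), both >= k, of the entry of largest
    absolute value in the trailing submatrix A[k:, k:]."""
    sub = np.abs(A[k:, k:])
    p, q = np.unravel_index(np.argmax(sub), sub.shape)
    return k + p, k + q
```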
A special case where the LU -factorization is particularly efficient is the case of tridiagonal
matrices, which we now consider.
6.3 Gaussian Elimination of Tridiagonal Matrices
Consider the tridiagonal matrix
$$A = \begin{pmatrix}
b_1 & c_1 & & & & \\
a_2 & b_2 & c_2 & & & \\
& a_3 & b_3 & c_3 & & \\
& & \ddots & \ddots & \ddots & \\
& & & a_{n-1} & b_{n-1} & c_{n-1}\\
& & & & a_n & b_n
\end{pmatrix}.$$
Define the sequence
$$\delta_0 = 1, \qquad \delta_1 = b_1, \qquad \delta_k = b_k \delta_{k-1} - a_k c_{k-1} \delta_{k-2}, \quad 2 \le k \le n.$$
Proposition 6.7. If A is the tridiagonal matrix above, then δk = det(A[1..k, 1..k]), for
k = 1, . . . , n.
Proof. By expanding det(A[1..k, 1..k]) with respect to its last row, the proposition follows
by induction on k.
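For a quick numerical sanity check of Proposition 6.7, one can compare the recurrence against determinants of leading submatrices. A minimal sketch (our own; arrays are 0-indexed, so b = [b1, ..., bn], a = [a2, ..., an], c = [c1, ..., cn−1]):

```python
def delta_sequence(a, b, c):
    """Return [delta_0, delta_1, ..., delta_n] for the tridiagonal matrix
    with diagonal b, subdiagonal a, superdiagonal c, via
    delta_k = b_k delta_{k-1} - a_k c_{k-1} delta_{k-2}."""
    n = len(b)
    delta = [1.0, float(b[0])]   # delta_0 = 1, delta_1 = b_1
    for k in range(2, n + 1):
        delta.append(b[k-1] * delta[k-1] - a[k-2] * c[k-2] * delta[k-2])
    return delta                 # delta[k] = det(A[1..k, 1..k])

# For b = [2, 2, 2], a = c = [1, 1], this returns [1.0, 2.0, 3.0, 4.0],
# matching the determinants of the leading principal submatrices.
```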
Theorem 6.8. If A is the tridiagonal matrix above and δk ≠ 0 for k = 1, . . . , n, then A has the following LU-factorization:

$$A = \begin{pmatrix}
1 & & & & & \\
a_2 \frac{\delta_0}{\delta_1} & 1 & & & & \\
& a_3 \frac{\delta_1}{\delta_2} & 1 & & & \\
& & \ddots & \ddots & & \\
& & & a_{n-1} \frac{\delta_{n-3}}{\delta_{n-2}} & 1 & \\
& & & & a_n \frac{\delta_{n-2}}{\delta_{n-1}} & 1
\end{pmatrix}
\begin{pmatrix}
\frac{\delta_1}{\delta_0} & c_1 & & & & \\
& \frac{\delta_2}{\delta_1} & c_2 & & & \\
& & \frac{\delta_3}{\delta_2} & c_3 & & \\
& & & \ddots & \ddots & \\
& & & & \frac{\delta_{n-1}}{\delta_{n-2}} & c_{n-1}\\
& & & & & \frac{\delta_n}{\delta_{n-1}}
\end{pmatrix}.$$
Proof. Since δk = det(A[1..k, 1..k]) ≠ 0 for k = 1, . . . , n, by Theorem 6.5 (and Proposition 6.2), we know that A has a unique LU-factorization. Therefore, it suffices to check that the proposed factorization works. We easily check that

$$\begin{aligned}
(LU)_{k\,k+1} &= c_k, && 1 \le k \le n-1\\
(LU)_{k\,k-1} &= a_k, && 2 \le k \le n\\
(LU)_{k\,l} &= 0, && |k - l| \ge 2\\
(LU)_{1\,1} &= \frac{\delta_1}{\delta_0} = b_1\\
(LU)_{k\,k} &= \frac{a_k c_{k-1} \delta_{k-2} + \delta_k}{\delta_{k-1}} = b_k, && 2 \le k \le n,
\end{aligned}$$

since δk = bkδk−1 − akck−1δk−2.
It follows that there is a simple method to solve a linear system, Ax = d, where A is tridiagonal (and δk ≠ 0 for k = 1, . . . , n). For this, it is convenient to "squeeze" the diagonal matrix, ∆, defined such that ∆k k = δk/δk−1, into the factorization so that A = (L∆)(∆^{−1}U), and if we let

$$z_1 = \frac{c_1}{b_1}, \qquad z_k = c_k\,\frac{\delta_{k-1}}{\delta_k}, \quad 2 \le k \le n-1, \qquad z_n = \frac{\delta_n}{\delta_{n-1}} = b_n - a_n z_{n-1},$$
A = (L∆)(∆^{−1}U) is written as

$$A = \begin{pmatrix}
\frac{c_1}{z_1} & & & & & \\
a_2 & \frac{c_2}{z_2} & & & & \\
& a_3 & \frac{c_3}{z_3} & & & \\
& & \ddots & \ddots & & \\
& & & a_{n-1} & \frac{c_{n-1}}{z_{n-1}} & \\
& & & & a_n & z_n
\end{pmatrix}
\begin{pmatrix}
1 & z_1 & & & & \\
& 1 & z_2 & & & \\
& & 1 & z_3 & & \\
& & & \ddots & \ddots & \\
& & & & 1 & z_{n-1}\\
& & & & & 1
\end{pmatrix}.$$
As a consequence, the system Ax = d can be solved by constructing three sequences: First,
the sequence
$$z_1 = \frac{c_1}{b_1}, \qquad z_k = \frac{c_k}{b_k - a_k z_{k-1}}, \quad k = 2, \ldots, n-1, \qquad z_n = b_n - a_n z_{n-1},$$
corresponding to the recurrence δk = bkδk−1 − akck−1δk−2 and obtained by dividing both
sides of this equation by δk−1, next
$$w_1 = \frac{d_1}{b_1}, \qquad w_k = \frac{d_k - a_k w_{k-1}}{b_k - a_k z_{k-1}}, \quad k = 2, \ldots, n,$$
corresponding to solving the system L∆w = d, and finally
$$x_n = w_n, \qquad x_k = w_k - z_k x_{k+1}, \quad k = n-1, n-2, \ldots, 1,$$
corresponding to solving the system ∆−1U x = w.
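Assembled into code, the three sequences give the following sketch of a solver (our own rendering; it assumes n ≥ 2 and all δk ≠ 0):

```python
def solve_tridiagonal(a, b, c, d):
    """Solve Ax = d for the tridiagonal A above.
    Conventions: b = [b_1..b_n], a = [a_2..a_n], c = [c_1..c_{n-1}]."""
    n = len(b)
    z = [0.0] * n
    w = [0.0] * n
    z[0] = c[0] / b[0]
    w[0] = d[0] / b[0]
    for k in range(1, n):
        denom = b[k] - a[k-1] * z[k-1]   # the pivot delta_{k+1}/delta_k
        if k < n - 1:
            z[k] = c[k] / denom
        w[k] = (d[k] - a[k-1] * w[k-1]) / denom
    x = [0.0] * n
    x[n-1] = w[n-1]
    for k in range(n - 2, -1, -1):       # back-substitution: x_k = w_k - z_k x_{k+1}
        x[k] = w[k] - z[k] * x[k+1]
    return x
```

For instance, with b = [2, 2, 2], a = c = [1, 1], and d = [1, 2, 3], this returns [0.5, 0.0, 1.5], which one can check satisfies Ax = d.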
Remark: It can be verified that this requires 3(n − 1) additions, 3(n − 1) multiplications, and 2n divisions, a total of 8n − 6 operations, which is much less than the O(2n^3/3) required by Gaussian elimination in general.
We now consider the special case of symmetric positive definite matrices (SPD matrices).
Recall that an n × n symmetric matrix, A, is positive definite iff

$$x^\top A x > 0 \quad \text{for all } x \in \mathbb{R}^n \text{ with } x \neq 0.$$

Equivalently, A is symmetric positive definite iff all its eigenvalues are strictly positive. The following facts about a symmetric positive definite matrix, A, are easily established (some left as an exercise):
(1) The matrix A is invertible. (Indeed, if Ax = 0, then x^⊤Ax = 0, which implies x = 0.)

(2) We have ai i > 0 for i = 1, . . . , n. (Just observe that for x = ei, the ith canonical basis vector of ℝ^n, we have ei^⊤Aei = ai i > 0.)

(3) For every n × n invertible matrix, Z, the matrix Z^⊤AZ is symmetric positive definite iff A is symmetric positive definite.
Next, we prove that a symmetric positive definite matrix has a special LU-factorization of the form A = BB^⊤, where B is a lower-triangular matrix whose diagonal elements are strictly positive. This is the Cholesky factorization.
6.4 SPD Matrices and the Cholesky Decomposition
First, we note that a symmetric positive definite matrix satisfies the condition of Proposition
6.2.
Proposition 6.9. If A is a symmetric positive definite matrix, then A[1..k, 1..k] is symmetric
positive definite, and thus, invertible, for k = 1, . . . , n.
Proof. Since A is symmetric, each A[1..k, 1..k] is also symmetric. If w ∈ ℝ^k, with 1 ≤ k ≤ n, we let x ∈ ℝ^n be the vector with xi = wi for i = 1, . . . , k and xi = 0 for i = k + 1, . . . , n. Now, since A is symmetric positive definite, we have x^⊤Ax > 0 for all x ∈ ℝ^n with x ≠ 0. This holds in particular for all vectors x obtained from nonzero vectors w ∈ ℝ^k as defined earlier, and clearly

$$x^\top A x = w^\top A[1..k, 1..k]\, w,$$

which implies that A[1..k, 1..k] is positive definite. Thus, A[1..k, 1..k] is also invertible.
Proposition 6.9 can be strengthened as follows: A symmetric matrix A is positive definite
iff det(A[1..k, 1..k]) > 0 for k = 1, . . . , n.
The above fact is known as Sylvester's criterion. We will prove it after establishing the Cholesky factorization.
Let A be a symmetric positive definite matrix and write
$$A = \begin{pmatrix} a_{11} & W^\top \\ W & C \end{pmatrix}.$$
Since A is symmetric positive definite, a1 1 > 0, and we can compute α = √(a1 1). The trick is that we can factor A uniquely as

$$A = \begin{pmatrix} a_{11} & W^\top \\ W & C \end{pmatrix}
= \begin{pmatrix} \alpha & 0 \\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & C - WW^\top/a_{11} \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha \\ 0 & I \end{pmatrix},$$

i.e., as A = B1A1B1^⊤, where B1 is lower-triangular with positive diagonal entries. Thus, B1 is invertible, and by fact (3) above, A1 is also symmetric positive definite.
Theorem 6.10. (Cholesky Factorization) Let A be a symmetric positive definite matrix.
Then, there is some lower-triangular matrix, B, so that A = BB^⊤. Furthermore, B can be chosen so that its diagonal elements are strictly positive, in which case B is unique.
Proof. We proceed by induction on k. For k = 1, we must have a1 1 > 0, and if we let
α = √(a1 1) and B = (α), the theorem holds trivially. If k ≥ 2, as we explained above, again
we must have a1 1 > 0, and we can write
$$A = \begin{pmatrix} a_{11} & W^\top \\ W & C \end{pmatrix}
= \begin{pmatrix} \alpha & 0 \\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & C - WW^\top/a_{11} \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha \\ 0 & I \end{pmatrix}
= B_1 A_1 B_1^\top,$$
where α = √(a1 1), the matrix B1 is invertible and

$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & C - WW^\top/a_{11} \end{pmatrix}$$

is symmetric positive definite. However, this implies that C − WW^⊤/a1 1 is also symmetric positive definite (consider x^⊤A1x for every x ∈ ℝ^n with x ≠ 0 and x1 = 0). Thus, we can apply the induction hypothesis to C − WW^⊤/a1 1, and we find a unique lower-triangular matrix, L, with positive diagonal entries, so that

$$C - WW^\top/a_{11} = LL^\top.$$
But then, we get
$$\begin{aligned}
A &= \begin{pmatrix} \alpha & 0\\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & C - WW^\top/a_{11} \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & I \end{pmatrix}\\[4pt]
&= \begin{pmatrix} \alpha & 0\\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & LL^\top \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & I \end{pmatrix}\\[4pt]
&= \begin{pmatrix} \alpha & 0\\ W/\alpha & I \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & L \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 0 & L^\top \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & I \end{pmatrix}\\[4pt]
&= \begin{pmatrix} \alpha & 0\\ W/\alpha & L \end{pmatrix}
\begin{pmatrix} \alpha & W^\top/\alpha\\ 0 & L^\top \end{pmatrix}.
\end{aligned}$$
Therefore, if we let
$$B = \begin{pmatrix} \alpha & 0 \\ W/\alpha & L \end{pmatrix},$$

we have a unique lower-triangular matrix with positive diagonal entries and A = BB^⊤.
The proof of Theorem 6.10 immediately yields an algorithm to compute B from A. For
j = 1, . . . , n,
$$b_{jj} = \left( a_{jj} - \sum_{k=1}^{j-1} b_{jk}^2 \right)^{1/2},$$
and for i = j + 1, . . . , n,
$$b_{ij} = \left( a_{ij} - \sum_{k=1}^{j-1} b_{ik} b_{jk} \right) \Big/ b_{jj}.$$
The above formulae are used to compute the jth column of B from top-down, using the first
j − 1 columns of B previously computed, and the matrix A.
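These formulas translate almost line for line into code. A minimal sketch (our own, assuming NumPy; no pivoting, and the input is assumed SPD):

```python
import numpy as np

def cholesky(A):
    """Return the lower-triangular B with positive diagonal such that
    A = B B^T, computing B column by column as in the formulas above."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    B = np.zeros((n, n))
    for j in range(n):
        s = A[j, j] - np.dot(B[j, :j], B[j, :j])
        B[j, j] = np.sqrt(s)   # s > 0 whenever A is symmetric positive definite
        for i in range(j + 1, n):
            B[i, j] = (A[i, j] - np.dot(B[i, :j], B[j, :j])) / B[j, j]
    return B
```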
The Cholesky factorization can be used to solve linear systems, Ax = b, where A is
symmetric positive definite: Solve the two systems Bw = b and B^⊤x = w.
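Continuing the sketch above, the two triangular solves can be done with SciPy's solve_triangular (the 2 × 2 example matrix is ours):

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[4.0, 2.0], [2.0, 3.0]])     # a small SPD example
b = np.array([1.0, 2.0])
B = cholesky(A)                             # A = B B^T, from the sketch above
w = solve_triangular(B, b, lower=True)      # forward substitution: B w = b
x = solve_triangular(B.T, w, lower=False)   # back substitution:    B^T x = w
assert np.allclose(A @ x, b)
```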
Remark: It can be shown that this method requires n^3/6 + O(n^2) additions, n^3/6 + O(n^2) multiplications, n^2/2 + O(n) divisions, and O(n) square root extractions. Thus, the Cholesky method requires half of the number of operations required by Gaussian elimination (since Gaussian elimination requires n^3/3 + O(n^2) additions, n^3/3 + O(n^2) multiplications, and n^2/2 + O(n) divisions). It also requires half of the space (only B is needed, as opposed to both L and U). Furthermore, it can be shown that Cholesky's method is numerically stable.
Remark: If A = BB^⊤, where B is any invertible matrix, then A is symmetric positive definite.

Proof. Obviously, BB^⊤ is symmetric, and since B is invertible, B^⊤ is invertible, and from

$$x^\top A x = x^\top B B^\top x = (B^\top x)^\top B^\top x,$$

it is clear that x^⊤Ax > 0 if x ≠ 0.
We now give three more criteria for a symmetric matrix to be positive definite.
Proposition 6.11. Let A be any n × n symmetric matrix. The following conditions are
equivalent:
(a) A is positive definite.
(b) All the leading principal minors of A are positive; that is, det(A[1..k, 1..k]) > 0 for k = 1, . . . , n (Sylvester's criterion).
(c) A has an LU -factorization and all pivots are positive.
(d) A has an LDL^⊤-factorization and all pivots in D are positive.
Proof. By Proposition 6.9, if A is symmetric positive definite, then each matrix A[1..k, 1..k] is symmetric positive definite for k = 1, . . . , n. By the Cholesky decomposition, A[1..k, 1..k] = QQ^⊤ for some invertible matrix Q, so det(A[1..k, 1..k]) = det(Q)^2 > 0. This shows that (a) implies (b).
If det(A[1..k, 1..k]) > 0 for k = 1, . . . , n, then each A[1..k, 1..k] is invertible. By Proposition 6.2, the matrix A has an LU-factorization, and since the pivots πk are given by

$$\pi_k = \begin{cases} a_{11} = \det(A[1..1, 1..1]) & \text{if } k = 1,\\[6pt] \dfrac{\det(A[1..k, 1..k])}{\det(A[1..k-1, 1..k-1])} & \text{if } k = 2, \ldots, n, \end{cases}$$

we see that πk > 0 for k = 1, . . . , n. Thus (b) implies (c).
Assume A has an LU -factorization and that the pivots are all positive. Since A is
symmetric, this implies that A has a factorization of the form
A = LDL^⊤,
with L lower-triangular with 1’s on its diagonal, and where D is a diagonal matrix with
positive entries on the diagonal (the pivots). This