

 MATH3204/7234 (S2-2022): Assignment 03

1. QR Factorization [2 marks each]
(a) Gram–Schmidt Orthogonalization: Let m = 1,000, n = 100, and independently generate two real-valued random m × n/2 matrices A1 and A2. Now consider the m × n matrix A = [A1 | A1 + εA2] for ε = 10^{-8}. (Here, [A | B] denotes a matrix that is formed by stacking A and B horizontally, i.e., column-wise.)
(i) Use Matlab or Python to find the rank, the smallest, and the largest singular
values of A.
(ii) Implement and apply the classical Gram–Schmidt procedure to obtain the matrix Q, and subsequently measure ‖I − Q^T Q‖. (Note: the norm here is the matrix spectral norm; a minimal code sketch covering items (i)–(iii) is given after item (iv) below.)
(iii) Now, do the same using the modified Gram–Schmidt procedure.
(iv) What do you observe and why? Justify your answer.
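For items (i)–(iii), the following Python sketch illustrates one possible setup and both Gram–Schmidt variants. The function names (cgs, mgs), the fixed random seed, and the use of numpy.linalg.norm(..., 2) for the spectral norm are illustrative choices, not requirements.

import numpy as np

# Build A = [A1 | A1 + eps*A2] as in Part (a); the seed is arbitrary.
m, n, eps = 1000, 100, 1e-8
rng = np.random.default_rng(0)
A1 = rng.standard_normal((m, n // 2))
A2 = rng.standard_normal((m, n // 2))
A = np.hstack([A1, A1 + eps * A2])

# (i) rank and extreme singular values
s = np.linalg.svd(A, compute_uv=False)
print(np.linalg.matrix_rank(A), s.max(), s.min())

def cgs(A):
    """Classical Gram-Schmidt: project against the original j-th column."""
    m, n = A.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    for j in range(n):
        R[:j, j] = Q[:, :j].T @ A[:, j]
        v = A[:, j] - Q[:, :j] @ R[:j, j]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def mgs(A):
    """Modified Gram-Schmidt: update the remaining columns immediately."""
    m, n = A.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    V = A.astype(float)
    for j in range(n):
        R[j, j] = np.linalg.norm(V[:, j])
        Q[:, j] = V[:, j] / R[j, j]
        R[j, j + 1:] = Q[:, j] @ V[:, j + 1:]
        V[:, j + 1:] -= np.outer(Q[:, j], R[j, j + 1:])
    return Q, R

# (ii)-(iii) loss of orthogonality, measured in the spectral norm
for name, alg in [("classical", cgs), ("modified", mgs)]:
    Q, _ = alg(A)
    print(name, np.linalg.norm(np.eye(n) - Q.T @ Q, 2))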
(b) Householder Reflections: An alternative method for computing the QR factorization is the celebrated Householder triangularization, which is numerically more stable than (both versions of) the Gram–Schmidt procedure. We saw in the lecture that the Gram–Schmidt iteration amounts to a successive application of triangular matrices on the right of A as
A R1 R2 · · · Rn = Q,    where R1 R2 · · · Rn = R^{-1}.
In contrast, the Householder method amounts to a succession of elementary unitary
matrices on the left of A as
Qn · · · Q2 Q1 A = R,    where Qn · · · Q2 Q1 = Q^*.
The Householder method is based on an ingenious idea originally proposed by Alston Householder in 1958. Essentially, each matrix Qk is chosen to introduce zeros below the kth entry of the kth column of A, while preserving all the zeros previously introduced. More specifically, consider the first column of A, i.e., a1. The orthogonal transformation Q1 is formed such that

   Q1 a1 = [r11, 0, . . . , 0]^T.
Among all possible choices for such Q1, the Householder algorithm chooses Q1 to
be a particular matrix called a Householder reflector.
Definition
Given a unit vector u, the matrix F = I − 2uu^* is called a Householder reflector.
The term “reflection” comes from the fact that Fu = −u.
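As a quick check of this terminology (using only that u is a unit vector, so u^*u = 1):

   Fu = (I − 2uu^*)u = u − 2u(u^*u) = u − 2u = −u.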
(i) Show that any Householder reflector matrix is unitary.
(ii) Suppose the matrix A is real. Show that Q1 can be formed with either of

   u = (a1 + ‖a1‖ e1) / ‖a1 + ‖a1‖ e1‖,    or    u = (a1 − ‖a1‖ e1) / ‖a1 − ‖a1‖ e1‖.
Mathematically, either choice is perfectly fine. However, numerically, this is not the case. In fact, to minimize sensitivity to rounding errors, it is desirable to reflect a1 to a vector that is not too close to a1 itself. To achieve this, the “±” above is chosen according to the sign of the first component of a1, i.e., a11. More specifically, we set
   u = (a1 + sign(a11) ‖a1‖ e1) / ‖a1 + sign(a11) ‖a1‖ e1‖.
So, after applying Q1, we will have

   Q1 A = [ ?  ?  ?  . . .  ?
            0  ?  ?  . . .  ?
            0  ?  ?  . . .  ?
            ⋮  ⋮  ⋮         ⋮
            0  ?  ?  . . .  ? ],

where “?” here just denotes unspecified values that can be non-zero. To form Q2, we let F2 be the Householder reflector corresponding to the “?” block of Q1 A (i.e., the block obtained by deleting its first row and first column), and set

   Q2 = [ 1   0
          0   F2 ].

Now, applying Q2 gives

   Q2 Q1 A = [ ?  ?  ?  . . .  ?
               0  ?  ?  . . .  ?
               0  0  ?  . . .  ?
               0  0  ?  . . .  ?
               ⋮  ⋮  ⋮         ⋮
               0  0  ?  . . .  ? ].
The Householder method amounts to repeating a similar procedure n times to zero
out the appropriate parts of A to obtain an upper-triangular m × n matrix R with
non-negative diagonal entries. Note that unlike Gram–Schmidt, which generated
the reduced QR factors, the Householder method produces the full QR.
(iii) A naïve implementation of the Householder algorithm^a is given in Appendix A. Apply the algorithm to the matrix from Part (a) and measure ‖I − Q^T Q‖ (a brief usage sketch is given at the end of this part).
(iv) Report your observations as compared with those from Part (a).
^a In our implementation, we have not really tried to be efficient in terms of speed, memory footprint, storage, etc. In practice, these issues are taken into account in the implementation.
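For item (iii), a possible driver is sketched below. It assumes the Python routine from Appendix A has been saved as house.py (the module name is just an assumption) and that A is the 1000 × 100 matrix already built for Part (a).

import numpy as np
from house import house   # Appendix A routine; module name is hypothetical

Q, R = house(A)            # A is the matrix from Part (a)
# spectral norm of I - Q^T Q; Q here is the full m x m orthogonal factor
print(np.linalg.norm(np.eye(Q.shape[0]) - Q.T @ Q, 2))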
2. Krylov Subspace [8 marks each]
Consider the linear system Ax = b, where A ∈ R^{n×n} is non-singular and b ∈ R^n.
(a) Consider a generic Krylov subspace method that generates iterates as
xk ∈ x0 + Kk (A, r0), 1 ≤ k ≤ t,
where r0 = b − Ax0 and t is the grade of r0 with respect to A. Show that
Span{r0, r1, . . . , rk−1} ⊆ Kk(A, r0), k ≤ t, where ri = b − Axi.
(b) Assuming A ≻ 0, consider the oblique projection framework where Wk = AKk,
and show that Span{r0, r1, . . . , rk−1} = Kk (A, r0), k ≤ t.
3. GMRES [5 marks each]
Consider the kth iteration of the generalized minimum residual (GMRES) method for solving Ax = b, where A ∈ R^{n×n} and b ∈ R^n. Let Hk+1,k = Uk+1,k Rk be the reduced QR factorization of Hk+1,k.
(a) Show that the least-squares sub-problem of GMRES, i.e.,

   min_y ‖ Hk+1,k y − ‖r0‖ e1 ‖,

is solved using

   y = Rk^{-1} Uk+1,k^T ‖r0‖ e1,

where r0 is the initial residual r0 = b − Ax0 and e1 = [1, 0, 0, . . . , 0]^T ∈ R^{k+1}.
(b) Show that the residual at the kth iteration can be computed as

   ‖b − Axk‖ = ‖r0‖ √( 1 − ‖ Uk+1,k^T e1 ‖^2 ).
There are many situations in which mathematical models lead to linear systems where
the matrix is non-symmetric. One such example is the numerical discretization of the
steady-state convection-diffusion equation. This equation describes physical phenomena
that involve interaction among particles of a certain material (diffusion), as well as a form
of motion (convection). An example here would be the manner in which a pollutant,
say, spreads as it moves in a stream of water. It moves along with the water and other
objects, and at the same time it changes its shape and chemical structure and spreads
as its molecules interact with each other in a certain manner that can be described
mathematically. In its simplest form, the convection-diffusion equation is defined on the
open unit square, 0 < x, y < 1, and reads

   −( ∂^2u/∂x^2 + ∂^2u/∂y^2 ) + σ ∂u/∂x + τ ∂u/∂y = g(x, y).
The parameters σ and τ are associated with the convection. When they both vanish,
we get the celebrated Poisson equation. One also imposes a certain boundary condition,
e.g., homogeneous Dirichlet boundary. By considering a grid of points on the domain,
the solution u can be approximated using an appropriate discretization of the partial
derivatives on that grid, which gives rise to a linear system of the form Au = g. Here, A
is a discretization of the differential operator including the boundary conditions, u is a
vector whose components are the approximation of u on the discretization grid, and g is
a vector whose components are values of g on the grid. Obviously, to get more accurate
solutions, one needs to consider a finer grid, which in turn gives rise to a larger-scale
linear system.
The function lcd(β, γ, N) from Assignment 02 generates the matrix A using a discretization mesh of size N × N and centered finite difference schemes for both the first and the second partial derivatives, assuming homogeneous Dirichlet boundary conditions. The values of β and γ are related to σ and τ, respectively (as well as to the grid width, which is related to the fineness of the grid).
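For intuition about what such a matrix looks like, one common centered-difference construction uses Kronecker products, as sketched below. The scaling and ordering conventions of the actual lcd routine from Assignment 02 may differ, so treat this purely as an illustration; here β and γ are taken to play the role of σh/2 and τh/2, which may or may not match lcd's convention.

import numpy as np
import scipy.sparse as sp

def convection_diffusion(beta, gamma, N):
    # Centered differences for -(u_xx + u_yy) + sigma*u_x + tau*u_y on an
    # N x N interior grid with homogeneous Dirichlet boundary, scaled by h^2.
    # Illustrative only: lcd from Assignment 02 may use a different convention.
    e = np.ones(N)
    Tx = sp.diags([-(1 + beta) * e[1:], 2 * e, (beta - 1) * e[1:]], [-1, 0, 1])
    Ty = sp.diags([-(1 + gamma) * e[1:], 2 * e, (gamma - 1) * e[1:]], [-1, 0, 1])
    I = sp.identity(N)
    return (sp.kron(I, Tx) + sp.kron(Ty, I)).tocsr()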
Set N = 100, β = γ = 0.1, and construct the convection-diffusion matrix A. To set up
the problem, find the right-hand side vector g such that the solution u is a vector of all 1’s.
Now, forget that we actually know u and let’s use GMRES to solve the non-symmetric
linear system Au = g.
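A minimal setup sketch is given below, assuming A comes either from your own lcd(β, γ, N) of Assignment 02 or from the illustrative construction sketched above.

import numpy as np

N, beta, gamma = 100, 0.1, 0.1
A = convection_diffusion(beta, gamma, N)   # or: A = lcd(beta, gamma, N)
u_true = np.ones(N * N)                    # prescribe the exact solution
g = A @ u_true                             # right-hand side so that A u = g has u = 1
x0 = np.zeros(N * N)                       # GMRES starting point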
(c) Write your own program for implementing GMRES. Use the Householder reflections code in Appendix A for computing the QR factorization of Hk+1,k. Run your code on the above linear system, starting at x0 = 0, and stop the iterations when either ‖rk‖ / ‖b‖ ≤ 10^{-6} or the iteration count exceeds 1000. Plot the residuals across iterations.
(d) Compare your result with that obtained using MATLAB’s gmres function or Python’s scipy.sparse.linalg.gmres. Hopefully, it is identical; otherwise, something is wrong in your code.
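For the comparison in part (d), the residual history of SciPy’s solver can be collected with a callback, as in the sketch below. Keyword names and semantics have shifted between SciPy versions (e.g., tol vs rtol, and how maxiter interacts with restart), and whether the reported value is an absolute or a relative residual norm also depends on the version, so check the documentation of your installed release.

import numpy as np
import matplotlib.pyplot as plt
from scipy.sparse.linalg import gmres

res = []                                     # residual norm reported at each inner iteration
x, info = gmres(A, g, x0=np.zeros(A.shape[0]),
                rtol=1e-6,                   # 'tol' in older SciPy releases
                restart=1000, maxiter=1,     # one long cycle, i.e. (capped) full GMRES
                callback=res.append, callback_type='pr_norm')
print(info)                                  # 0 means the tolerance was met
plt.semilogy(res)
plt.xlabel('iteration')
plt.ylabel('residual norm')
plt.show()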
For non-symmetric matrices, where Arnoldi cannot be replaced by Lanczos, there is a significant price to pay when using GMRES. As we proceed with iterations, we accumulate more and more basis vectors of the Krylov subspace. We need to have those around since the solution is given as their linear combination. As a result, the storage requirements keep creeping up as we iterate. This may become a burden if the matrix is very large and if convergence is not very fast, which often occurs in realistic scenarios.
One way of dealing with this difficulty is by using restarted GMRES. We run GMRES as described above, but once a certain maximum number of vectors, say m ≪ n, is reached, we take the current iterate, xm, as our initial guess, i.e., x0 = xm, and start again. Thus, at any given time, the maximum number of vectors that are stored is no more than m.
Of course, restarted GMRES is almost certain to converge more slowly than full GMRES. Moreover, we do not really have an optimality criterion anymore, although we still have monotonicity in the residual norm reduction. But despite these drawbacks of restarted GMRES, in practice it is usually preferred over full GMRES, since the memory requirement issue is critical in large-scale settings.
(e) Write your own program for implementing restarted GMRES. As above, use your own Householder reflections code or the one provided in the Appendix for computing the QR factorization of Hk+1,k. Run your code on the same example and with the same parameters as used above for full GMRES. Set the restart parameter to m = 20. Plot the residuals across iterations on the same plot as full GMRES, so you can compare the full and restarted variants at a glance.
(f) Compare your result with that obtained using scipy.sparse.linalg.gmres or MATLAB’s gmres function, using their respective restart parameter. Although your code might run slower (since we did not focus on optimizing the code), it should hopefully generate identical iterates; otherwise, something is wrong in your code. Plot the residuals across iterations for MATLAB’s or Python’s native function to verify this.
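For part (f), the same SciPy call with the library’s restart parameter gives the reference restarted iterates; again, the exact keyword names depend on your SciPy version.

from scipy.sparse.linalg import gmres

res_restart = []
x, info = gmres(A, g, x0=np.zeros(A.shape[0]),
                rtol=1e-6,                   # 'tol' in older SciPy releases
                restart=20, maxiter=50,      # 50 cycles of length 20, roughly the 1000-step budget
                callback=res_restart.append, callback_type='pr_norm')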
4. Optimality Condition [2 marks each]
Consider the problem

   min  x1^2 + 0.5 x2^2 + 3 x2 + 4.5
   s.t. x1, x2 ≥ 0.
Recall the first-order necessary condition for a local minimizer.
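For reference, one standard way of writing this condition for minimization over the nonnegative orthant (check it against the exact statement given in lectures) is: if x* is a local minimizer of f subject to x ≥ 0, then ∇f(x*)^T (x − x*) ≥ 0 for every feasible x; componentwise, ∂f/∂xi(x*) ≥ 0 whenever xi* = 0, and ∂f/∂xi(x*) = 0 whenever xi* > 0.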
(a) Is the condition satisfied at (x1, x2) = (1, 3)? Justify your answer.
(b) Is the condition satisfied at (x1, x2) = (0, 3)? Justify your answer.
(c) Is the condition satisfied at (x1, x2) = (1, 0)? Justify your answer.
(d) Is the condition satisfied at (x1, x2) = (0, 0)? Justify your answer.
5. Optimality Condition
Let f : R^2 → R be a differentiable function. Suppose that a point x* is a local minimum of f along every line that passes through x*, i.e., the function g(α) := f(x* + αp) is minimized at α = 0 for all p ∈ R^2.
(a) Show that ∇f(x*) = 0. [5 marks]
(b) By way of an example, show that x* need not be a local minimum of f. (Hint: Consider the function of two variables f(x, y) = (y − β1 x^2)(y − β2 x^2), where 0 < β1 < β2, and investigate the claim at (x, y) = (0, 0).) [15 marks]
6. Convexity [10 marks]
Let f1, . . . , fn : C → R be n convex functions over the convex set C ⊆ R^d. Show that the function f : C → R defined as

   f(x) = max_{i=1,...,n} fi(x)

is convex over C. (Hint: You can use the fact that for any two sequences {ai}_{i=1}^n and {bi}_{i=1}^n, we have max_{i=1,...,n}(ai + bi) ≤ max_{i=1,...,n} ai + max_{i=1,...,n} bi.)
100 marks in total
Note:
• This assignment counts for 15% of the total mark for the course.
• Although not mandatory, if you could type up your work, e.g., in LaTeX, it would be greatly appreciated.
• Show all your work and attach your code and all the plots (if there is a programming
question).
• Combine your solutions and all additional files, such as your code and numerical results, into one single PDF file.
• Please submit your single PDF file on Blackboard.
A Householder Reflections Code
function [Q, R] = house(A)
% [Q, R] = house(A)
%
% Perform QR decomposition using Householder reflections.
% Assumes m > n.
% This code is not memory efficient and stores the factor Q. In a real
% code, one can store only the Householder vectors, e.g., in the
% parts of A that have been zeroed out.

[m, n] = size(A);
Q = eye(m, m);
for k = 1:n
    % define u of length m - k + 1
    z = A(k:m, k);
    e1 = [1; zeros(m - k, 1)];
    u = z + sign(z(1)) * norm(z) * e1;
    u = u / norm(u);
    % update the nonzero part of A by I - 2*u*u'
    A(k:m, k:n) = A(k:m, k:n) - 2 * u * (u' * A(k:m, k:n));
    r = length(u);
    Q = blkdiag(eye(m - r), eye(r, r) - 2 * (u * u')) * Q;
end
Q = Q';
R = A;
end
Listing 1: For Matlab use the above code snippet.
import numpy as np
from scipy.linalg import block_diag


def house(A):
    # Perform QR decomposition using Householder reflections.
    # Assumes m > n.
    # This code is not memory efficient and stores the factor Q. In a real
    # code, one can store only the Householder vectors, e.g., in the
    # parts of A that have been zeroed out.
    m, n = A.shape
    R = np.array(A, dtype=float)  # work on a floating-point copy of A
    Q = np.eye(m)
    for i in range(n):
        a1 = R[i:m, i].reshape(m - i, 1)
        e1 = np.zeros((m - i, 1))
        e1[0, 0] = 1
        u = a1 + np.sign(a1[0]) * np.linalg.norm(a1) * e1
        u = u / np.linalg.norm(u)
        R[i:m, i:n] = R[i:m, i:n] - 2 * u.dot(u.T.dot(R[i:m, i:n]))
        P = block_diag(np.eye(i), np.eye(m - i) - 2 * u.dot(u.T))
        Q = P.dot(Q)
    return Q.T, R
Listing 2: For Python use the above code snippet.