# MATH 504 Homework 12

1. • Section 5.4 in Elements of Statistical Learning covers spline fitting with penalty terms.
   • Sections 12.3 and 12.4 in Sauer cover SVD.
2. This problem repeats the spline regression problems of Homework 11, but now we
will use many knots. Specifically, consider the case of 1000 knots equally spaced
between the minimum and maximum x values (ages) of the dataset. As before,
for this problem let’s consider solely the female portion of the BoneMass
dataset (file attached).
B-splines are a choice of basis functions for splines that produces design/model
matrices that are not badly conditioned. In the setting of 1000 knots, such a
basis is essential. Choosing basis functions of the form $h_i(x) = ([x - \zeta_i]_+)^3$, as
in the last homework, will lead to huge condition numbers and the regression
will not be possible. This is an example of the importance of a good basis in
computation.
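The conditioning claim can be checked numerically on a small case. The sketch below is only an illustration under assumptions: the x values are synthetic, there are just 19 hypothetical interior knots, and it uses `bs()` from the splines package (rather than splineDesign) so that both bases span the same cubic spline space. The helper `cond2` is an illustrative name.

```r
library(splines)

# Assumption: synthetic x values stand in for the ages in the dataset.
set.seed(1)
x <- sort(runif(500))
zeta <- seq(0.05, 0.95, length.out = 19)  # hypothetical interior knots

# Truncated power basis: 1, x, x^2, x^3 and ([x - zeta_i]_+)^3 for each knot.
H <- cbind(1, x, x^2, x^3, outer(x, zeta, function(x, z) pmax(x - z, 0)^3))

# B-spline basis for the same spline space (same knots, intercept included).
Bs <- bs(x, knots = zeta, intercept = TRUE)

# 2-norm condition number: largest singular value over smallest.
cond2 <- function(M) {
  d <- svd(M, nu = 0, nv = 0)$d
  max(d) / min(d)
}

kappa_H <- cond2(H)
kappa_B <- cond2(Bs)
print(c(truncated_power = kappa_H, bspline = kappa_B))
```

Both matrices have the same number of columns, so any gap between the two condition numbers comes from the choice of basis, not from the size of the model.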
To manipulate B-splines, R provides the function splineDesign as part of the
splines package. Here are the two ways you will need to call splineDesign
to apply spline regression with a penalty method.
    B <- splineDesign(knots=myknots, x=x, outer.ok=TRUE)
    Bpp <- splineDesign(knots=myknots, x=mygrid, derivs=2, outer.ok=TRUE)
Above, B will be a design matrix, meaning that $B_{ij} = b_j(x_i)$, where $b_j(x)$ is
the jth spline function in the B-spline basis and $x_i$ is the ith x sample from the
dataset. Bpp is similar, except that given the flag derivs=2, $\mathrm{Bpp}_{ij} = b_j''(z_i)$
(i.e. the second derivative of $b_j(x)$), evaluated at the ith point $z_i$ of mygrid.
See below for the meaning of mygrid.
Note: Using splineDesign, the dimension of the spline space is $K - 4$, where
K is the number of knots, rather than our usual $K + 4$. splineDesign places
some constraints on the behavior of the splines at the end points that restrict
the dimension to $K - 4$ rather than $K + 4$. This issue is minor and does not
change the computations that must be done. B will have dimension N by
$K - 4$, where N is the length of x. Bpp will have $K - 4$ columns, but the
number of rows will depend on the number of values in mygrid.
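The dimensions described in the note can be confirmed directly. This is a minimal sketch, assuming synthetic ages in place of the BoneMass data and a hypothetical grid of 4000 points; only the shapes of B and Bpp matter here.

```r
library(splines)

# Assumption: synthetic x values, not the real ages from the dataset.
set.seed(1)
x <- sort(runif(200, min = 10, max = 25))

K <- 1000
myknots <- seq(min(x), max(x), length.out = K)     # equally spaced knots
mygrid <- seq(min(x), max(x), length.out = 4000)   # hypothetical dense grid

B <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

print(dim(B))    # N rows, K - 4 columns
print(dim(Bpp))  # length(mygrid) rows, K - 4 columns
```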
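(a) Consider minimizing the following penalized least squares, which we discussed in class:
$$\min_{\alpha}\ \|y - B\alpha\|^2 + \rho\,\alpha^T \Omega\,\alpha, \qquad \Omega_{j\ell} = \int b_j''(x)\, b_\ell''(x)\,dx \approx \sum_i b_j''(z_i)\, b_\ell''(z_i)\,\Delta z,$$
where the $z_i$ are the points you specify in mygrid, $b_\ell''(z_i)$ is the $(i, \ell)$ entry
of Bpp, and $\Delta z$ is the distance between the points in mygrid. Make sure
mygrid forms a dense grid of values so that you get accurate estimates
using Riemann integration.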
(c) Show that the matrix $B^T B$ is not invertible but that $B^T B + \rho\Omega$ is for
$\rho > 0$. You can do this by just computing the determinant for various
values of $\rho$ in R or through theoretical arguments.
(d) Calculate $\alpha$ for (a) $\rho = 0.01$, (b) $\rho = 1$ and (c) $\rho = 100$. In each case
plot the smoothing spline (hint: B provides you discretized versions of
the $b_j(x)$; linearly combine them using $\alpha$ to form the fitted spline) and
the data points. Comment on the fit in each case.
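The computations in the parts above can be sketched end to end on a small case. Everything specific here is an assumption: the data are synthetic (not BoneMass), the setup is reduced to 100 knots on [0, 1] to stay fast, and the fit assumes the minimizer solves the usual penalized normal equations $(B^T B + \rho\Omega)\alpha = B^T y$. Names such as `Omega`, `BtB` and `fits` are illustrative.

```r
library(splines)

# Assumption: synthetic (x, y) data on [0, 1], not the BoneMass dataset, with
# fewer data points (N = 60) than basis functions (K - 4 = 96).
set.seed(1)
N <- 60
x <- sort(runif(N))
y <- sin(2 * pi * x) + rnorm(N, sd = 0.3)

K <- 100
myknots <- seq(0, 1, length.out = K)
mygrid <- seq(0, 1, length.out = 2001)   # dense grid for the Riemann sum
dz <- mygrid[2] - mygrid[1]

B <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

# Riemann sum: Omega[j, l] = sum_i b_j''(z_i) * b_l''(z_i) * dz
Omega <- crossprod(Bpp) * dz
BtB <- crossprod(B)

# rank(BtB) = rank(B) <= N, so BtB cannot be invertible when N < K - 4.
rank_B <- qr(B)$rank
cat("rank of B:", rank_B, " columns of B:", ncol(B), "\n")

rhos <- c(0.01, 1, 100)

# Log-determinants avoid overflow and underflow for large matrices.
logdets <- sapply(rhos, function(rho) {
  as.numeric(determinant(BtB + rho * Omega, logarithm = TRUE)$modulus)
})
print(data.frame(rho = rhos, log_abs_det = logdets))

# Solve (B^T B + rho * Omega) alpha = B^T y, then evaluate the fit B alpha.
fits <- sapply(rhos, function(rho) {
  alpha <- solve(BtB + rho * Omega, crossprod(B, y))
  as.vector(B %*% alpha)
})

cols <- c("red", "blue", "darkgreen")
plot(x, y, pch = 16, col = "grey40")
matlines(x, fits, lty = 1, col = cols)
legend("topright", legend = paste("rho =", rhos), lty = 1, col = cols)
```

Using `solve(M, b)` rather than forming the inverse explicitly is the usual choice for a linear system like this. The values of $\rho$ act on a different scale here than they would for the age data, so the plot is only a template for the comparison the problem asks for.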
3. This problem will demonstrate an application of the SVD. In the problem below,
the rows of the A matrix are formed from a single vector, with some added
noise. The rank of a matrix is the dimension of the span of its row or column
vectors (it turns out that the dimension is the same for row span and column
span). With the noise, the A matrix has rank 10, but
without the noise the A matrix would have rank 1. The SVD
allows us to compute the lower rank matrix and extract the underlying signals
from A.
(a) The script signals.R constructs a 500 × 10 matrix, A, which is saved
to the file A.txt. Each row of A has the form (q*sig + noise), where sig is a
fixed signal, a 10 dimensional vector. Also saved, in the
file no_noise_A, is a matrix with rows given by q*sig (i.e. no noise).
Finally, the file q gives the q value used for each row. Look through
signals.R and make sure you understand how A is constructed and
the form of sig. Plot the values in the first row of A, the first row of
no_noise_A, and the underlying signal. Can you tell what the signal is
by looking at A? By averaging the columns of A?
(b) Perform an SVD on A using R’s svd function. Plot the singular values
and comment on their values given what we know about A. Consider
the approximation $A_1$ of A, where $A_1 = s_1 u^{(1)} (v^{(1)})^T$. Use image(A),
image(no_noise_A) and image(A_1) to visualize the three matrices
and confirm that $A_1$ removes the noise from A. Compare the first row of
A, no_noise_A, and $A_1$. Given a row of A in the form q*sig + noise,
what role do $v^{(1)}$, $s_1$ and $u^{(1)}$ have in capturing q and sig?
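The mechanics of the rank-1 approximation can be sketched on a synthetic matrix. Everything below is an assumption for illustration: signals.R may build sig, q and the noise differently, and the names `err_A`, `err_A1` and `alignment` are hypothetical.

```r
# Assumption: a synthetic stand-in for A; signals.R may differ.
set.seed(1)
sig <- sin(seq(0, 2 * pi, length.out = 10))  # hypothetical 10-dimensional signal
q <- runif(500, min = 1, max = 5)            # hypothetical per-row scale
no_noise_A <- outer(q, sig)                  # rank-1 matrix with rows q * sig
A <- no_noise_A + matrix(rnorm(500 * 10, sd = 0.5), nrow = 500)

s <- svd(A)
print(round(s$d, 2))                         # singular values

# Rank-1 approximation A_1 = s_1 u^(1) (v^(1))^T
A1 <- s$d[1] * outer(s$u[, 1], s$v[, 1])

# Frobenius distance to the noise-free matrix, before and after truncation.
err_A <- norm(A - no_noise_A, type = "F")
err_A1 <- norm(A1 - no_noise_A, type = "F")
print(c(err_A = err_A, err_A1 = err_A1))

# Alignment (up to sign) between v^(1) and the normalized signal.
alignment <- abs(sum(s$v[, 1] * sig)) / sqrt(sum(sig^2))
print(alignment)
```

The same calls apply to the matrix read from A.txt; `image(A)`, `image(no_noise_A)` and `image(A1)` can then be compared side by side.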
