# Homework 10

1. • Required: View https://www.youtube.com/watch?v=k3AiUhwHQ28. This is a lecture
by Suvrit Sra. He is guest lecturing in Gil Strang’s MIT course on computational
methods in machine learning. View the whole lecture as it connects to many themes we
have been discussing.
• Optional: Sections 3.1 and 3.2 of Sauer discuss polynomial interpolation. Section 3.4 discusses spline interpolation.
2. This problem provides an example of how interpolation can be used. The attached spreadsheet
provides life expectancy data for the US population. The second column gives the probability
of death for the given age. So, for example, the probability that a person between the ages
of 20 and 21 dies is 0.000894.
Suppose a 40 year old decides to buy life insurance. The 40 year old will make payments
of \$200 every month until death. In this problem we will consider the worth of
these payments, a quantity of interest to the insurance company. The payoff upon death will
not be considered in this problem. If we assume a (continuous-time) interest rate of 5% and
let m be the number of months past age 40 that the person lives, then the present value of
the payments (how much future payments are worth in today’s dollars) is
$$\mathrm{PV}(m) = \sum_{i=1}^{m} 200\, e^{-0.05\, i/12}.$$
Our goal is to determine the average of PV, in other words E[PV]. For the insurance company,
this is one way to measure the revenue brought in by the policy. The difficulty is that our data
is yearly, while payments are made monthly and people do not always die at the beginning
of the month.
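For concreteness, the present-value sum can be sketched in Python (a minimal sketch; it assumes the i-th payment arrives i/12 years after age 40):

```python
import math

def pv(m, rate=0.05, payment=200):
    """Present value of m monthly payments of `payment` dollars,
    discounted continuously at annual rate `rate` (payment i arrives
    i/12 years from now)."""
    return sum(payment * math.exp(-rate * i / 12) for i in range(1, m + 1))
```

Note that discounting always makes pv(m) smaller than the undiscounted total 200m.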
(a) Let L(t) be the probability the 40 year old lives past the age 40+t where t is any positive
real number. Estimate L(t) by first considering t = 0, 1, 2, . . . . These values of L(t) can
be computed using the spreadsheet data. (For example, for the 40 year old to live to 42,
they must not die between the ages 40 and 41 nor between the ages 41 and 42.) For other t values, interpolate
using a cubic spline. In R you can use the spline and splinefun commands to construct
cubic splines, see the help documentation. Graph the interpolating cubic spline of L(t)
and include the datapoints, i.e. L(t) for t = 0, 1, 2, . . . .
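If you prefer Python to R, SciPy's CubicSpline plays the role of splinefun. A minimal sketch (the yearly death probabilities q below are made-up placeholders for the spreadsheet's second column):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Made-up yearly death probabilities standing in for the spreadsheet's second
# column: q[t] = P(die between age 40+t and 40+t+1 | alive at age 40+t).
q = np.array([0.002, 0.0022, 0.0024, 0.0026, 0.0028])

# L(t) at integer knots: to live past 40+t you must survive each of the
# first t years, so L(t) is a running product of (1 - q) factors.
t_knots = np.arange(len(q) + 1)
L_knots = np.concatenate(([1.0], np.cumprod(1 - q)))

# Cubic-spline interpolant for non-integer t (the analogue of R's splinefun).
L = CubicSpline(t_knots, L_knots)
```

For the plot, evaluate L on a fine grid with matplotlib and overlay the integer knot points.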
(b) Explain why the expected (average) present value of the payments is given by
$$E[\mathrm{PV}] = \sum_{m=1}^{\infty} \mathrm{PV}(m)\left[L\!\left(\tfrac{m-1}{12}\right) - L\!\left(\tfrac{m}{12}\right)\right],$$
where $L\!\left(\tfrac{m-1}{12}\right) - L\!\left(\tfrac{m}{12}\right)$ is the probability of death during month m. In practice we can’t sum to ∞; choose an appropriate cutoff and calculate E[PV].
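A self-contained Python sketch of the truncated sum (the survival values below are made-up placeholders; in the homework they come from the spreadsheet via part (a)):

```python
import math
import numpy as np
from scipy.interpolate import CubicSpline

# Made-up survival values L(t) at integer t (placeholders for part (a)'s output).
L = CubicSpline(np.arange(6), [1.0, 0.998, 0.9958, 0.9934, 0.9908, 0.988])

def pv(m, rate=0.05, payment=200):
    # Present value of m monthly payments under continuous discounting.
    return sum(payment * math.exp(-rate * i / 12) for i in range(1, m + 1))

# E[PV] ~ sum over the month of death m of PV(m) times the probability of
# dying in month m, which is L((m-1)/12) - L(m/12).
cutoff = 5 * 12  # months; in the homework, pick a cutoff covering the table
E_PV = sum(pv(m) * (float(L((m - 1) / 12)) - float(L(m / 12)))
           for m in range(1, cutoff + 1))
```

The omitted tail is small as long as the cutoff is large enough that L(cutoff/12) is close to zero.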
3. Consider the MNIST dataset from homework 6. Recall, in that homework, we used a logistic
regression in 784 dimensions to build a classifier for the number 3. Here, we will use PCA to
visualize and dimensionally reduce the dataset.
(a) In order to visualize the dataset, apply a two-dimensional PCA to the dataset and
plot the coefficients for the first two principal components. Use orthogonalized power
iteration to compute the two principal components yourself. (Don’t forget to subtract
off the mean!) Color the points according to the number represented by the image in
the sample, i.e. the value given in the first column of mtrain.csv. (You can use the
first 1000 rows since plotting 60,000 points takes a while.)
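A sketch of orthogonalized power iteration in Python, with random data standing in for the first 1000 MNIST rows (in the homework these come from mtrain.csv):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 784))     # made-up stand-in for the MNIST rows

Xc = X - X.mean(axis=0)              # subtract off the mean first

# Orthogonalized power iteration: repeatedly apply the (unnormalized)
# covariance X^T X and re-orthonormalize the two columns via QR
# (a Gram-Schmidt step), so each column converges to a principal component.
V = rng.normal(size=(784, 2))
for _ in range(100):
    V = Xc.T @ (Xc @ V)
    V, _ = np.linalg.qr(V)

scores = Xc @ V                      # the two PCA coefficients per image
```

For the plot, scatter scores[:, 0] against scores[:, 1] and color each point by its label from the first column of mtrain.csv.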
(b) Apply the PCA to reduce the dimensionality of the dataset from 784 to a dimension k.
(Don’t forget to subtract off the mean!) For some different values of k, do the following
i. Determine the fraction of the total variance captured by the k-dimensional PCA.
ii. In the file mnist_intro the function show_image displays the image given a
vector of pixels. (For example, if the vector v contains the 784 pixels of a particular
image, then show_image(v) will display the image.) For each value of k, compute
the projection of the image (i.e. the 784-dimensional vector) onto the principal
components. (The projected image will still be a 784-dimensional vector, but it will have
k PCA coefficients, one for each principal component.) Then use show_image(v)
to compare the original image to the projected image. For what k can you begin
to discern the number in the projected image?
What value of k do you think captures the dataset well?
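The two computations in (b) can be sketched as follows. This uses SVD for brevity where the homework asks you to find the components with power iteration, and random data stands in for the image matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 784))          # made-up stand-in for the image matrix
mean = X.mean(axis=0)
Xc = X - mean                            # subtract off the mean

k = 10
# Top-k principal directions via SVD here for brevity; the homework computes
# them with orthogonalized power iteration instead.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Vk = Vt[:k].T                            # shape (784, k)

# (i) Fraction of the total variance captured by the k-dimensional PCA:
# squared singular values are proportional to variances along components.
frac = (s[:k] ** 2).sum() / (s ** 2).sum()

# (ii) Project one image onto the components: still a 784-vector, but it
# lies in a k-dimensional span, i.e. it has only k PCA coefficients.
# Add the mean back before displaying it with show_image.
v = X[0]
v_proj = Vk @ (Vk.T @ (v - mean)) + mean
```

Comparing show_image(v) with show_image(v_proj) for several k shows how quickly the digit becomes recognizable.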
(c) Given your results in (b), choose a dimension k and reduce the dataset from 784 dimensions to k dimensions. Then, build a classifier based on the k-dimensional dataset.
Fit the logistic regression to the whole dataset using a stochastic gradient approach as
discussed in the YouTube video (mentioned above). Use the mtest.csv dataset to
test the accuracy of your classifier. Comment on the time needed to compute the logistic
regression and its accuracy relative to what you found in homework 6.
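A minimal sketch of the stochastic gradient approach in Python, with synthetic k-dimensional features and labels standing in for the reduced MNIST data (all names below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up stand-in for the k-dimensional PCA scores and binary "is it a 3?"
# labels; in the homework these come from the reduced mtrain.csv features.
n, k = 2000, 10
Z = rng.normal(size=(n, k))
w_true = rng.normal(size=k)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-Z @ w_true))).astype(float)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Stochastic gradient ascent on the logistic log-likelihood: each step uses
# the gradient from a single randomly drawn sample, rather than the full sum.
w = np.zeros(k)
lr = 0.1
for _ in range(20000):
    i = rng.integers(n)
    w += lr * (y[i] - sigmoid(Z[i] @ w)) * Z[i]

accuracy = np.mean((sigmoid(Z @ w) > 0.5) == (y == 1.0))
```

Because each step touches one sample instead of the whole dataset, the per-iteration cost is O(k), which is the point of comparison against the full-gradient fit from homework 6.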