Statistics 471 - Homework 3
Due Tuesday March 6, 2018
Homework Assignment Policy and Guidelines
(a) Homework assignments should be well organized and reasonably neat. It is required
that you show your work in order to receive credit.
(b) Unless otherwise stated in a problem, please use R Markdown to write homework
answers.
(c) Homework assignments are due in class unless otherwise noted. Credit will not be
given for homework turned in late.
(d) You may be asked to submit some homework problems online, in addition to a hard
copy that you turn in in class. Such homework problems will be marked online submission.
Your submission should be combined into one PDF or HTML document.
(e) Unless it is specifically stated otherwise, you may work on and submit your homework
in groups of 1 or 2. If you choose to work as a group of 2, both of you should contribute
significantly to the solution for every question and submit only one copy of the
homework with both your names on it. Whether you submit on your own or with a
partner, discussing homework with your fellow students is encouraged. However, after
discussions, every group must ultimately produce their own homework to be graded.
Verbatim copying of homework is absolutely forbidden.
The dataset olympic.csv gives the performance of 34 men in the decathlon at the 1988
Olympics. The dataset has 34 rows and 11 columns. The first 10 columns correspond to the
events of the decathlon: 100 meters (m100), long jump (ljump), shot put (shot), high jump
(hjump), 400 meters (m400), 110-meter hurdles (m110h), discus throw (disc), pole vault
(pvault), javelin (jav) and 1500 meters (m1500). The last column corresponds to the final
score of the particular individual at the Olympics. This dataset will be analyzed in
Problems 1 and 2.
1. online submission In this exercise, we will make Chernoff Faces for data visualization.
Chernoff Faces were proposed in 1973 by Herman Chernoff as a way of displaying multivariate
data. By capitalizing on the fact that human beings are very good at recognizing faces, he
let each feature of the human face (shape of nose, shape of eyes, hair length, etc.) correspond
to a variable in the dataset. As each variable in the dataset varies, the corresponding feature
of the face varies as well.
This method isn’t used widely these days, because most datasets have many more variables
than potential features on a face.1 But since our Olympic dataset has only 11 columns,
we’ll try this out. We will use the package aplpack. The manual can be found at https:
//cran.r-project.org/web/packages/aplpack/aplpack.pdf.
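As a rough sketch of what a call to this package looks like (assuming olympic.csv sits in your working directory and that aplpack has been installed; neither is guaranteed here), one might write:

```r
# Sketch only: assumes olympic.csv is in the working directory
# and that install.packages("aplpack") has already been run.
library(aplpack)

olympic <- read.csv("olympic.csv")

# Draw one Chernoff face per athlete from the ten event columns;
# faces() maps each column of its input to a facial feature.
faces(olympic[, 1:10], labels = rownames(olympic))
```

The exact mapping from columns to facial features is described in the package manual linked above.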
(a) By reading the manual, find a way to implement Chernoff Faces for the first ten columns
of the dataset. In other words, implement Chernoff Faces for the ten Olympic events.
1 But more practically, humans can only absorb so many features.
(b) As there are ten events, there should be ten features on the faces. Write down the events
which the features correspond to, and describe how they vary. Here is an example.
The long jump corresponds to the length of the hair. The further the distance jumped,
the longer the length of the hair.
(c) Based on the Chernoff Faces you've generated, can you tell anything about the data?
For example, if you see a group of faces (perhaps Faces 3, 6, 10, 14) having similar
features, what would that say about the corresponding variables? Write down your
observations. Do they match the data given?
(d) There is one outlier in the data given, and you can spot it as one face generated will
be extremely different compared to all the other faces. Which face is it, and why is it
an outlier?
2. online submission While Chernoff Faces are useful for determining the relationships
between variables in a dataset at a glance, they're not very precise. We thus move on to
PCA. Suppose X is the data matrix corresponding to the first 10 columns of the data in
olympic.csv (we ignore the score).
(a) First, remove the outlier that was detected in Problem 1. Then perform PCA
on this data. Should the variables be scaled?
(b) Create a scree plot and a plot of cumulative proportion of variance. Which principal
components are important?
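To illustrate the mechanics (on simulated stand-in data, since the real dataset is not reproduced here), a scree plot and cumulative-variance plot can be produced with base R's prcomp:

```r
# Simulated stand-in for the ten olympic event columns (33 athletes
# after removing the outlier); replace X with the real data.
set.seed(1)
X <- matrix(rnorm(33 * 10), nrow = 33, ncol = 10)

# The events are measured on very different scales (seconds vs metres),
# which is the kind of consideration part (a) asks about; scale. = TRUE
# standardizes each column to unit variance before the decomposition.
pc <- prcomp(X, scale. = TRUE)

# Scree plot: variance explained by each component.
screeplot(pc, type = "lines", main = "Scree plot")

# Cumulative proportion of variance explained.
cumvar <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
plot(cumvar, type = "b", xlab = "Component",
     ylab = "Cumulative proportion of variance")
```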
(c) We'll now examine the principal components identified in Part (b). By looking at the
signs and magnitudes of the entries in these principal components, give an interpretation
of what they may represent. Does your interpretation correspond to the data you
have? If you think there is no interpretation (e.g., it's just noise), then say so.
(d) Did you find any similarities between your observations using Chernoff Faces and PCA?
Give a brief explanation.
3. online submission Write a program to implement the secant root-finding method:

(1)    x_{n+1} = x_n - f(x_n) (x_{n-1} - x_n) / (f(x_{n-1}) - f(x_n)).
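A minimal sketch of this update, written to mirror the structure of the textbook's root-finding functions (the default tol and max.iter values are assumptions, not the assignment's):

```r
secant <- function(ftn, x0, x1, tol = 1e-9, max.iter = 100) {
  # ftn returns the function value f(x); x0 and x1 are the two
  # starting points required by the secant method
  fx0 <- ftn(x0)
  fx1 <- ftn(x1)
  iter <- 0
  while ((abs(fx1) > tol) && (iter < max.iter)) {
    # secant update: x_{n+1} = x_n - f(x_n)(x_{n-1} - x_n)/(f(x_{n-1}) - f(x_n))
    x2 <- x1 - fx1 * (x0 - x1) / (fx0 - fx1)
    x0 <- x1; fx0 <- fx1
    x1 <- x2; fx1 <- ftn(x1)
    iter <- iter + 1
  }
  if (abs(fx1) > tol) {
    cat("Algorithm failed to converge\n")
    return(NULL)
  } else {
    cat("Algorithm converged\n")
    return(x1)
  }
}

# Example with the function from part (a): root of cos(x) - x
root <- secant(function(x) cos(x) - x, x0 = 1, x1 = 2)
```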
(a) First test your program by finding the root of the function f(x) = cos(x) - x.
(b) Next see how the secant method performs in finding the root of f(x) = log(x) - exp(-x)
using x0 = 1 and x1 = 2. Compare its performance with that of the other two methods.
(c) Write a function secant_show.r that plots the sequence of iterates produced by the
secant algorithm.
4. online submission In the textbook (Jones et al.), the Newton-Raphson code is given.
I reproduce it here for convenience. I strongly suggest playing around with this code and
looking at example inputs and outputs to see what this function does, before continuing on.
1. newtonraphson <- function(ftn, x0, tol = 1e-9, max.iter = 100) {
2. # Newton-Raphson algorithm for solving ftn(x)[1] == 0
3. #
4. # ftn is a function of a single variable that returns the
5. # function value and the first derivative as a vector of length 2
6. #
7. # x0 is the initial guess at the root
8. # the algorithm terminates when the function value is within distance
9. # tol of 0, or the number of iterations exceeds max.iter
10.
11. # initialise
12. x <- x0
13. fx <- ftn(x)
14. iter <- 0
15.
16. # continue iterating until stopping conditions are met
17. while ((abs(fx[1]) > tol) && (iter < max.iter)) {
18. x <- x - fx[1]/fx[2]
19. fx <- ftn(x)
20. iter <- iter + 1
21. cat("At iteration", iter, "value of x is:", x, "\n")
22. }
23.
24. # output depends on success of algorithm
25. if (abs(fx[1]) > tol) {
26. cat("Algorithm failed to converge\n")
27. return(NULL)
28. } else {
29. cat("Algorithm converged\n")
30. return(x)
31. }
32. }
We are going to use the Newton-Raphson algorithm as a baseline when looking at another
root-finding method: Halley's method.
(a) Modify lines 29-30 of the function newtonraphson to give the output x together with
the number of iterations needed for the algorithm to converge. You may add additional
lines of code. The output can be in any format you want. Call this function
newtonraphson1.
(b) In addition to the first derivative, Halley's method requires second-derivative
information as well. Thus, if x_n is our current solution, then:

x_{n+1} = x_n - f(x_n) / [ f'(x_n) - f(x_n) f''(x_n) / (2 f'(x_n)) ]
Write a function halley1 that takes in four inputs:
halley1<-function(ftn, x0, tol, max.iter){
# ftn - a function of a single variable that returns
# the function value, the first derivative, and the
# second derivative as a vector of length 3.
# x0 is the initial guess at the root
# the algorithm terminates when the function value is within distance
# tol of 0, or the number of iterations exceeds max.iter
return(c(root_x, num_iter))
}
The output is a vector of length 2. If the algorithm does not converge, then root_x
should be NA, and num_iter = -1. Else, root_x should be the approximate root x
returned by the algorithm, and num_iter the number of iterations it takes for
convergence.
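One possible sketch of halley1 matching this specification (the update formula from part (b) is assumed; this is an illustration, not the only acceptable solution):

```r
halley1 <- function(ftn, x0, tol, max.iter) {
  x <- x0
  fx <- ftn(x)   # c(f(x), f'(x), f''(x))
  iter <- 0
  while ((abs(fx[1]) > tol) && (iter < max.iter)) {
    # Halley update: x - f / (f' - f * f'' / (2 * f'))
    x <- x - fx[1] / (fx[2] - fx[1] * fx[3] / (2 * fx[2]))
    fx <- ftn(x)
    iter <- iter + 1
  }
  if (abs(fx[1]) > tol) {
    return(c(NA, -1))    # did not converge
  } else {
    return(c(x, iter))   # root and iteration count
  }
}

# Example: root of f(x) = x^2 - 2, i.e. sqrt(2)
ftn2 <- function(x) c(x^2 - 2, 2 * x, 2)
res <- halley1(ftn2, 2, 1e-6, 100)
```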
(c) In both newtonraphson1 and halley1, we had an input ftn that was a function of
a single variable which returns the function value, the first derivative, and the second
derivative (for halley1). However, this means we would have to write a new function
for every f(x) we have. For example, if we wanted to find the root of f(x) = x^2 using
halley1, we would have to write:
ftnxsquare<-function(x){
fx = x^2
dfx = 2*x
ddfx = 2
return(c(fx,dfx,ddfx))
}
and then run:
results = halley1(ftnxsquare, 5, 1e-6, 100)
Let's see if we can simplify this process by resorting to calculus. Recall the definition
of the derivative:

f'(x) = lim_{eps -> 0} [f(x + eps) - f(x)] / eps

Using this idea, write a function FindDeriv of the following form:
FindDeriv<-function(ftn, x, eps = ...){
# ftn is a function of a single variable, which takes in x
# and returns f(x); FindDeriv returns f(x), f'(x) and f''(x)
return(c(fx,dfx,ddfx))
}
where eps is an optional parameter. The function should return a vector of ftn's
output at x, the first derivative at x, and second derivative at x.
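A sketch using central differences follows; the default eps = 1e-4 and the particular difference formulas are assumptions for illustration, not the assignment's intended choices:

```r
# Sketch of FindDeriv via central differences.
# eps = 1e-4 is an assumed default, not the one the assignment elides.
FindDeriv <- function(ftn, x, eps = 1e-4) {
  fx   <- ftn(x)
  # central difference for f'(x): (f(x+eps) - f(x-eps)) / (2*eps)
  dfx  <- (ftn(x + eps) - ftn(x - eps)) / (2 * eps)
  # central second difference for f''(x)
  ddfx <- (ftn(x + eps) - 2 * fx + ftn(x - eps)) / eps^2
  return(c(fx, dfx, ddfx))
}

# Check against f(x) = x^2, where f'(5) = 10 and f''(5) = 2
out <- FindDeriv(function(x) x^2, 5)
```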
(d) Write newtonraphson2 and halley2 where the respective derivatives are computed
numerically using FindDeriv.
(e) We now compare how these four methods do at root-finding.
For each question, report how many iterations each method takes to converge, the root
it converges to, and the true root, for all of the given starting values of x0.
You are encouraged to present the answers in a format which allows easy comparison.
(i) The square root of 26, with x0 = 1, 3, 6
(ii) 2x - 3x + 1, with x0 = -2, 0, 2
(iii) x^3 - 7x^2 + 14x - 8, with x0 = 1.1, 1.2, ..., 1.9
(f) Based on the answers you get in Part (e), which method would you prefer to use
in general cases of root-finding? Are there any methods you would prefer to use in
specialized cases? Explain your answers.