
Big Data Stats II Name:__________

STAT4650 Sample Final Exam Solution

Spring 2020

Instructions

1. Make sure to read the entire exam before you begin.

2. Open book and open note.

3. You are allowed to use a scientific calculator.

4. Exiting the exam under any circumstance is final, and you will NOT be allowed a second attempt.

5. If any two or more of you submit identical answers to essay questions, it will result in a score

of zero for that question.

6. If you copy the content from the textbook/handout/solution, it will result in a score of zero for

that question.

7. You have 120 minutes to complete the test.

8. Make sure you hit the ‘Submit’ button once you are done with your exam.

9. If you have any questions, simply join my online meeting room via Zoom and I will help you with your questions.

10. Don't panic.

Students in my class are required to adhere to the standards of conduct set by Clark University

and GSOM. Please sign the following Honor pledge that signifies your understanding of the

rules set by the code of conduct.

“I pledge my honor that I have not violated Clark University's Code of Conduct during this

examination.”

Please sign here to acknowledge_____________________________


1. This question is about trees and random forests.

(a) [5 pts] Sketch the tree corresponding to the partition of the predictor space illustrated in the

following Figure. The Ri inside the boxes indicate region i.

Sol:


(b) [5 pts] Create a partition, using the tree illustrated in the following figure. Drag items onto

the image.

Solution:

(c) [5 pts] What is a Bootstrap Aggregation of Decision Trees? Explain (2-3 sentences).

Sol:

We first generate B bootstrapped training datasets, construct a decision tree on each of the B datasets, and obtain a prediction from each tree. For regression, the bagged prediction is the average of the predictions from all B regression trees; in the case of a classification problem, we take a majority vote among all B trees.
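As a sketch of this procedure in code (the sine-curve data, the depth-1 "stump" learner, and B = 25 are all illustrative choices, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (illustrative).
X = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * X) + rng.normal(0, 0.1, 100)

def stump_predict(x_tr, y_tr, x_te):
    """Fit a depth-1 regression tree (a 'stump'): pick the split point
    minimizing squared error, then predict the mean of each side."""
    best = (np.inf, None)
    for s in np.quantile(x_tr, np.linspace(0.1, 0.9, 9)):
        left, right = y_tr[x_tr <= s], y_tr[x_tr > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, s)
    s = best[1]
    left, right = y_tr[x_tr <= s], y_tr[x_tr > s]
    return np.where(x_te <= s, left.mean(), right.mean())

# Bagging: B bootstrapped samples, one tree per sample, average the predictions.
B = 25
preds = []
for _ in range(B):
    idx = rng.integers(0, len(X), len(X))   # draw n points with replacement
    preds.append(stump_predict(X[idx], y[idx], X))
bagged = np.mean(preds, axis=0)             # regression: average over the B trees
```

For classification one would replace the final average with a majority vote across the B trees.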


(d) [5 pts] How does a Random Forest differ from a Bootstrap Aggregation of Decision Trees?

Explain (2-3 sentences).

Solution:

As in bagging, we build a number of decision trees on bootstrapped training samples, but each time a split in a tree is considered, a random sample of m predictors is chosen as split candidates from the full set of p predictors (usually m = √p).
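A minimal sketch of that one extra ingredient (p = 16 is chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 16                      # total number of predictors
m = int(np.sqrt(p))         # the usual random forest choice: m = sqrt(p)

# Bagged trees consider all p predictors at every split; a random forest
# draws a fresh subset of m candidate predictors for each split, which
# decorrelates the trees in the ensemble.
for split in range(3):
    candidates = rng.choice(p, size=m, replace=False)
    print(f"split {split}: candidate predictors {sorted(candidates)}")
```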


2. This question relates to Support Vector Machines and uses the data below.

(a) [5 pts] We are given n = 7 observations in p = 2 dimensions. The horizontal axis corresponds to X1 and the vertical axis corresponds to X2. For each observation, there is an associated class label. Sketch the observations in the coordinate grid.

(b) [5 pts] Provide β0, β1, β2 for the maximum margin separating hyperplane defined by β0 + β1X1 + β2X2 = 0.

(c) [5 pts] Indicate the support vectors for the maximal margin classifier (you may answer

this question by writing down the coordinates of the support vectors).

(d) [5 pts] Argue that a slight movement of the seventh observation would not affect the

maximal margin hyperplane.

Obs.  X1  X2  Y
 1     3   4  Red
 2     2   2  Red
 3     4   4  Red
 4     1   4  Red
 5     2   1  Blue
 6     4   3  Blue
 7     4   1  Blue

Solution: (a)


(b) β0 = 0.5, β1 = −1, β2 = 1, i.e. the hyperplane 0.5 − X1 + X2 = 0 (Note: any equation close to this one is Okay)

If a point falls above the given line, meaning that 0.5 − X1 + X2 > 0, we classify the point as “red”, while if the point falls below the given line, meaning that 0.5 − X1 + X2 < 0, we classify the point as “blue”.

(c) The support vectors are the four points that lie on the gray margin lines: (2, 1), (2, 2), (4, 3), (4, 4).

(d) The seventh point is located at (4, 1), which is far from the separating hyperplane and not close to any of the support vectors that determine it. As such, small movements in its location won't change the separating hyperplane.
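The answers to (b) and (c) can be checked numerically. The sketch below uses scikit-learn's SVC with a very large C to approximate the hard-margin (maximal margin) classifier; the +1/−1 label encoding is a coding convenience, not part of the question.

```python
import numpy as np
from sklearn.svm import SVC

# The seven observations (X1, X2) from the table, with Red = +1, Blue = -1.
X = np.array([[3, 4], [2, 2], [4, 4], [1, 4], [2, 1], [4, 3], [4, 1]])
y = np.array([1, 1, 1, 1, -1, -1, -1])

# A very large C approximates the hard-margin maximal margin classifier.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
# Up to a positive scaling, (b, w1, w2) is proportional to (0.5, -1, 1),
# i.e. the hyperplane 0.5 - X1 + X2 = 0 from part (b).
sv = {tuple(v) for v in clf.support_vectors_}   # the margin points of part (c)
```

Nudging the seventh point, (4, 1), and refitting leaves `w`, `b`, and `sv` unchanged, which is the content of part (d).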


3. (a) [5 pts] Considering the two methods “k-means clustering” and “k-nearest neighbors”,

which is a supervised learning algorithm and which is an unsupervised learning algorithm?

Solution:

“k-means clustering” is unsupervised learning, while “k-nearest neighbors” is supervised learning.
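The distinction is visible directly in code: k-means is fit on X alone, while k-NN must be fit on labeled data (the toy data below is illustrative).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated groups of points (illustrative data).
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])

# Unsupervised: k-means sees only X and invents cluster labels itself.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Supervised: k-NN requires the true labels y at training time.
knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
```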

(b) [5 pts] What quantity does PCA minimize when it generates each principal component?

Solution: the sum of the squared perpendicular distances from each point to the component

(c) [5 pts] What is the optimal number of principal components in the figure below?

Solution:

We can see in the figure that 30 components capture the highest variance with the lowest number of components.
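In practice this is read off a cumulative explained-variance curve. The sketch below uses synthetic low-rank data; the data shapes and the 99% threshold are illustrative assumptions, not taken from the exam figure.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 200 observations in 40 dimensions, but with underlying rank about 10.
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 40))

pca = PCA().fit(X)
cumvar = np.cumsum(pca.explained_variance_ratio_)

# Choose the smallest number of components reaching, say, 99% of the variance.
k = int(np.searchsorted(cumvar, 0.99) + 1)
```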


4. Suppose that we have four observations, for which we compute a dissimilarity matrix, given

by

For instance, the dissimilarity between the first and second observations is 0.3, and the

dissimilarity between the second and fourth observations is 0.8.

(a) [5 pts] On the basis of this dissimilarity matrix, sketch the dendrogram that results from

hierarchically clustering these four observations using complete linkage. Be sure to indicate on

the plot the height at which each fusion occurs, as well as the observations corresponding to each

leaf in the dendrogram. Drag items

(b) [5 pts] Repeat (a), this time using single linkage clustering.

(c) [5 pts] Suppose that we cut the dendrogram obtained in (a) such that two clusters result.

Which observations are in each cluster?

(d) [5 pts] Suppose that we cut the dendrogram obtained in (b) such that two clusters result.

Which observations are in each cluster?

(e) [5 pts] It is mentioned in the chapter that at each fusion in the dendrogram, the position of the

two clusters being fused can be swapped without changing the meaning of the dendrogram.

Draw a dendrogram that is equivalent to the dendrogram in (a), for which two or more of the

leaves are repositioned, but for which the meaning of the dendrogram is the same.

Solution:

(a)


(b)

(c)

(1,2), (3,4)

(d)

(1, 2, 3), (4)

(e)

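The steps in (a)–(d) can be reproduced with SciPy. The original dissimilarity matrix is not legible in this copy; apart from d(1,2) = 0.3 and d(2,4) = 0.8, which the question states, the entries below are illustrative stand-ins chosen to be consistent with the cluster answers above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Condensed dissimilarities in the order d12, d13, d14, d23, d24, d34.
# Only d12 = 0.3 and d24 = 0.8 come from the question text.
d = np.array([0.3, 0.4, 0.7, 0.5, 0.8, 0.45])

complete = linkage(d, method="complete")   # part (a)
single = linkage(d, method="single")       # part (b)

# Cut each dendrogram so that two clusters result: parts (c) and (d).
two_complete = fcluster(complete, t=2, criterion="maxclust")
two_single = fcluster(single, t=2, criterion="maxclust")
```

With these values, complete linkage cut at two clusters gives {1, 2} and {3, 4}, while single linkage gives {1, 2, 3} and {4}, matching (c) and (d).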

5. Time Series Forecasting

(a) [10 pts] What are the differences between autoregressive and moving average models?

Solution: Autoregressive models specify the current value of a series yt as a function of its previous p values and the current value of an error term, ut, while moving average models specify the current value of a series yt as a function of the current and previous q values of an error term, ut. AR and MA models have different characteristics in terms of the length of their “memories”, which has implications for the time it takes shocks to yt to die away, and for the shapes of their autocorrelation and partial autocorrelation functions. An autoregressive process has a geometrically decaying acf and a pacf with a number of non-zero points equal to the AR order. A moving average process has an acf with a number of non-zero points equal to the MA order and a geometrically decaying pacf.
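These acf/pacf signatures can be verified against the theoretical processes, e.g. with statsmodels (the AR(1) coefficient 0.7 and MA(1) coefficient 0.5 below are illustrative choices):

```python
from statsmodels.tsa.arima_process import ArmaProcess

# statsmodels uses lag-polynomial coefficients, so [1, -0.7] encodes
# y_t = 0.7 y_{t-1} + u_t, and [1, 0.5] encodes y_t = u_t + 0.5 u_{t-1}.
ar1 = ArmaProcess(ar=[1, -0.7], ma=[1])
ma1 = ArmaProcess(ar=[1], ma=[1, 0.5])

acf_ar = ar1.acf(lags=5)    # geometric decay: 1, 0.7, 0.49, 0.343, ...
acf_ma = ma1.acf(lags=5)    # non-zero at lag 1 only, then cuts off to 0
pacf_ar = ar1.pacf(lags=5)  # non-zero at lag 1 only, then cuts off to 0
```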

(b) [10 pts] A researcher wants to test the order of integration of some time series data. He decides to use the DF test. He estimates a regression of the form

∆yt = ψyt−1 + ut

and obtains the estimate ψ̂ = −0.023 with standard error SE(ψ̂) = 0.009. What are the null and alternative hypotheses for this test? Given the data, and a critical value of −2.86, perform the test. What is the conclusion from this test and what should be the next step?

Solution: The null hypothesis is of a unit root against a one sided stationary alternative, i.e. we

have

H0 : yt is non-stationary process

H1 : yt is stationary process

which is also equivalent to

H0 : ψ = 0

H1 : ψ < 0

The test statistic is given by ψ̂/SE(ψ̂), which equals −0.023 / 0.009 = −2.556. Since this is not more negative than the appropriate critical value, we do not reject the null hypothesis.

We therefore conclude that there is at least one unit root in the series (there could be 1, 2, 3 or more). What we would do now is to regress ∆²yt on ∆yt−1 and test if there is a further unit root. The null and alternative hypotheses would now be:

H0 : ∆yt ~ I(1), i.e. yt ~ I(2)

H1 : ∆yt ~ I(0), i.e. yt ~ I(1)

If we rejected the null hypothesis, we would therefore conclude that the first differences are

stationary, and hence the original series was I(1). If we did not reject at this stage, we would

conclude that yt must be at least I(2), and we would have to test again until we rejected.
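The arithmetic of the test above, as a few lines of Python:

```python
# DF test statistic from the solution above: psi_hat / SE(psi_hat).
psi_hat, se = -0.023, 0.009
t_stat = psi_hat / se               # about -2.556
critical = -2.86

# Reject the unit-root null only if the statistic is MORE negative
# than the critical value; here it is not, so we fail to reject.
reject = t_stat < critical
```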
