首页 > > 详细

STAT4051 Assignment 5

 STAT4051 Assignment 5 

Show all work for full credit.
20 pts 
Cathedral Problem 
The cathedral dataset contains the length in feet, y, nave height, x, and style (Gothic and 
Romanesque) of 25 medieval English cathedrals. Assume that y is the response, x is the 
covariate and style is the treatment. 
library (faraway)
data(cathedral, package=”faraway”)
a. Plot the data and indicate the different cathedral styles on the plot.
b. Determine the final model to predict cathedral length in feet.
c. Did you include a covariate in your final model? If so, quantify its benefit.
d. How much does length change for a 10 unit increase in height?
e. How much does length change for the different styles of cathedrals? 
Oehlert Problem 17.1
You may find the following code helpful:
prob17.1<-
read.table(“http://www.stat.umn.edu/~gary/book/fcdae.data/pr17.1”,header=TRUE)
a. Analyze these data with respect to the effect of pesticide on calcium in bones.
b. Write your final model. Hint: Use dummy variables to represent the various 
pesticides.
c. Create a plot for your final model.
d. Estimate the average calcium concentration in bones for a diameter of 3.0 mm based
on your final model. 
e. Quantify the benefit, if any, of including a covariate in the model.
Glass Problem
Criminal investigators involve classifying the type of glass at a crime scene into 1 of 7 
categories based on chemical composition:
Type Description
1 building windows float processed
2 building windows non-float processed
3 vehicle windows float processed
4 vehicle windows non-float processed
5 containers
6 tableware
7 headlamps
1
STAT4051 Assignment 5 
 At the scene of the crime, the glass left can be used as evidence… if it is correctly identified.
You may find the following code helpful:
glass<-read.csv("https://archive.ics.uci.edu/ml/machine-learning￾databases/glass/glass.data")
attach(glass)
# Name the variables
colnames(glass) <- c("id","ri","Na","Mg","Al", "Si", "K",
 "Ca", "Ba", "Fe", "Type")
a. Examine the covariance matrix. What variable(s) dominate?
b. Perform principal components on the covariance matrix. What does the first 
eigenvector describe?
c. Examine the correlation matrix. What are the large correlations?
d. Perform principal components on the correlation matrix. Describe the first 
eigenvector.
e. What matrix (covariance or correlation) should principal components be performed
on? Justify your choice.
f. How many components should be retained. Justify your choice.
Turtle Problem
Download the turtles.csv dataset from Canvas. 
Sex is coded as Female=1 and Male =2.
a. Obtain the sample correlation matrix R for these data and discuss your findings. 
b. Show the first principal component is orthogonal to the second principal 
component.
c. Write the equation for the first principal component. 
d. Create a Scree plot and determine how many components should be retained.
e. Interpret the retained components.
f. Run the following code to examine the pairs of principal components. What plot 
separates the turtles sex the most? What does this plot tell you about turtles with 
respect to sex?
pairs(~PC1 + PC2 + PC3,data=your dataset name, col=your 
variable name for sex)
STAT4051 Assignment 5 
Womens Track Problem
Download and import the dataset on Womens Track Records from Canvas. Remember to 
set the header=TRUE to retain the original variable names. 
a. Obtain the sample correlation matrix R for these data and discuss your findings. 
b. Create a Scree plot and determine how many components should be retained.
c. Interpret the first two principal components.
d. Plot the first two principal components. Identify the top three winners. Hint: rank 
the nations based on the principal component scores for the first component.
Food for thought - 
http://homepage.divms.uiowa.edu/~dzimmer/sports-statistics/Naik-Khattree.pdf
 
联系我们
  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp
热点标签

联系我们 - QQ: 99515681 微信:codinghelp
程序辅导网!