BTRY 4030 - Fall 2018 - Homework 4 Q4
Put Your Name and NetID Here
Due Friday, November 9, 2018
Instructions:
Create your homework solution file by editing the “hw4-2018_q4.Rmd” R Markdown file provided. Your
solution to this homework assignment should include the relevant R code and output (fit summaries, ANOVA
tables and computed statistics, as well as requested plots) in addition to written comments where requested.
Do not include output that is not relevant to the question. You should turn in a .pdf version of your compiled
code.
You may discuss the homework problems and computing issues with other students in the class. However, you
must write up your homework solution on your own. In particular, do not share your homework R Markdown
file with other students.
Let's apply contrasts to real-world data. The file NutritionStdy.csv contains data on 314 patients undergoing
elective surgery, collected to study the relationship between the log-concentration of beta-carotene
in the blood (BetaPlasma) and a number of personal characteristics and dietary factors. We will consider
the following five variables observed in this study as predictors in a multiple linear regression (MLR) model
with response log(BetaPlasma):
• Quetelet: the Quetelet index (Weight/Height^2)
• Vitamin: 1 = Regular, 2 = Occasionally, 3 = No
• NumSmoke: daily number of cigarettes smoked
• Fiber: grams of fiber consumed per day
• BetaDiet: dietary beta-carotene consumed per day
Here we will test some hypotheses about vitamin intake.
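For orientation, the model can be written out as follows. This is only a sketch: it assumes Vitamin enters as a factor under R's default treatment coding (Level 1 as the baseline), and the coefficient symbols are notation introduced here rather than taken from the course notes.

```latex
\log(\text{BetaPlasma}_i)
  = \beta_0 + \beta_Q\,\text{Quetelet}_i
  + \beta_{V2}\,\mathbf{1}\{\text{Vitamin}_i = 2\}
  + \beta_{V3}\,\mathbf{1}\{\text{Vitamin}_i = 3\}
  + \beta_S\,\text{NumSmoke}_i + \beta_F\,\text{Fiber}_i
  + \beta_D\,\text{BetaDiet}_i + \varepsilon_i
```

Under this coding, with the other covariates held fixed, the three Vitamin levels have intercepts $\beta_0$, $\beta_0 + \beta_{V2}$ and $\beta_0 + \beta_{V3}$, so hypotheses about the levels become linear combinations of $\beta_{V2}$ and $\beta_{V3}$.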
a) Read the data in, making sure to specify that Vitamin is a factor, and fit a linear model to predict
log(BetaPlasma) from the five predictors listed above. Extract the covariate matrix for this model using the
model.matrix command in R with the output of the lm command as an argument.
b) Write down a matrix of contrasts applied to this covariate matrix to test the hypotheses that i) Levels
1 and 2 of Vitamin are the same, and ii) Level 3 is the same as the average of Levels 1 and 2.
Produce this matrix in R; simply producing the R code with some explanation will suffice.
c) What are the estimated values of the two contrasts defined in the previous part? Verify these
values against the coefficients supplied by lm when you use the additional argument contrasts
= list(Vitamin = "contr.helmert"). Your answers should be 2 and 3 times the coefficients for
Vitamin1 and Vitamin2, respectively. (Bonus: explain why this is the case; you will need to look up
Helmert contrasts.)
d) Test these hypotheses using the formulae you derived in Question 3. Why does this not change the F
statistic that you get from the Anova function?
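The workflow in parts a) to d) can be sketched on simulated data. Everything in the block below is an illustration under stated assumptions, not a solution for the real file: the data frame `sim` is made up (only the column names match the study), the contrast matrix `C` is one possible reading of part b) under R's default treatment coding, the F statistic is the usual general linear hypothesis form (assumed here to be what Question 3 derives), and base R's `anova` on nested models stands in for the `Anova` function.

```r
# Simulated stand-in for the nutrition data: every value below is made up.
set.seed(4030)
n <- 90
sim <- data.frame(
  Quetelet   = rnorm(n, mean = 26, sd = 5),
  Vitamin    = factor(rep(1:3, length.out = n)),  # 1/2/3 codes stored as a factor
  NumSmoke   = rpois(n, lambda = 2),
  Fiber      = rnorm(n, mean = 13, sd = 4),
  BetaDiet   = rnorm(n, mean = 2000, sd = 500),
  BetaPlasma = exp(rnorm(n, mean = 5, sd = 0.7))
)

# (a) Fit with R's default treatment coding and extract the covariate matrix.
form <- log(BetaPlasma) ~ Quetelet + Vitamin + NumSmoke + Fiber + BetaDiet
fit  <- lm(form, data = sim)
X    <- model.matrix(fit)

# (b) One row per hypothesis, one column per coefficient (assumed form).
#     Row 1: Level 2 minus Level 1. Row 2: Level 3 minus the mean of Levels 1 and 2.
C <- matrix(0, nrow = 2, ncol = ncol(X),
            dimnames = list(c("L2 - L1", "L3 - mean(L1, L2)"), colnames(X)))
C[1, "Vitamin2"] <- 1
C[2, c("Vitamin2", "Vitamin3")] <- c(-0.5, 1)

# (c) Estimated contrasts, and the same model refit under Helmert coding.
est   <- drop(C %*% coef(fit))
fit_h <- lm(form, data = sim, contrasts = list(Vitamin = "contr.helmert"))
helmert_scaled <- c(2, 3) * coef(fit_h)[c("Vitamin1", "Vitamin2")]

# (d) Joint F statistic for H0: C beta = 0, next to a nested-model F test.
q        <- nrow(C)
V        <- C %*% solve(crossprod(X)) %*% t(C)
sigma2   <- sum(resid(fit)^2) / df.residual(fit)
F_stat   <- sum(est * solve(V, est)) / (q * sigma2)
F_nested <- anova(update(fit, . ~ . - Vitamin), fit)$F[2]

print(cbind(est, helmert_scaled))
print(c(F_stat = F_stat, F_nested = F_nested))
```

For the real data, the same steps would start from `read.csv("NutritionStdy.csv")` followed by converting Vitamin with `factor()`, assuming the file uses the column names listed above.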