
STAT 331 Case #2 Linear Regression

Each group will submit ONE Word copy on Blackboard by the deadline indicated on the syllabus (also on Blackboard), with the subject line: STAT331 HW#-group#. Please include a cover page listing all group members.
 
1. Predicting Wine Quality
Descriptions:
This dataset contains red vinho verde wine samples from the north of Portugal. The goal is to model wine quality (V12) based on physicochemical tests. Each row of the data set corresponds to one wine sample. Descriptions of the variables follow:
V1 fixed acidity
V2 volatile acidity
V3 citric acid
V4 residual sugar
V5 chlorides
V6 free sulfur dioxide
V7 total sulfur dioxide
V8 density
V9 pH
V10 sulphates
V11 alcohol
V12 quality (score from 1-10)
 
 
Randomly sample a training data set that contains 90% of the original data points; the remaining 10% form the testing data.
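A minimal sketch of the 90/10 random split, assuming the data have been loaded into a NumPy array with one row per wine sample. The helper name `train_test_split_indices`, the seeds, and the synthetic 200-row array standing in for the wine data are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def train_test_split_indices(n_rows, train_frac=0.9, seed=0):
    """Return (train_idx, test_idx) for a random split without replacement."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_rows)          # shuffled row indices 0..n_rows-1
    n_train = int(round(train_frac * n_rows))
    return perm[:n_train], perm[n_train:]

# Synthetic stand-in for the wine table: 200 rows, 12 columns (V1..V12).
data = np.random.default_rng(1).normal(size=(200, 12))

train_idx, test_idx = train_test_split_indices(len(data))
train, test = data[train_idx], data[test_idx]
print(train.shape, test.shape)
```

Fixing the seed keeps the split reproducible, so the same training rows can be reused for every later step.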
(i) Start with exploratory data analysis (descriptive statistics, scatter plots, histograms, boxplots).
(ii) Conduct linear regression on the training data. (To make the regression results easier to interpret, do not normalize the data.)
(iii) Conduct variable selection on the training data using stepwise selection. Find the best linear model. Show residual diagnostics. Briefly interpret the regression result (which variables are significant? Are they positively or negatively associated with wine quality?).
(iv) Test the out-of-sample performance. Using the final linear model built in (iii) on the training data, test it on the remaining 10% testing data. Report the out-of-sample MSE and related metrics.
(v) Cross validation. Use 10-fold and leave-one-out cross validation. Does (v) yield a similar answer to (iv)?
(vi) Now repeat steps (ii)-(v) on the normalized training data. Do you get similar results? What’s your conclusion? (Can we attain better prediction performance by normalization at the cost of interpretation?)
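The mechanics of steps (ii), (iv), (v) and (vi) can be sketched as follows. This is an illustration under stated assumptions, not a solution: it uses synthetic data with three predictors instead of the wine data, skips the stepwise selection of (iii), and relies on hypothetical helper names (`fit_ols`, `predict`, `mse`, `kfold_mse`). Normalization here means z-scoring the predictors with the training mean and standard deviation, which is only one possible reading of (vi).

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients; the intercept is stored in position 0."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def kfold_mse(X, y, k, seed=0):
    """Mean of per-fold held-out MSEs; k == len(y) gives leave-one-out."""
    idx = np.random.default_rng(seed).permutation(len(y))
    scores = []
    for fold in np.array_split(idx, k):
        mask = np.ones(len(y), dtype=bool)
        mask[fold] = False                  # hold out the current fold
        beta = fit_ols(X[mask], y[mask])
        scores.append(mse(y[fold], predict(beta, X[fold])))
    return float(np.mean(scores))

# Synthetic data (assumption): y depends linearly on two of three predictors.
rng = np.random.default_rng(42)
n = 300
X = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# 90/10 split, then fit on the training part only.
perm = rng.permutation(n)
tr, te = perm[:270], perm[270:]
X_tr, y_tr, X_te, y_te = X[tr], y[tr], X[te], y[te]

beta = fit_ols(X_tr, y_tr)
test_mse = mse(y_te, predict(beta, X_te))           # step (iv)
cv10_mse = kfold_mse(X_tr, y_tr, k=10)              # step (v), 10-fold
loo_mse = kfold_mse(X_tr, y_tr, k=len(y_tr))        # step (v), leave-one-out

# Step (vi): z-score predictors using training statistics, then refit.
mu, sd = X_tr.mean(axis=0), X_tr.std(axis=0)
beta_z = fit_ols((X_tr - mu) / sd, y_tr)
test_mse_z = mse(y_te, predict(beta_z, (X_te - mu) / sd))

print("test MSE:", test_mse)
print("10-fold CV MSE:", cv10_mse)
print("LOOCV MSE:", loo_mse)
print("test MSE, standardized predictors:", test_mse_z)
```

For plain least squares with an intercept, z-scoring the predictors changes the coefficients but not the fitted values, so the two test MSEs in this sketch agree up to rounding. Whether that carries over to the wine data after stepwise selection, or when the response is rescaled as well, is for the analysis itself to establish.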
 
Write a brief report including all labeled figures and tables. 
 
 
 