1. Data Inspection and Statistical Inference:
(a) How many variables and observations does the dataset contain? Which variables are dummy variables? Which variables are categorical (more than two categories)?
(b) Show price, the dependent variable, as a histogram. Describe the distribution.
(c) Determine the relationship between sqft and price, and between garage type and price.
(d) Show saledate as a histogram (because it is a time variable, declare how you want your bins divided: hist(saledate, "break"), where "break" is "days", "weeks", "months", or "years"). Is there any useful information here?
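As a starting point for items (a) through (d), the sketch below runs the inspection steps on simulated data. The data frame `houses`, the `pool` dummy, the garage categories, and all numeric values are assumptions made up for illustration; only price, sqft, garage type, and saledate are named in the assignment, and the real dataset may use different column names.

```r
# Hypothetical data standing in for the real dataset (all names and values assumed)
set.seed(1)
n <- 200
houses <- data.frame(
  sqft     = round(runif(n, 800, 3500)),
  garage   = factor(sample(c("none", "attached", "detached"), n, replace = TRUE)),
  pool     = rbinom(n, 1, 0.2),  # a 0/1 dummy variable
  saledate = as.Date("2015-01-01") + sample(0:729, n, replace = TRUE)
)
houses$price <- 50000 + 110 * houses$sqft +
  15000 * (houses$garage != "none") + rnorm(n, sd = 20000)

# (a) Number of observations and variables, and the type of each variable
dim(houses)
str(houses)

# Dummies: numeric columns that take only the values 0 and 1
is_dummy <- sapply(houses, function(x) is.numeric(x) && all(x %in% c(0, 1)))
# Categorical variables: factors with more than two levels
is_categorical <- sapply(houses, function(x) is.factor(x) && nlevels(x) > 2)

# (b) Histogram of the dependent variable
hist(houses$price, main = "Sale price", xlab = "price")

# (c) sqft vs. price (scatterplot, correlation) and garage type vs. price
plot(houses$sqft, houses$price, xlab = "sqft", ylab = "price")
sqft_cor <- cor(houses$sqft, houses$price)
boxplot(price ~ garage, data = houses)
garage_means <- tapply(houses$price, houses$garage, mean)

# (d) Histogram of sale dates, binned by month
date_hist <- hist(houses$saledate, breaks = "months", freq = TRUE)
```

Changing `breaks` to "weeks" or "years" shows whether any seasonal or year-to-year pattern in sales depends on the bin width.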
2. Model Selection and Output
(a) Construct your model using theory and statistical inference. What are your determinants of sale price? Are there determinants that are missing from the dataset? (You may need to create new variables.) Present your regressors in a table and explain why you include each one, including its expected effect on price.
(b) Are there any variables missing from the dataset? Which ones may cause omitted variable bias?
(c) Run your regressions and diagnostics to determine whether there are any OLS violations. Is heteroskedasticity present? Demonstrate how you came to your conclusion. If heteroskedasticity is present, use robust standard errors (> library(sandwich) > library(lmtest) > coeftest(regressionname, vcov = vcovHC(regressionname, type = "HC1"))). Are there any other clear patterns?
(d) What is your chosen model (or models)? Defend your model selection. Present the results in a table. What corrections did you make to your initial theorized model, and why did you select the final model(s)?
(e) Interpret all of your coefficients (mainly dummy variables, and any terms where you use a natural log transformation or polynomials).
(f) Provide an explanation for any surprising or counterintuitive coefficients.
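For item (c) above, one possible way to check for heteroskedasticity and then apply the robust standard errors is sketched below. The data are simulated with an error spread that grows with sqft, so heteroskedasticity is present by construction; the variable names and numbers are assumptions. The Breusch-Pagan test (`bptest` from the lmtest package) is one common choice of test, not one the assignment prescribes.

```r
library(sandwich)  # vcovHC()
library(lmtest)    # coeftest(), bptest()

# Simulated data (assumed): the error standard deviation is proportional to sqft
set.seed(42)
n <- 500
sqft <- runif(n, 800, 3500)
price <- 40000 + 120 * sqft + rnorm(n, sd = 15 * sqft)

fit <- lm(price ~ sqft)

# Visual check: a fan shape in residuals vs. fitted values suggests heteroskedasticity
plot(fitted(fit), resid(fit), xlab = "fitted", ylab = "residuals")

# Breusch-Pagan test: the null hypothesis is constant error variance
bp <- bptest(fit)

# Same coefficient estimates, conventional vs. HC1 robust standard errors
classical <- coeftest(fit)
robust <- coeftest(fit, vcov = vcovHC(fit, type = "HC1"))

print(bp)
print(classical)
print(robust)
```

A small Breusch-Pagan p-value rejects constant variance. The robust table leaves the point estimates unchanged and replaces only the standard errors, t-values, and p-values.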
3. Hypothesis Testing (at the αmax = 0.05 significance level) for your selected model.
(a) Which variables are statistically significant?
(b) Interpret the R² and the F-statistic.
(c) Test the null hypothesis that only the physical characteristics of the house matter (i.e., jointly test whether the coefficients on all variables you include, other than physical characteristics such as square footage and number of bathrooms, are equal to zero). Interpret the result.
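The joint test in (c) can be set up as a comparison of nested models. The sketch below uses simulated data in which `school` and `dist_cbd` are hypothetical non-physical regressors; these names, the `baths` column, and all coefficients are assumptions for illustration. It also pulls out the R² and overall F-statistic asked about in (b), and recomputes the joint F by hand from the residual sums of squares.

```r
# Simulated data (assumed names and values)
set.seed(7)
n <- 300
houses <- data.frame(
  sqft     = runif(n, 800, 3500),
  baths    = sample(1:4, n, replace = TRUE),
  school   = rnorm(n),          # hypothetical non-physical variable
  dist_cbd = runif(n, 1, 30)    # hypothetical non-physical variable
)
houses$price <- 30000 + 100 * houses$sqft + 8000 * houses$baths +
  12000 * houses$school - 1500 * houses$dist_cbd + rnorm(n, sd = 25000)

# Restricted model: physical characteristics only
restricted <- lm(price ~ sqft + baths, data = houses)
# Unrestricted model: adds the non-physical variables
unrestricted <- lm(price ~ sqft + baths + school + dist_cbd, data = houses)

# (b) R-squared and the overall F-statistic (all slopes jointly zero)
s <- summary(unrestricted)
r2 <- s$r.squared
overall_f <- s$fstatistic  # value, numerator df, denominator df

# (c) Joint F-test that the non-physical coefficients are all zero
joint <- anova(restricted, unrestricted)
joint_f <- joint$F[2]
joint_p <- joint$`Pr(>F)`[2]

# The same statistic from residual sums of squares, with q restrictions
rss_r <- sum(resid(restricted)^2)
rss_u <- sum(resid(unrestricted)^2)
q <- 2
manual_f <- ((rss_r - rss_u) / q) / (rss_u / df.residual(unrestricted))

print(joint)
```

If `joint_p` is below 0.05, the null that only physical characteristics matter is rejected at the 5% level. This F-test relies on the conventional (homoskedastic) variance estimate; if heteroskedasticity was found earlier, a Wald test with a robust covariance matrix, for example `lmtest::waldtest` with its `vcov` argument, is one alternative to consider.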