Introductory Statistics 161.120
Assignment 3
Due date: Sunday 22nd October 2017 Assessment value: 10%
• Your report should preferably be computer-produced, but there will be no penalty if it is neatly
handwritten and scanned.
• This assignment is marked out of 50 marks.
Part A: Time Series Analysis [7 marks]
Google Trends is a web facility, based on Google Search, which shows how often a particular search-term is
entered relative to the total search-volume. Below is a time-series plot produced by Google Trends for the word
‘Holidays’ showing data recorded each month between January 2004 and August 2017. The data are scaled
so the maximum observed number of occurrences is set to 100.
[Figure: Google Trends time-series plot for the word "Holidays" (maximum = 100)]
The original series was smoothed using exponential smoothing with smoothing constant of 0.2 as well as a
6-point moving mean. The table shows the last 12 values for the series along with some of the smoothed
values.
Month    Frequency  MovingMean  ES(0.2)
08/2016  42         39.25       40.61764
09/2016  39         39.25       40.29411
10/2016  37         40.58333    39.63529
11/2016  34         41.33333    38.50823
12/2016  43         41.16667    39.40659
01/2017  54         41.83333    42.32527
02/2017  40         A           41.86022
03/2017  39         43.5        41.28817
04/2017  45         42.41667    42.03054
05/2017  41         42          41.82443
06/2017  41                     41.65954
07/2017  43                     B
08/2017  46                     42.74211
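
The two smoothing rules behind this table can be sketched in a few lines of Python. This is an illustration only, not part of the assignment's required Minitab workflow. The function names are hypothetical. The centred form of the 6-point moving mean (half weight on the two end points) is an assumption inferred from the tabulated values rather than something stated above. The checks use values the table already gives.

```python
def exponential_smooth_step(previous_smoothed, observation, alpha):
    """One update of simple exponential smoothing:
    new smoothed value = alpha * observation + (1 - alpha) * previous smoothed value."""
    return alpha * observation + (1 - alpha) * previous_smoothed


def centred_moving_mean(series, index, k):
    """Centred k-point moving mean (even k) at position `index`.

    Assumption: the window spans k + 1 observations and the two end points
    get half weight, so the mean is centred on a month.
    """
    half = k // 2
    window = series[index - half:index + half + 1]
    total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
    return total / k


# Frequency column from the table, 08/2016 (index 0) to 08/2017 (index 12).
frequency = [42, 39, 37, 34, 43, 54, 40, 39, 45, 41, 41, 43, 46]

# Reproduce two entries the table already shows.
es_june = exponential_smooth_step(41.82443, 41, 0.2)  # table shows 41.65954
mm_march = centred_moving_mean(frequency, 7, 6)       # table shows 43.5

print(round(es_june, 5), mm_march)
```

The same two functions apply to any other row, provided enough neighbouring observations are available for the moving-mean window.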

2. Calculate the value of A, the 6-point moving mean value for February 2017, and the value of B, the
Exponential Smoothed value for July 2017. [2 marks]
3. Explain how an exponentially smoothed series with a smoothing constant of 0.5 would differ from
the exponentially smoothed series with smoothing constant of 0.2. [1 mark]

4. Explain why the exponentially smoothed series (regardless of smoothing constant) will not be useful
for predicting the number of times “Holidays” will be entered into Google in December 2017.
[1 mark]
5. A 6-point moving mean will not remove seasonal effects when smoothing this series. What moving
mean would remove seasonal effects? [1 mark]
Part B: Regression [15 Marks]
The file “CO2vsGDP.csv” contains data on the CO2 Emissions per capita (measured in tonnes) and the GDP
per capita in 2011 (in terms of purchasing power parity in 2013 dollars), for various different countries of
the world.
B1 (a) Use Minitab to carry out a linear regression analysis of CO2EmissionperCapita versus
PerCapitaGDP2011. [ 2 marks ]

Paste your Minitab output here
(b) Interpret the coefficients in the regression equation in context. What do they tell us about the
relationship between per capita CO2 and per capita GDP? [ 2 marks ]

(c) In context, explain what the R² value tells us. [ 1 mark ]

(d) Use Minitab to plot the residuals from the linear regression analysis against PerCapitaGDP2011.
[ 1 mark ]
Paste your Minitab output here
(e) Describe any problem(s) with the linear model suggested by this residual plot. [ 2 marks ]

B2. Create two new variables: lnCO2, containing the natural log of the CO2EmissionsperCapita; and lnGDP,
containing the natural log of the PerCapitaGDP2011.
(a) Use Minitab to carry out a linear regression to predict lnCO2 based on lnGDP. Attach the Minitab
output as well as a plot of the residuals versus lnGDP. [ 2 marks ]
(b) Explain whether this is a better model. [ 1 mark ]
(c) Calculate by hand a 95% confidence interval for the slope for this improved model. You MUST
show your working to get full marks. [ 2 marks ]
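
As a rough illustration of the arithmetic behind a slope confidence interval (slope ± t* × standard error of the slope), the sketch below fits a least-squares line and builds the interval. The function names are hypothetical. The five data points are made up for the example and are NOT taken from CO2vsGDP.csv. The multiplier t* is assumed to be looked up in t tables with n - 2 degrees of freedom and passed in by hand.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(residual_ss / (n - 2))  # residual standard deviation
    se_slope = s / math.sqrt(sxx)
    return intercept, slope, se_slope


def slope_interval(slope, se_slope, t_star):
    """Confidence interval for the slope: slope +/- t_star * se_slope."""
    margin = t_star * se_slope
    return slope - margin, slope + margin


# Made-up illustrative values, not the assignment data.
example_ln_gdp = [7.0, 8.0, 9.0, 10.0, 11.0]
example_ln_co2 = [-1.0, 0.1, 0.9, 2.1, 2.9]

intercept, slope, se_slope = fit_line(example_ln_gdp, example_ln_co2)
# t* = 3.182 is the two-sided 95% table value for n - 2 = 3 degrees of freedom.
low, high = slope_interval(slope, se_slope, 3.182)
print(slope, se_slope, low, high)
```

With real Minitab output, the slope and its standard error can be read from the coefficients table, so only `slope_interval` is needed.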


B3. The PerCapitaGDP2011 for New Zealand was $32807.64. For a country with this per capita GDP:
(a) Use the regression equation for the improved model (in part B2) to predict the lnCO2. (Note you
need to calculate log GDP).
(b) Convert your prediction back to the original units.
(c) The CO2EmissionsperCapita for New Zealand was 7.12405. Is this above or below what is
expected for a country of New Zealand’s per capita GDP? [2 marks]
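
The steps in B3 (take the log of GDP, predict on the log scale, then exponentiate) can be sketched as below. The function name is hypothetical, and the intercept and slope in the usage lines are placeholders rather than the fitted values from the Minitab output. Only the GDP figure comes from the question.

```python
import math


def predict_original_units(intercept, slope, gdp):
    """Predict on the log scale, then convert back with exp().

    Returns (predicted lnCO2, predicted CO2 in original units).
    """
    ln_gdp = math.log(gdp)  # natural log, matching lnGDP
    predicted_ln_co2 = intercept + slope * ln_gdp
    return predicted_ln_co2, math.exp(predicted_ln_co2)


# Placeholder coefficients for illustration only; substitute the fitted values.
placeholder_intercept = -7.82
placeholder_slope = 0.98
ln_prediction, co2_prediction = predict_original_units(
    placeholder_intercept, placeholder_slope, 32807.64
)
print(ln_prediction, co2_prediction)
```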
Part C: Hypothesis tests for 2 means [15 Marks]
Question C1.
The inventor of an exercise machine for elderly people offers free assessment and a period of free use of the
machine to those who take part in a research study. It is anticipated that use of the machine will increase the elderly
person’s mobility. Twelve elderly people take part. Their mobility was assessed on a percentage scale both before
and after the period of free machine use.
Before After Before After
49 50 39 49
53 59 47 41
45 58 48 52
41 53 47 61
42 44 54 54
36 34 51 59

(a) Is a two-sample t-test or a paired t-test the appropriate test to analyse this data? Explain.
[1 mark ]

(b) Is a one-sided or two-sided test appropriate for this situation? [1 mark ]
(c) State the null and alternative hypotheses in words and/or symbols for the appropriate test.
[ 2 marks ]
H0:

H1:


(d) Calculate by hand the value of the appropriate test statistic. You MUST show your working to get
full marks. [ 2 marks ]

(e) What are the degrees of freedom for the appropriate test, and what is the critical value for a
significance level of α = 0.05? [ 1 mark ]
(f) Is there evidence to reject the null hypothesis? Explain. [ 1 mark ]
(g) State your conclusion in context. [ 1 mark ]
(h) State any assumption(s) needed for this test to be valid. [1 mark ]
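
For readers who want to check a hand calculation, a paired t statistic is the mean of the within-person differences divided by its standard error. The sketch below is one way to compute it from the Before/After table. The function name is hypothetical, and the differences are taken as after minus before, which is a choice of sign convention rather than something the question fixes.

```python
import math

# Mobility scores from the table, read row by row (left pair, then right pair).
before = [49, 53, 45, 41, 42, 36, 39, 47, 48, 47, 54, 51]
after = [50, 59, 58, 53, 44, 34, 49, 41, 52, 61, 54, 59]


def paired_t_statistic(first, second):
    """Return (t statistic, degrees of freedom) for paired data.

    Differences are second minus first.
    """
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    mean_d = sum(differences) / n
    var_d = sum((d - mean_d) ** 2 for d in differences) / (n - 1)
    se = math.sqrt(var_d / n)  # standard error of the mean difference
    return mean_d / se, n - 1


t_stat, df = paired_t_statistic(before, after)
print(t_stat, df)
```

The resulting statistic is then compared with the tabulated critical value for the chosen significance level and degrees of freedom.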
C2. Suppose instead the mobility data were from two independent samples of older people (e.g. Sample 1
is the “before” sample and Sample 2 is the “after” sample).

(a) What are the degrees of freedom and the multiplier t* for the two-sample 95% confidence interval?
[ 1 mark ]
Degrees of freedom =

Multiplier t* =

(b) Use the two-sample method to calculate a 95% confidence interval for the difference in mean mobility
between the two samples. You MUST show your working to get full marks. You may use the
following summary statistics. [ 2 marks ]

          N   Mean   StDev
Sample 1  12  46.00  5.59
Sample 2  12  51.17  8.17

(c) State your conclusion about μ1 and μ2. Is it possible the difference could be zero? [ 1 mark ]

(d) Why do we get a different conclusion in (c) compared to the conclusion in question C1(g)? Give a reason for
your answer. [ 1 mark ]
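
A two-sample (unpooled) interval for a difference in means is (mean2 - mean1) ± t* × sqrt(s1²/n1 + s2²/n2). The sketch below applies this to the summary statistics given in C2(b). The function names are hypothetical. The degrees-of-freedom rule is an assumption: the text does not say whether the course uses the conservative min(n1 - 1, n2 - 1) rule or Welch's approximation, so both are shown, and t* is passed in from t tables.

```python
import math


def welch_df(sd1, n1, sd2, n2):
    """Welch's approximate degrees of freedom for unequal variances."""
    a = sd1 ** 2 / n1
    b = sd2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def two_sample_interval(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Unpooled confidence interval for mean2 - mean1."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    difference = mean2 - mean1
    return difference - t_star * se, difference + t_star * se


# Summary statistics from the table in C2(b).
conservative_df = min(12 - 1, 12 - 1)
approx_df = welch_df(5.59, 12, 8.17, 12)

# t* = 2.201 is the two-sided 95% table value for 11 degrees of freedom,
# i.e. the conservative rule; a different df rule needs a different t*.
low, high = two_sample_interval(46.00, 5.59, 12, 51.17, 8.17, 12, 2.201)
print(conservative_df, approx_df, low, high)
```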
Part D: Contingency table analysis [13 Marks]

D1. It is frequently commented that incomes differ between males and females. We will look to see if the
relationship is explained by Education level. The dataset IncomevsEducationSex.mtw contains
counts of the number of people in various categories according to the 2013 New Zealand Census.
(a) Use Minitab to produce an appropriate graph that would enable the reader to compare the
distribution of Income for Males and for Females. Give the graph an appropriate title. [ 1 mark ]
