AUT International Foundation Certificate
Delivered by ACG Norton College
FOUNDATION STATISTICS Term 4 2017 ASSIGNMENT 1
NAME:…………………………………………………………………. ID:………………………..
DUE: Thursday 26th October, 2017

• Assignments handed in late will gain 0%.
• Hand in a printed paper hard copy.
• Draw any graphs with a ruler.
Mark Scheme
Describing distributions, Normal Curves     70
Practical Test using R                      30
TOTAL                                      100

……………………%
Section A [33]
This section is to be completed using your scientific calculator for the required workings.
1. Consider the data set Corporate below, containing the body corporate fees ($) for 21
downtown 2-bedroom apartments.

6247, 10851, 7374, 6183, 8850, 8001, 8914, 8604, 7505, 6700, 8380, 6813, 8745,
7696, 6473, 8001, 6615, 6640, 6333, 8306, 8063.


a) Specify if the variable Corporate is Quantitative or Categorical. Explain your answer. (2)
HINT: Think about the variable!
b) When data is collected for the survey, give an example of one categorical variable that could be
collected from the same sample of apartments. (1)

c) Make a split stemplot of the Corporate data. Make the stem $1000’s and leaf $100’s.
Remember to include a Key. (4)
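
A hand-drawn split stemplot can be cross-checked with a short script. The sketch below is written in Python purely as an illustration; it is not the calculator working the section asks for. It assumes that leaves are found by truncating each fee to the $100s digit (rounding is another common convention) and that each stem is split into leaves 0-4 and 5-9. The function name `split_stemplot` is hypothetical.

```python
from collections import defaultdict

# Corporate data from Question 1 (body corporate fees in $).
corporate = [6247, 10851, 7374, 6183, 8850, 8001, 8914, 8604, 7505, 6700,
             8380, 6813, 8745, 7696, 6473, 8001, 6615, 6640, 6333, 8306, 8063]


def split_stemplot(values):
    """Return the rows of a split stemplot: stem = $1000s, leaf = $100s.

    Assumption: leaves are truncated, not rounded (6247 -> stem 6, leaf 2).
    Each stem appears twice: first for leaves 0-4, then for leaves 5-9.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v // 100, 10)
        half = 0 if leaf <= 4 else 1
        rows[(stem, half)].append(leaf)

    lines = []
    # Include every stem between the minimum and maximum, even empty ones,
    # so that gaps in the distribution stay visible.
    for stem in range(min(values) // 1000, max(values) // 1000 + 1):
        for half in (0, 1):
            leaves = "".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem:>2} | {leaves}")
    return lines


plot = split_stemplot(corporate)
print("\n".join(plot))
print("Key: 6 | 2 represents a fee of about $6200")
```

Keeping the empty stems is a deliberate choice here: the gap between the main body of the data and the largest value is part of what the shape description in part d) should mention.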

d) Describe the shape of this distribution. (2)

e) Calculate the following statistics for the raw data values: (10)

range =
median =
lower quartile, Q1 =
upper quartile, Q3 =
mean =
standard deviation =
20th percentile =
interquartile range =
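
Calculator results for these statistics can be cross-checked with Python's standard `statistics` module. This is only a sketch under stated assumptions: quartiles and percentiles depend on the convention used, and the `"exclusive"` method chosen below may not match the method taught in the course or R's default. The standard deviation is assumed to be the sample version (divisor n - 1).

```python
import statistics

# Corporate data from Question 1 (body corporate fees in $).
corporate = [6247, 10851, 7374, 6183, 8850, 8001, 8914, 8604, 7505, 6700,
             8380, 6813, 8745, 7696, 6473, 8001, 6615, 6640, 6333, 8306, 8063]

data = sorted(corporate)

data_range = data[-1] - data[0]      # maximum minus minimum
median = statistics.median(data)     # middle value of the 21 sorted fees
mean = statistics.mean(data)
sd = statistics.stdev(data)          # sample standard deviation (n - 1)

# Assumption: "exclusive" interpolation. Other conventions give slightly
# different quartiles for the same data.
q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# The 20th percentile is the first cut point when the data are split in fifths.
p20 = statistics.quantiles(data, n=5, method="exclusive")[0]

print(f"range = {data_range}")
print(f"median = {median}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"mean = {mean:.2f}, standard deviation = {sd:.2f}")
print(f"20th percentile = {p20}")
```

Swapping `method="exclusive"` for `method="inclusive"` shows how much the quartiles move between conventions, which is worth knowing before comparing a hand calculation with software output.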


f) Give the five-number summary of the raw Corporate data. (3)

Put your results in the table:


g) Draw a horizontal box and whisker plot of the raw Corporate data. (3)


Axis provided for the plot: 6   7   8   9   10

h) The Corporate value of $10851 is a suspected outlier, possibly due to a transcription error. If
the outlier were removed from the data set, explain the effect on the mean and on the median. (3)

i) It is known that in a large city, body corporate fees for 2-bedroom apartments are approximately
normally distributed with a mean of $7500 and a standard deviation of $670.

i) Draw an appropriately labelled normal density curve for the fees. You
may sketch this with a pencil. (3)

ii) Use the 68-95-99.7 rule to answer this question. (2)

What are the limits for the middle 95% of all fee values? [ ………… , ………… ]
$ 000’s
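
The 68-95-99.7 rule places roughly 68%, 95% and 99.7% of a normal distribution within 1, 2 and 3 standard deviations of the mean. A minimal Python sketch of that arithmetic, using the mean and standard deviation given in the question, is shown here; the helper name `empirical_interval` is hypothetical.

```python
# Parameters stated in the question: fees ~ Normal(mean 7500, sd 670), in $.
MU = 7500
SIGMA = 670


def empirical_interval(mu, sigma, k):
    """Return (mu - k*sigma, mu + k*sigma), the interval k sds either side."""
    return (mu - k * sigma, mu + k * sigma)


# k = 1, 2, 3 correspond to the 68%, 95% and 99.7% parts of the rule.
for k, share in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    low, high = empirical_interval(MU, SIGMA, k)
    print(f"middle {share} of fees: ${low} to ${high}")
```

The same six endpoints, together with the mean, are natural tick labels for the horizontal axis of the normal density curve.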

Section B [37]
This section is to be completed using R Commander for the required workings and should be typed
on the computer.
Cowles and Davis's Data on Volunteering
These data come from a study of the personality determinants of volunteering for psychological
research. The variables are:
Neuroticism: scale from Eysenck personality inventory
Extraversion: scale from Eysenck personality inventory
Sex: a factor with levels: female; male
Volunteer: volunteering, a factor with levels: no; yes

Load the data into R Commander using the following pathway. Open R and load Rcmdr, then choose:

Data → Data in packages → Read data set from an attached package… → Package “car” → Dataset “Cowles”


1. Consider the variable Volunteer.

a) Make a Pie chart. (1)
b) Using your chart comment on the main features of this variable. (2)

c) Name the quantitative variables in the data set. (2)
d) From your Pie chart estimate the percentage of respondents in the survey
that have a status of No. Show the working for your calculation. (2)
2. Consider the variable Neuroticism from the respondents.

a) Make a side-by-side boxplot to compare the 2 sexes. (2)
b) Using the boxplot, which sex is least likely to be called right skewed in shape? (1)
c) Which sex has the smallest range? (1)

d) How many males and how many females are in the survey? (2)
e) Compare the centre and spread of the distributions for the two gender groups of
respondents for the variable Neuroticism. Include numerical summary statistics
from R and summarise by sex. Use appropriate statistics in your written
comparison. (6)

3. a) Use the 1.5 × IQR rule to determine whether there are any outliers in the data for the
variable Extraversion. You do not need to summarise by either of the
categorical variables.
Include a numerical summary from R. Show all workings and your conclusion. (6)
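
The 1.5 × IQR rule flags any value below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. The Python sketch below illustrates the rule only; it is not the R Commander working this question requires. Because the Cowles data are not reproduced on this sheet, the rule is applied to the Corporate data from Section A instead. The function name `iqr_outliers` is hypothetical, and the `"exclusive"` quartile convention is an assumption that may differ from the quartiles R reports.

```python
import statistics


def iqr_outliers(values, method="exclusive"):
    """Apply the 1.5 x IQR rule.

    Returns (lower_fence, upper_fence, outliers), where outliers is the
    sorted list of values lying strictly outside the fences.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method=method)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return lower_fence, upper_fence, outliers


# Illustration on the Corporate data from Section A (not the Cowles data).
corporate = [6247, 10851, 7374, 6183, 8850, 8001, 8914, 8604, 7505, 6700,
             8380, 6813, 8745, 7696, 6473, 8001, 6615, 6640, 6333, 8306, 8063]

lower, upper, flagged = iqr_outliers(corporate)
print(f"fences: [{lower}, {upper}]")
print(f"values flagged as outliers: {flagged}")
```

For Extraversion, the same arithmetic applies once Q1 and Q3 are read from the numerical summary in R: compute the IQR, then both fences, then compare the minimum and maximum against them.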

b) Make a histogram of the variable Extraversion with 5 bins. (1)

c) i) Estimate the range of the data from your histogram. (1)
ii) What is the actual range for Extraversion? (1)

d) Using your histogram, how many respondents have a level of Extraversion less
than 10? Show your working. (2)

4. The data set, Bfox, contains data on Canadian women's labour-force participation for the period
1946–1975.

Load the data into R Commander using the following pathway:

Data → Data in packages → Read data set from an attached package… → Package “car” → Dataset “Bfox”

The variable tfr is the total fertility rate: expected births to a cohort of 1000 women at
current age-specific fertility rates.

In a written report using full sentences, describe the distribution of the variable tfr,
mentioning its shape, centre and spread.

You must produce at least one graph from R, together with a numerical summary and
appropriate statistics from R Commander to answer this question. Justify the use of the
statistics used. (7)
