MATH3/4/68091 Statistical Computing Coursework 1 (Oct 2018)
To be submitted by 8.00 pm on Sunday 28 October 2018
Please note that the deadline is strict: a University-set penalty of 10% of the total marks is
applied for each day late, up to a maximum of five days, after which your mark for the coursework will
be zero. This coursework is worth 40 marks altogether.
Your submitted solutions should all be in one document. This document can be prepared using either
Word or LaTeX. For each part of the three questions you should provide explanations as to how you
completed what is required, show your working and also comment on computational results, where
applicable.
When you include a plot be sure to give it a title and label the axes correctly.
When you have written or used R code to answer any of the parts, then you should list this R code after
the particular written answer to which it applies. This may be the R code for a function you have written
and/or code you have used to produce numerical results and plots.
If you use LaTeX to produce your report, you can include R code and output from R using the verbatim
environment, i.e. type
\begin{verbatim}
copy and paste lines of text from R here
\end{verbatim}
Your file should be submitted through the module site on Blackboard to the Turnitin assessment entitled
"SC Coursework 1" by the above time and date. The work will be marked anonymously on Blackboard so
please ensure that your filename is clear but that it does not contain your name and student id number.
Similarly, do not include your name and id number in the document itself.
Turnitin will generate a similarity report for your submitted document and indicate matches to other
sources, including billions of internet documents (both live and archived), a subscription repository of
periodicals, journals and publications, as well as submissions from other students. Please ensure that the
document you upload represents your own work and is written in your own words. The Turnitin report
will be available for you to see shortly after the due date.
This coursework should hopefully help to reinforce some of the methodology you have been studying, as
well as the skills in R you have been developing in the module so far.
1. A sequence of n pseudo-randomized observations on the interval [0, 1] can be obtained using the
linear congruential generator, which works as follows:
$$y_i = (a y_{i-1} + c) \pmod{m}, \qquad u_i = y_i m^{-1}, \qquad i = 1, \ldots, n,$$
for suitable choices of the modulus m, the multiplier a, the increment c and the seed $0 \le y_0$
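As a rough illustration of the recurrence above, a minimal R sketch might look as follows. The function name lcg and the small default values m = 16, a = 5, c = 3 are assumptions chosen only so that the output is easy to check by hand; they are not recommended settings.

```r
# Minimal linear congruential generator sketch.
# The name `lcg` and the defaults for m, a and c are illustrative assumptions.
lcg <- function(n, y0, m = 16, a = 5, c = 3) {
  y <- numeric(n)
  prev <- y0
  for (i in seq_len(n)) {
    prev <- (a * prev + c) %% m   # y_i = (a * y_{i-1} + c) mod m
    y[i] <- prev
  }
  y / m                           # u_i = y_i / m
}

u <- lcg(3, y0 = 7)
print(u)
```

With these toy values the integer sequence is 6, 1, 8, so u is 0.375, 0.0625, 0.5. A modulus this small gives a very short period, which is why the choice of m, a and c matters in practice.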
$$f(x) = \begin{cases}
8x, & 0 \le x < 0.25, \\
\tfrac{8}{3}(1 - x), & 0.25 \le x \le 1, \\
0, & \text{otherwise.}
\end{cases}$$
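A quick sanity check on the density above can be sketched in R. This assumes the definition reads as f(x) = 8x on [0, 0.25), f(x) = (8/3)(1 - x) on [0.25, 1] and zero elsewhere; the function name f_tri is hypothetical.

```r
# Piecewise density as read from the definition above (an assumed reading).
# `f_tri` is a hypothetical name used only for this check.
f_tri <- function(x) {
  ifelse(x >= 0 & x < 0.25, 8 * x,
         ifelse(x >= 0.25 & x <= 1, (8 / 3) * (1 - x), 0))
}

# Integrate each linear piece separately so the kink at 0.25 causes no trouble.
total <- integrate(f_tri, 0, 0.25)$value + integrate(f_tri, 0.25, 1)$value
print(total)
```

The two pieces contribute 0.25 and 0.75 respectively, so the total should be 1 up to rounding, and both branches agree at x = 0.25, where the density equals 2.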
(i) Find the cdf F(x) of X.
[2 marks for derivation]
(ii) Explain how inverting the cdf here can be used to obtain a random sample of size n from X.
Show algebraically how your procedure will be implemented.
[3 marks - 1 for explanation + 2 for algebra]
(iii) Write a function in R for obtaining a random sample of size n from X using the methodology
you have described in part (ii) above.
[4 marks for code]
(iv) Use your function to obtain a random sample of size n = 10000 from X. Construct a histogram
for your sample data and superimpose the true pdf f(x) given above onto the histogram plot.
Comment subjectively on the goodness-of-fit.
[3 marks - 2 for histogram + 1 for comment]
[12 marks for Q2]
3. Suppose that the random variable X has a probability density function (pdf) of the form
$$f(x) \propto \begin{cases}
x^2 (5 - x) \sin^2(2x), & 0 \le x \le 5, \\
0, & \text{otherwise.}
\end{cases}$$
[Note that $\int_0^5 x^2 (5 - x) \sin^2(2x)\,dx = 26.22476$.]
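The normalising constant quoted in the note can be checked numerically. The sketch below assumes the note reads as the integral of x^2 (5 - x) sin^2(2x) over [0, 5]; the name g for the unnormalised density is a hypothetical choice.

```r
# Unnormalised density on [0, 5]; `g` is a hypothetical name.
g <- function(x) x^2 * (5 - x) * sin(2 * x)^2

# Numerical quadrature over the support.
const <- integrate(g, 0, 5)$value
print(const)
```

The result should agree with the quoted 26.22476 to several decimal places, so that g(x) / const is a proper density on [0, 5].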
It is proposed that samples from X are obtained using rejection sampling as follows:
- Draw $u \sim U(0, 5)$ and $v \sim U(0, A)$.
- If $v \le u^2 (5 - u) \sin^2(2u)$, accept u as an observation from X. Otherwise reject u.
- Repeat until a sample of the desired size is obtained.
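The three steps above can be sketched in R roughly as follows. This is one possible layout under stated assumptions, not a model answer: the names g and rejection_sample are hypothetical, A must be supplied by the caller, and the grid-based value of A in the usage lines (with an arbitrary 1% safety margin) is only a numerical guess at an upper bound.

```r
# Unnormalised target density on [0, 5]; `g` is a hypothetical name.
g <- function(x) x^2 * (5 - x) * sin(2 * x)^2

# Rejection sampler following the steps above. `A` must bound g on [0, 5].
rejection_sample <- function(n, A) {
  x <- numeric(n)
  accepted <- 0
  trials <- 0
  while (accepted < n) {
    u <- runif(1, 0, 5)        # candidate value
    v <- runif(1, 0, A)        # uniform height under the envelope
    trials <- trials + 1
    if (v <= g(u)) {           # accept when the point lies under g
      accepted <- accepted + 1
      x[accepted] <- u
    }
  }
  # Estimated efficiency = accepted draws / total proposals.
  list(sample = x, efficiency = n / trials)
}

# Usage: A taken from a fine grid plus a small margin (an assumption).
grid <- seq(0, 5, by = 0.001)
A <- 1.01 * max(g(grid))
set.seed(1)
out <- rejection_sample(1000, A)
print(out$efficiency)
```

Drawing one candidate at a time mirrors the description directly; a vectorised version that proposes candidates in batches would be faster for large n, at the cost of slightly more bookkeeping.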
(i) Determine a suitable value for A. Note that, if you try to use the standard calculus method
for this, then it is very difficult to construct and solve the equation $\frac{d}{dx}\left(x^2 (5 - x) \sin^2(2x)\right) = 0$.
You might therefore choose an alternative approach, which could be graphical or otherwise,
to finding a suitable A.
[2 marks for a good suggestion (with reasoning) for the value of A]
(ii) Write a function in R to simulate from X using the above rejection sampling procedure. Your
function should also calculate an estimate of the efficiency of your algorithm.
[5 marks for code]
(iii) Run your R function to obtain a sample of size n = 10000 from X. Use the results to plot
an estimate of the density of X using plot(density()) with default settings. You should
also include a plot of the true pdf, f(x), given in the information above. Comment on the
goodness-of-fit of the density estimate to the true f.
[5 marks - 1 for correctly running code + 3 for plot + 1 for comments]
(iv) Calculate the true efficiency of your algorithm analytically and then compare your estimate,
previously calculated by running your function, with its value.
[2 marks - 1 for calculation + 1 for comment]
[14 marks for Q3]