General instructions: The final must be completed as an R Markdown file. Be sure to include your name
in the file. Give the commands to answer each question in its own code block, which will also produce
plots that will be automatically embedded in the output file. Each answer must be supported by written
statements as well as any code used. The final exam is open book/internet access, but absolutely
no communicating with other humans. Any questions you have must be directed to me.
Part I - Flour beetle population
The table below provides counts of a flour beetle population at various points in time. Beetles in all stages of
development were counted, and the food supply was carefully controlled.
Days (t_i)             0   8   28   41   63   79    97  117   135   154
Beetles (N_obs(t_i))   2  47  192  256  768  896  1120  896  1184  1024
An elementary model for population growth is the logistic model given by
$$N(t) = \frac{2K}{2 + (K-2)\exp\{-rt\}}$$
where N(t) is the population size at time t, r is a growth parameter and K is a parameter that represents
the population carrying capacity of the environment. A popular method to estimate the parameters (K,r) is
to minimize the objective function
$$g(K,r) = \sum_{i=1}^{n} \left( N(t_i) - N_{\text{obs}}(t_i) \right)^2 = \sum_{i=1}^{n} \left( \frac{2K}{2 + (K-2)\exp\{-r t_i\}} - N_{\text{obs}}(t_i) \right)^2$$
with respect to K and r. Here n represents the sample size, and the t_i take the values 0, 8, 28, ... (see the table
above).
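The model and objective function above can be written directly in R. The following is a minimal sketch, not a prescribed solution; the names `days`, `beetles`, `logistic`, and `g` are illustrative choices, and the trial values K = 1000 and r = 0.1 are arbitrary.

```r
# Observed data, copied from the table above
days    <- c(0, 8, 28, 41, 63, 79, 97, 117, 135, 154)
beetles <- c(2, 47, 192, 256, 768, 896, 1120, 896, 1184, 1024)

# Logistic growth curve N(t); by construction N(0) = 2
logistic <- function(t, K, r) {
  2 * K / (2 + (K - 2) * exp(-r * t))
}

# Sum-of-squares objective g(K, r) for a single (K, r) pair
g <- function(K, r, t = days, n_obs = beetles) {
  sum((logistic(t, K, r) - n_obs)^2)
}

# Evaluate the objective at one arbitrary trial point
g_trial <- g(K = 1000, r = 0.1)
```

Because `g` takes scalar `K` and `r`, evaluating it over a grid needs an explicit loop or a vectorizing wrapper.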
1. Evaluate the function g(K,r) over an appropriately chosen two-dimensional grid. Produce a surface
plot using the function persp().
2. Based on the results of part (1), provide estimates $(\hat{K}, \hat{r})$ of the parameters (K,r) that minimize
the function g.
3. In many population modeling applications, an assumption of log-normality is adopted: the log(N_obs(t_i)) are
independent and normally distributed with mean log(N(t_i)) and variance $\sigma^2 = 1$. Design a Monte Carlo
approach to estimate the sampling distribution of the estimates found in (2). Implement your approach
and display histograms for the sampling distributions of $\hat{K}$ and $\hat{r}$.
Part II - Pine needles
In the article “Pine needles as sensors of atmospheric pollution”, the authors use neutron-activity analysis to
determine pollution levels, by measuring the Bromine concentration in pine needles. The investigators collect
18 pine needles from a plant near an oil-fired steam plant and 22 near a cleaner site. The data can be found
at [http://faculty.ucr.edu/~jflegal/206/pine_needles.txt].
4. Describe the data using plots and summary statistics. Show that the data are not normally distributed
by drawing an appropriate graphical display for each sample.
5. Take a log transformation of the values in each sample. Does it seem reasonable that the transformed
samples are each drawn from a normal distribution? Test this formally using an appropriate test (of
your choosing).
6. Now suppose that the authors of this study want to calculate an interval for the difference between the
median concentrations at the two sites, on the original measurement scale. Write code to calculate a
95% bootstrap interval for the difference in the medians between the two samples. Summarize your
conclusion in words.
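One common way to build such an interval is the percentile bootstrap, resampling each sample independently. The sketch below is only an illustration under that assumption. The helper `boot_median_diff` is a hypothetical name, and `x_demo` and `y_demo` are simulated placeholders with the stated sample sizes, not the pine needle data.

```r
# Percentile bootstrap interval for median(x) - median(y).
# Each sample is resampled with replacement, independently of the other.
boot_median_diff <- function(x, y, B = 10000, level = 0.95) {
  diffs <- replicate(B, {
    median(sample(x, replace = TRUE)) - median(sample(y, replace = TRUE))
  })
  alpha <- 1 - level
  quantile(diffs, c(alpha / 2, 1 - alpha / 2))
}

# Simulated placeholder samples (NOT the real measurements)
set.seed(1)
x_demo <- rlnorm(18, meanlog = 2.0, sdlog = 0.5)
y_demo <- rlnorm(22, meanlog = 1.5, sdlog = 0.5)

ci <- boot_median_diff(x_demo, y_demo, B = 2000)
```

Other bootstrap intervals (for example basic or BCa) are equally valid choices and may give somewhat different endpoints.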
Part III - Metropolis Hastings
Suppose we have observed data y1,y2,...,y200 sampled independently and identically distributed from the
mixture distribution
$$\delta N(7, 0.5^2) + (1-\delta) N(10, 0.5^2).$$
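Read as a density, the mixture is a weighted sum of two normal densities with common standard deviation 0.5. A minimal sketch in R follows; the function name `dmix` and its argument names are illustrative.

```r
# Density of delta * N(7, 0.5^2) + (1 - delta) * N(10, 0.5^2)
dmix <- function(y, delta, mu = c(7, 10), sd = 0.5) {
  delta * dnorm(y, mean = mu[1], sd = sd) +
    (1 - delta) * dnorm(y, mean = mu[2], sd = sd)
}

# Evaluate the density on a grid covering both components
y_grid <- seq(4, 13, length.out = 200)
dens   <- dmix(y_grid, delta = 0.7)
```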
7. Simulate 200 realizations from the mixture distribution above with δ = 0.7.
8. Draw a histogram of the data that also includes the true density. How close is the histogram to the
true density?
9. Construct kernel density estimates of your 200 realizations using the Gaussian and Epanechnikov kernels.
How do these compare to the true density?
10. Now assume δ is unknown with a Uniform(0,1) prior distribution for δ. Implement an independence
Metropolis Hastings sampler with a Uniform(0,1) proposal.
11. Implement a random walk Metropolis Hastings sampler where the proposal is $\delta^* = \delta^{(t)} + \epsilon$ with
$\epsilon \sim \text{Uniform}(-1, 1)$.
12. Comment on the performance of the independence and random walk Metropolis Hastings samplers
including at least one relevant plot.
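With a Uniform(0,1) prior and a Uniform(0,1) independence proposal, the prior and proposal densities cancel in the Metropolis Hastings ratio. The acceptance probability then reduces to a likelihood ratio. The sketch below illustrates that structure under stated assumptions: `log_lik` and `indep_mh` are hypothetical names, and the data are simulated here with δ = 0.7 and an arbitrary seed.

```r
set.seed(42)
n <- 200

# Simulate from the mixture: z = 1 selects the N(7, 0.5^2) component
z <- rbinom(n, size = 1, prob = 0.7)
y <- rnorm(n, mean = ifelse(z == 1, 7, 10), sd = 0.5)

# Log-likelihood of delta given the data
log_lik <- function(delta, y) {
  sum(log(delta * dnorm(y, 7, 0.5) + (1 - delta) * dnorm(y, 10, 0.5)))
}

# Independence sampler: proposals are drawn from Uniform(0, 1)
# regardless of the current state
indep_mh <- function(y, n_iter = 5000, init = 0.5) {
  chain      <- numeric(n_iter)
  current    <- init
  current_ll <- log_lik(current, y)
  for (t in seq_len(n_iter)) {
    proposal    <- runif(1)
    proposal_ll <- log_lik(proposal, y)
    # Flat prior and flat proposal cancel, leaving the likelihood ratio
    if (log(runif(1)) < proposal_ll - current_ll) {
      current    <- proposal
      current_ll <- proposal_ll
    }
    chain[t] <- current
  }
  chain
}

chain <- indep_mh(y)
```

Working on the log scale avoids numerical underflow in the product of 200 density values. A random walk variant would additionally need to reject (or otherwise handle) proposals that fall outside (0,1).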