BST 701-001: Advanced Statistical Computing
Spring 2019
Take-Home Final Exam
Due Date: Wednesday, June 12, 2019 at 12:00 p.m. (NOON); upload on Learn
Instructions
The Take-Home Final is to be completed individually. If you have questions, please let me know.
I will only answer questions to clarify wording or possible misunderstandings (i.e., I will not
look at your code to help you debug it; therefore, do not send code in your emails).
Please submit your code with comments (each preceded by a # sign) and/or as part of an R Markdown file, and submit a separate Word or PDF document with code, plots, and answers,
appropriately labeled (e.g., Part 1, Question a). NOTE: Do NOT submit the output of help
files or of str. I should be able to run your code without having to manually comment out
your output. Also, properly label each Part of the Exam as well as each Question.
Problem 1 (10 points)
a) Simulate 500 datasets with n = 100 paired observations (x_i, y_i), such that
$$y_i = 1.5 + 0.6\,x_i + \varepsilon_i \tag{1}$$
where x_i is normally distributed with mean = 3 and SD = 1, and ε_i is normally distributed
with mean = 0 and SD = 0.8. Note that this is a simple linear regression model with β_0 = 1.5,
β_1 = 0.6, and population correlation ρ = 0.6. Store your simulated data in two matrices
called simx and simy (one row per sample).
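One way the data-generating step in part a) could be set up is sketched below. The seed value and the helper names n.sim and eps are arbitrary choices for illustration, not part of the exam; only simx and simy come from the problem statement.

```r
set.seed(701)                 # arbitrary seed, used only for reproducibility

n.sim <- 500                  # number of simulated datasets
n     <- 100                  # paired observations per dataset

# Each row holds one simulated sample of size n.
simx <- matrix(rnorm(n.sim * n, mean = 3, sd = 1),   nrow = n.sim, ncol = n)
eps  <- matrix(rnorm(n.sim * n, mean = 0, sd = 0.8), nrow = n.sim, ncol = n)

# Apply model (1) elementwise: y = 1.5 + 0.6 x + error
simy <- 1.5 + 0.6 * simx + eps

dim(simx)                     # rows = datasets, columns = observations
```

Because rnorm is vectorized, all 500 x 100 draws can be generated in one call and reshaped, which avoids an explicit loop over datasets.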
b) The goal is to assess the actual coverage probabilities (the probability of containing the true
value of 0.6) of 95% confidence intervals for ρ based on the sample Pearson correlation
coefficient r using Fisher's Z-transform method. The sample correlation coefficient r
does not have a normal sampling distribution, but the transformation
$$z' = 0.5\,[\log(1 + r) - \log(1 - r)],$$
where log is the natural logarithm, has an approximately normal distribution
with standard error $1/\sqrt{n - 3}$. Using the standard error and the appropriate critical
value (quantile) from the standard normal distribution, a 95% C.I. for z' can be created.
To obtain a C.I. for ρ, apply the inverse transformation
$$r = \frac{e^{2z'} - 1}{e^{2z'} + 1} \tag{2}$$
to the upper and lower limits of the C.I. Write a function f.confidence (using the code
from part (b) above) that takes the collection of samples and the desired confidence level
as inputs. The output should be the percentage of cases in which ρ = 0.6 lies within
the confidence interval. What would you expect this value to be in theory? Apply the
same method to create 99% and 90% confidence intervals and report the actual coverage
probabilities.
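The Fisher z-interval procedure in part b) can be sketched as follows. This is one possible layout under stated assumptions, not a definitive implementation: the argument names x, y, level and rho are illustrative, and the data are regenerated here (with an arbitrary seed) only so that the block runs on its own.

```r
# Sketch only: x and y are matrices with one simulated sample per row.
f.confidence <- function(x, y, level = 0.95, rho = 0.6) {
  n <- ncol(x)                                   # observations per sample
  # Sample Pearson correlation for each row
  r <- vapply(seq_len(nrow(x)),
              function(i) cor(x[i, ], y[i, ]), numeric(1))
  z    <- 0.5 * (log(1 + r) - log(1 - r))        # Fisher z-transform
  se   <- 1 / sqrt(n - 3)                        # approximate standard error
  crit <- qnorm(1 - (1 - level) / 2)             # two-sided normal quantile
  # Inverse transform (e^{2z} - 1) / (e^{2z} + 1), which equals tanh(z)
  lower <- tanh(z - crit * se)
  upper <- tanh(z + crit * se)
  100 * mean(lower <= rho & rho <= upper)        # percent of intervals covering rho
}

# Data generated as described in part (a); the seed is arbitrary
set.seed(701)
simx <- matrix(rnorm(500 * 100, mean = 3, sd = 1), nrow = 500)
simy <- 1.5 + 0.6 * simx + matrix(rnorm(500 * 100, mean = 0, sd = 0.8), nrow = 500)

coverage <- sapply(c(0.90, 0.95, 0.99),
                   function(lv) f.confidence(simx, simy, level = lv))
names(coverage) <- c("90%", "95%", "99%")
coverage
```

Since the intervals at different levels are nested around the same z value, the observed coverage can only increase as the confidence level increases.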
c) Next, using your first simulated sample, create a simple bootstrap confidence interval with
B = 10000 resamples. For each resample, compute the untransformed sample correlation
coefficients. Create a histogram for the empirical sampling distribution of r based on your
bootstrap estimates. Compute the upper and lower limits of a 95% confidence interval
for ρ using the bootstrapped values of r from your resamples.
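A minimal sketch of the bootstrap in part c) is given below. It assumes that "simple bootstrap confidence interval" means the percentile interval, which the exam does not state explicitly; the single sample is regenerated here with an arbitrary seed so the block is self-contained, and the names r.boot and ci are illustrative.

```r
set.seed(701)                          # arbitrary seed for reproducibility

# One sample of n = 100 pairs from model (1), standing in for the first simulated sample
n <- 100
x <- rnorm(n, mean = 3, sd = 1)
y <- 1.5 + 0.6 * x + rnorm(n, mean = 0, sd = 0.8)

B <- 10000                             # number of bootstrap resamples

# Resample (x, y) pairs with replacement and recompute r each time
r.boot <- replicate(B, {
  idx <- sample.int(n, size = n, replace = TRUE)
  cor(x[idx], y[idx])
})

# Empirical sampling distribution of r
hist(r.boot, breaks = 40, main = "Bootstrap distribution of r", xlab = "r")

# Percentile interval (assumed form of the 'simple' bootstrap C.I.)
ci <- quantile(r.boot, probs = c(0.025, 0.975))
ci
```

Pairs are resampled jointly through a shared index so that the dependence between x and y is preserved in every resample.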
Problem 2 (10 points)
For each simulation below you must:
- begin with an initial seed;
- comment on every line of the algorithm to describe each action;
- not make use of the predefined random number generators in R for each distribution
  described below (except where noted).
a) The Pareto(a, b) distribution has cdf
Develop an algorithm to simulate a random sample of size 1000 from the Pareto(4,2)
distribution. Write out the density of X, and create a histogram that displays this
density. NOTE: The only random number generator allowed for use in this problem is
runif.
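The inverse transform method for part a) might look like the sketch below. The cdf formula is missing from this copy of the exam, so the code assumes the common parameterization F(x) = 1 - (b/x)^a for x >= b, with density f(x) = a b^a / x^(a+1); if the intended cdf differs, the inversion step changes accordingly. The seed and variable names are illustrative.

```r
set.seed(701)                          # arbitrary initial seed

a <- 4                                 # shape (assumed role of the first parameter)
b <- 2                                 # scale / lower bound (assumed role of the second)
n <- 1000                              # sample size

u <- runif(n)                          # the only random number generator used

# Solve u = 1 - (b/x)^a for x (assumed cdf): x = b * (1 - u)^(-1/a)
pareto.sample <- b * (1 - u)^(-1 / a)

# Histogram on the density scale with the assumed density overlaid
hist(pareto.sample, freq = FALSE, breaks = 50,
     main = "Pareto(4, 2) sample", xlab = "x")
curve(a * b^a / x^(a + 1), from = b, to = max(pareto.sample), add = TRUE)
```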
b) A discrete random variable X has probability mass function:
x      0    1    2    3    4
p(x)   0.1  0.1  0.3  0.2  0.3
Develop an algorithm to generate a random sample of size 5000 from the distribution of X.
NOTE: The only random number generator allowed for use in this problem is runif. Do
the relative sample frequencies agree closely with the theoretical probability distribution?
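For the discrete case in part b), the inverse transform idea can be sketched as a lookup against the cumulative probabilities. The names vals, p, cdf and rel.freq are illustrative, and the seed is arbitrary.

```r
set.seed(701)                          # arbitrary initial seed

vals <- 0:4                            # support of X
p    <- c(0.1, 0.1, 0.3, 0.2, 0.3)     # probability mass function
cdf  <- cumsum(p)                      # cumulative probabilities
n    <- 5000                           # sample size

u <- runif(n)                          # the only random number generator used

# findInterval counts how many interior cdf cut points are <= u,
# so adding 1 gives the index of the first value whose cdf exceeds u
x <- vals[findInterval(u, cdf[-length(cdf)]) + 1]

# Compare empirical relative frequencies with the theoretical pmf
rel.freq <- as.numeric(table(factor(x, levels = vals))) / n
rbind(theoretical = p, empirical = rel.freq)
```

Dropping the last cumulative value (which equals 1 up to rounding) keeps every index in the range 1 to 5 even if floating-point error pushes the final sum slightly below 1.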
c) The Rayleigh density is
$$f(x) = \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)}, \quad x \ge 0,\ \sigma > 0.$$
Develop an algorithm to generate random samples of size
2000 from a Rayleigh(σ) distribution. Within your algorithm, consider various values
for σ using a for-loop. Display the relationship between each σ value considered and the
random samples drawn. NOTE: The only random number generator allowed for use in
this problem is runif.
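A possible sketch for part c) uses the inverse transform method: the Rayleigh cdf is F(x) = 1 - exp(-x^2 / (2σ^2)), so x = σ sqrt(-2 log(1 - u)), and since 1 - u is itself uniform, log(u) can be used directly. The σ values, the seed, and the choice of side-by-side boxplots as the display are illustrative assumptions, not requirements of the exam.

```r
set.seed(701)                          # arbitrary initial seed

sigmas  <- c(0.5, 1, 2, 4)             # illustrative sigma values
n       <- 2000                        # sample size for each sigma
samples <- vector("list", length(sigmas))

for (k in seq_along(sigmas)) {
  u <- runif(n)                        # the only random number generator used
  # Inverse cdf of the Rayleigh distribution applied to uniform draws
  samples[[k]] <- sigmas[k] * sqrt(-2 * log(u))
}
names(samples) <- paste0("sigma=", sigmas)

# The theoretical mean is sigma * sqrt(pi / 2); compare with the sample means
sample.means <- sapply(samples, mean)
rbind(theoretical = sigmas * sqrt(pi / 2), empirical = sample.means)

# One way to display how the samples scale with sigma
boxplot(samples, main = "Rayleigh samples by sigma", ylab = "x")
```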
d) Generate a random sample of size 1000 from the Beta(3,2) distribution by the
acceptance-rejection method. Create a histogram that displays the sample with the Beta(3,2) density
superimposed. NOTE: The only random number generator allowed for use in this problem
is runif.
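The acceptance-rejection step in part d) could be sketched as below, assuming a Uniform(0,1) proposal (a choice made here for illustration; the exam does not prescribe one). The Beta(3,2) density is f(x) = 12 x^2 (1 - x), which attains its maximum 16/9 at x = 2/3, so c = 16/9 bounds f(x)/g(x) when g is the uniform density.

```r
set.seed(701)                          # arbitrary initial seed

n       <- 1000                        # required number of accepted draws
f       <- function(x) 12 * x^2 * (1 - x)   # Beta(3,2) density
c.bound <- 16 / 9                      # max of f on (0, 1), attained at x = 2/3

beta.sample <- numeric(n)              # storage for accepted values
accepted    <- 0                       # number accepted so far
iterations  <- 0                       # total number of proposals

while (accepted < n) {
  y <- runif(1)                        # candidate from the Uniform(0,1) proposal
  u <- runif(1)                        # uniform draw for the accept/reject decision
  iterations <- iterations + 1
  if (u <= f(y) / c.bound) {           # accept with probability f(y) / (c * g(y))
    accepted <- accepted + 1
    beta.sample[accepted] <- y
  }
}

# Histogram on the density scale with the Beta(3,2) density superimposed
hist(beta.sample, freq = FALSE, breaks = 30,
     main = "Beta(3,2) by acceptance-rejection", xlab = "x")
curve(12 * x^2 * (1 - x), from = 0, to = 1, add = TRUE)
```

With this bound, the expected number of proposals per accepted draw is c = 16/9, so roughly 1800 iterations are needed on average for 1000 accepted values.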
Problem 3 (10 points)
For Example 9.2 from Suess and Trumbo (presented in Lecture 8), modify the prior distribution
for the mean height difference such that the variance of the prior distribution for μ is
assumed to be 25.
a) Note that the parameters of the prior distribution for θ were selected such that the
prior probability that the standard deviation of height differences is between 0.3 mm and
20 mm is approximately 95%. Choose new parameters for the prior on θ such that the
prior probability that the SD is between 0.3 mm and 50 mm is approximately 95%.
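Candidate prior parameters for part a) can be checked numerically. The sketch below assumes that θ is the variance and that its prior is inverse gamma, so that 1/θ follows a gamma distribution with the same shape and with the inverse gamma scale acting as the rate; this family, the function name prior.sd.prob, and the parameter names alpha and kappa are assumptions made for illustration, and the numeric values passed in are illustrative only.

```r
# Prior probability that the SD, sqrt(theta), lies in (lo, hi) when
# theta ~ InverseGamma(alpha, kappa), i.e. 1/theta ~ Gamma(shape = alpha, rate = kappa).
# The event lo < sqrt(theta) < hi is the same as 1/hi^2 < 1/theta < 1/lo^2.
prior.sd.prob <- function(alpha, kappa, lo, hi) {
  pgamma(1 / lo^2, shape = alpha, rate = kappa) -
    pgamma(1 / hi^2, shape = alpha, rate = kappa)
}

# Illustrative check of one (alpha, kappa) pair against the interval (0.3, 20)
p.check <- prior.sd.prob(alpha = 0.5, kappa = 0.2, lo = 0.3, hi = 20)
p.check                                # about 0.94 for these illustrative values
```

The same function can be evaluated over a grid of candidate (alpha, kappa) pairs with hi = 50 to look for a combination that gives a probability near 0.95.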
b) Assuming the sample mean and SD stay the same, evaluate the mean of the posterior
distribution for sample sizes 10, 20, 30, ..., 90, 100. Plot the value of the posterior mean
vs. the sample size. Plot the width of the 95% posterior interval for μ vs. the sample size.
c) Explain the results from part b) based on the relationships between the likelihood, prior
for μ, and posterior distribution for μ.
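As background for part c), the standard conjugate result for a normal mean with the variance θ held fixed may be a helpful reference point. The symbols μ_0 and θ_0 (prior mean and prior variance of μ) and x̄ (sample mean) are notation assumed here, not taken from the exam.

```latex
\mu \mid \bar{x}, \theta \sim N(\mu_n, \theta_n), \qquad
\frac{1}{\theta_n} = \frac{1}{\theta_0} + \frac{n}{\theta}, \qquad
\mu_n = \frac{\mu_0/\theta_0 + n\bar{x}/\theta}{1/\theta_0 + n/\theta}.
```

The posterior mean is a precision-weighted average of the prior mean and the sample mean, and the posterior precision grows linearly in n. When θ is itself sampled, as in a Gibbs sampler, this relationship holds conditionally on each draw of θ rather than for the marginal posterior of μ.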