Problem 1 - Comparing PM 10 values in Milan and Rome, Italy
Fine particulate matter with a diameter of 10 microns or less (known as PM 10) is an air
pollutant that has been associated with several health problems in humans, in particular
increased heart attacks and worsening cardiovascular disease. Researchers now know
that both short-term and long-term exposures to PM 10 are problematic. In many European
countries, observed PM 10 values over a certain threshold usually trigger
regulatory intervention (e.g., traffic limitations). For example, in the city of Milan, Italy,
the circulation of diesel cars was blocked for multiple days in October 2017 based on data
collected from October 18th to October 21st, because the PM 10 values were considered
over the regulatory threshold of 50 µg/m³. The block came into effect on October 24th.
The following dataset considers PM10 data from monitoring stations in two Italian cities:
Milan (two stations) and Rome (three stations).
rm(list = ls())  ## clear the workspace
## PM 10 data for two monitoring stations in Milan and three in Rome
PM10 <- read.csv(paste0("http://www.ics.uci.edu/~mguindan",
                        "/teaching/introBDA/data",
                        "/Stations_italy.csv"), as.is = TRUE)
head(PM10) ## first lines of the dataset
## day Rome1 Rome2 Rome3 Milan1 Milan2
## 1 10/1/17 41 39 38 30 34
## 2 10/2/17 39 33 32 48 52
## 3 10/3/17 34 39 30 63 68
## 4 10/4/17 37 34 32 81 82
## 5 10/7/17 10 11 7 22 21
## 6 10/8/17 14 14 12 23 23
(5 points) For now, consider only the two stations in Milan.
Write a Bayesian model (in formulas!) that the regulators in Milan can use to obtain
inference on the PM10 levels observed from the monitoring stations in the city.
The model should reflect the following information:
- The two stations in Milan are fairly close; hence, the levels in the two stations could
be considered reflecting similar weather/pollution conditions.
- Regulators are interested in capturing daily patterns in the PM10 levels across the
stations in Milan
- Regulators are also interested in capturing a monthly summary of PM10 levels for
October in Milan
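One way to write such a model down is sketched below. It is only a sketch under stated assumptions: the normal sampling distribution, the uniform priors on the standard deviations, and the hyperparameters m_0, s_0, A_σ and A_τ are illustrative choices that do not come from the assignment. Here θ_i plays the role of the daily level shared by the two stations, µ the monthly (October) summary, and σ² the variance of the sampling distribution.

```latex
% Sketch of a two-level normal model for Milan.
% ASSUMPTION: m_0, s_0, A_sigma, A_tau are hypothetical hyperparameters.
\begin{align*}
y_{ij} \mid \theta_i, \sigma^2 &\sim \mathrm{N}(\theta_i, \sigma^2),
  & i &= 1,\dots,n \text{ (days)},\; j = 1,2 \text{ (stations)} \\
\theta_i \mid \mu, \tau^2 &\sim \mathrm{N}(\mu, \tau^2) \\
\mu &\sim \mathrm{N}(m_0, s_0^2), \quad
\sigma \sim \mathrm{Unif}(0, A_\sigma), \quad
\tau \sim \mathrm{Unif}(0, A_\tau)
\end{align*}
```

Sharing θ_i across the two stations encodes the idea that nearby stations see the same daily conditions, while µ pools the daily levels into a single monthly quantity.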
(3 points) Compute the posterior mean of PM10 concentration levels in Milan for the
entire month of October, and the corresponding 95% credible interval. Comment on your
results.
11. (3 points) Plot the posterior density of PM10 concentration levels in Milan for the entire
month of October. Comment on the results.
12. (3 points) Compute the posterior mean of the variance of the sampling distribution that
characterizes the daily PM10 concentration levels in Milan. Comment on the results.
13. (5 points) Now consider also the data from the three stations in Rome. Rome is 356
miles from Milan, and the two cities are in separate geographic regions (Lombardy and
Lazio) within the same country (Italy). Write a Bayesian model (in formulas!) that
hierarchically combines the data from the two cities. You want to take into consideration
the following information (which may add to the information provided above):
- The three stations in Rome are fairly close; hence, the levels in the three stations
could be considered reflecting similar weather/pollution conditions.
- Regulators are interested in capturing daily patterns in the PM10 levels across the
stations in Rome (and Milan)
- Regulators are also interested in capturing a monthly summary of PM10 levels for
October in Rome (and Milan)
- Regulators are also interested in capturing regional effects in addition to country
level effects.
- Historically, the average PM 10 levels across all monitoring stations in the city
of Rome in October have been around 30 µg/m³ with a standard deviation of 10 µg/m³.
Justify your modeling choice.
14. (3 points) Compute the posterior mean for the PM10 concentration levels in Milan and
in Rome for the following dates: 10/22/2017 and 10/24/2017. Comment on your results.
15. (3 points) Compute the 95% posterior credible intervals for the PM10 concentration
levels in Milan and in Rome for the following dates: 10/22/2017 and 10/24/2017. Comment
on your results.
16. (3 points) Plot the posterior densities for the PM10 concentration levels in Milan and
in Rome for the following dates: 10/22/2017 and 10/24/2017. Comment on your results.
17. (3 points) Dr. Guindani was planning to visit Italy on 11/1/2017. Plot the posterior
predictive densities for the PM10 concentration levels in Milan and in Rome for that
(future) day. What is the probability that the PM10 concentration level in Milan is 15
µg/m³ more than in Rome (i.e., compute the probability that y
[Hint: If you have set up the model correctly, this is a bit different from what you
have done in the past. You may need the following result: if X|µ,
Problem 2 - Analysis of Bayesian Phase II Clinical Trials
Clinical trials are prospective studies that are conducted to evaluate the effect of interventions
in humans under prespecified conditions. They have become a standard and integral part of
modern medicine. A properly planned and executed clinical trial is the most definitive tool for
evaluating the effect and applicability of new treatment modalities. Clinical trials usually
comprise multiple phases. Phase I studies are meant to assess the safety (toxicity) profile of a
new drug, and usually they involve only a few patients. Phase II studies, instead, are meant
to provide an initial assessment of a drug's efficacy. They are usually relatively small studies,
conducted at a few institutions, with a limited number of patients being treated with the drug
(say, between 40 and 100 as a rule of thumb). Phase III studies are confirmatory trials, where
the investigators seek to replicate successful results obtained in the smaller Phase II trials
with a larger number of patients. Phase III studies are usually conducted nationwide. For
more details, see http://www.fda.gov/ForPatients/Approvals/Drugs/ucm405622.htm
Phase IIA trials
Phase IIA trials are single-arm studies, i.e. studies where the investigators are interested
in assessing the efficacy of a single new drug, in order to understand if it is effective. One
typical measure of efficacy is the response rate, since the primary endpoint of a Phase II trial
is response/no response to treatment (or success/failure).
We consider a study that was designed to test the efficacy of a new drug A.
The study enrolled 50 patients in a first stage. Suppose that 30 patients respond to the new
therapy.
(3 points) At the end of the first stage, the study design had planned for an interim
analysis, to determine whether there was some evidence of the efficacy of the combination
drug. In the design of clinical trials, stopping rules are often planned to discontinue a
study in case there is strong evidence that the new drug under study does not provide
any sensible improvement with respect to the existing standard of care (futility), or
instead that the improvement provided by the drug is so good that the drug can be
tested in a larger study (efficacy). According to the study design, this clinical trial was
supposed to stop for futility if, at the end of the first stage, the probability of a response
rate smaller than 0.5 was more than 0.9. Based on the data from the first stage, would
you stop the clinical trial at the interim analysis? Motivate your answer.
(3 points) According to the study design, this clinical trial was supposed to stop for
efficacy and "graduate" the drug to a larger Phase III trial if, at the end of the first stage,
the probability of a response rate greater than 0.7 (which was deemed an interesting
response rate according to the investigators) was more than 0.9. Based on the data
acquired in the first stage, would you recommend stopping the trial early for efficacy at
the end of the first stage? Motivate your answer.
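Both stopping rules reduce to tail probabilities of the posterior of the response rate. The sketch below assumes a uniform Beta(1, 1) prior, which the assignment does not specify, and uses the conjugate Beta-binomial update; the variable names are illustrative.

```r
# Interim analysis for the stage 1 data (30 responders out of 50 patients).
# ASSUMPTION: a uniform Beta(1, 1) prior on the response rate theta.
n1 <- 50
y1 <- 30
a_prior <- 1
b_prior <- 1

# Conjugate update: theta | y ~ Beta(a + y, b + n - y)
a_post <- a_prior + y1
b_post <- b_prior + n1 - y1

# Futility rule: stop if P(theta < 0.5 | data) > 0.9
p_futility <- pbeta(0.5, a_post, b_post)
# Efficacy rule: stop if P(theta > 0.7 | data) > 0.9
p_efficacy <- pbeta(0.7, a_post, b_post, lower.tail = FALSE)

stop_for_futility <- p_futility > 0.9
stop_for_efficacy <- p_efficacy > 0.9
print(c(p_futility = p_futility, p_efficacy = p_efficacy))
```

Under this assumed prior neither probability exceeds 0.9, so neither rule would trigger; a different prior could change the numbers and should be justified in the answer.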
7. (3 points) In a second stage, the trial accrues an additional 50 patients. 32 patients
responded to the combined therapy. What prior distribution would you use to analyze
such data?
8. (3 points) Write an rjags (or rstan) program to analyze all the combined data from the
first and second stage of the trial at once. The drug is proposed to the FDA for a further
larger Phase III study if there is enough evidence that the response rate is greater than or
equal to 0.51. What would you recommend to the FDA based on the data you have
collected? [Hint: enough evidence doesn't mean strong evidence, just that the threshold
is passed]
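A minimal rjags sketch for the pooled data (62 responders out of 100) might look as follows. The Beta(1, 1) prior, the number of chains and the iteration counts are assumptions made for illustration, and the MCMC part only runs if the rjags package (and JAGS itself) is installed. The last lines add a closed-form check that is available because the assumed prior is conjugate.

```r
# Pooled analysis of both stages (30/50 and 32/50).
# ASSUMPTION: Beta(1, 1) prior; chain and iteration counts are illustrative.
y_total <- 30 + 32
n_total <- 50 + 50

model_string <- "
model {
  y ~ dbin(theta, n)     # binomial likelihood for the pooled responders
  theta ~ dbeta(1, 1)    # assumed uniform prior on the response rate
}
"

if (requireNamespace("rjags", quietly = TRUE)) {
  jm <- rjags::jags.model(textConnection(model_string),
                          data = list(y = y_total, n = n_total),
                          n.chains = 3, quiet = TRUE)
  update(jm, n.iter = 1000)                        # burn-in
  draws <- rjags::coda.samples(jm, variable.names = "theta", n.iter = 5000)
  theta <- as.matrix(draws)[, "theta"]             # stack the chains
  p_above <- mean(theta >= 0.51)                   # P(theta >= 0.51 | data)
  print(p_above)
}

# Closed-form check under the same assumed prior: theta | y ~ Beta(63, 39)
p_exact <- pbeta(0.51, 1 + y_total, 1 + n_total - y_total, lower.tail = FALSE)
print(p_exact)
```

The Monte Carlo estimate and the closed-form value should agree up to simulation error; how large the probability must be to count as "enough evidence" is a judgment the answer still has to argue.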
Phase IIB trials
Phase IIB trials are multi-arm studies to compare the efficacy of new drugs, with the goal
of screening out those that are ineffective. The multiple arms could be different
treatments (possibly including a control arm, such as "standard of care"), different doses or
schedules of the same agent, or any combination of such comparisons. Typically, Phase
IIB trials are randomized, that is, patients are assigned to any of the arms following a
randomization scheme. Here, we do not consider the different randomization schemes.
Instead, we are interested in assessing how a drug compares with another, after the patients
have already been assigned to each arm, and the outcome is recorded.
We consider a multi-institution trial of two drug compounds: say, drug A and a
combination of drug A with another drug compound B. See Thall & Wathen (2007), Practical
Bayesian Adaptive Randomization in Clinical Trials, Eur. J. Cancer, for the description
of the precise Bayesian methodology used in this type of trial (which we only slightly
simplify here).
The investigators were interested in the probabilities of overall treatment success, θ_A
and θ_Combo, respectively, in each of the two arms of the study: arm "A" and arm "Combo
A+B". At the end of the trial, 80 patients have been assigned to arm A and 65 patients
to arm B (again, we do not bother here about the randomization scheme, but only look
at the sample size collected in each cohort). The primary end point (treatment success)
is reached by 42 patients in Arm A and 37 patients in Arm B.
(3 points) Propose a model to analyze these data (in formulas!). You want to make
sure that the treatment effects are defined specifically for each arm, but the probability
of success in arm "A" and arm "Combo" may be related due to the presence of some
elements of drug A in both arms. For the prior specification, you may want to consider
the following information: based on the current standard of care, the mode response
rate should be around 30%, and less than 0.6 with high probability (say, 90%)
for both treatments. In addition, we want to base our judgement on the trial data only,
so the information provided by the available prior data (prior sample size) could be
considered as a positive random variable, distributed as a logNormal(0,3).
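One possible formalization is sketched below, using a Beta prior parameterized by its mode ω and a concentration (prior sample size) κ. Several pieces are assumptions rather than part of the assignment: the shift κ − 2 (needed so that the mode is well defined), the reading of logNormal(0, 3) as log-mean 0 and log-scale 3, and the hyperparameters a_ω and b_ω, which would have to be chosen so that the prior on ω has mode 0.3 and puts about 90% of its mass below 0.6.

```latex
% Sketch of a hierarchical model for the two arms, k in {A, Combo}.
% ASSUMPTION: a_omega, b_omega and the shift kappa - 2 are illustrative choices.
\begin{align*}
y_k \mid \theta_k &\sim \mathrm{Binomial}(n_k, \theta_k),
  \qquad n_A = 80,\; n_{\mathrm{Combo}} = 65 \\
\theta_k \mid \omega, \kappa &\sim
  \mathrm{Beta}\bigl(\omega(\kappa - 2) + 1,\; (1 - \omega)(\kappa - 2) + 1\bigr) \\
\omega &\sim \mathrm{Beta}(a_\omega, b_\omega) \\
\kappa - 2 &\sim \mathrm{logNormal}(0, 3)
\end{align*}
```

With this parameterization the mode of the Beta prior on θ_k is (a − 1)/(a + b − 2) = ω, so the two arms share a common mode while keeping arm-specific success probabilities.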
Combo. Summarize the posterior distribution by using the posterior means and 90%
posterior credible intervals.
(3 points) Based on the data collected in the two arms, would you recommend the combo
drug (in the "Combo" arm) for a larger Phase III study? What would you base your
decision upon?
(3 points) It is often the case that the decision to recommend a trial for a larger Phase
III study is based on a simulation where one would try to assess how the trial would
perform in the larger future study. Suppose that a larger study would enroll 500 total
new patients, equally assigned to each of the two arms. Summarize the results of the
simulation of the larger study, by means of the mean and 95% credible intervals of the
posterior predictive distributions for both arms.
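The simulation of the future study can be sketched by drawing response rates from the posterior and then drawing the number of responders among 250 new patients per arm. To keep the sketch self-contained it assumes independent Beta(1, 1) priors for the two arms, which is a simplification of the hierarchical model requested earlier; with MCMC output, the two rbeta() calls would be replaced by the sampled values of the response rates.

```r
# Posterior predictive simulation for a future study with 250 patients per arm.
# ASSUMPTION: independent Beta(1, 1) priors per arm (not the hierarchical model).
set.seed(1)
n_draws  <- 10000
n_future <- 500 / 2

# Posterior draws of the response rates (42/80 in arm A, 37/65 in arm Combo)
theta_A     <- rbeta(n_draws, 1 + 42, 1 + 80 - 42)
theta_Combo <- rbeta(n_draws, 1 + 37, 1 + 65 - 37)

# Posterior predictive draws: future number of responders in each arm
yrep_A     <- rbinom(n_draws, n_future, theta_A)
yrep_Combo <- rbinom(n_draws, n_future, theta_Combo)

# Mean and central 95% interval of each predictive distribution
summarize <- function(y) c(mean = mean(y), quantile(y, c(0.025, 0.975)))
print(rbind(A = summarize(yrep_A), Combo = summarize(yrep_Combo)))
```

The predictive intervals are wider than the posterior intervals for the response rates scaled by 250, because they combine uncertainty about the rates with binomial sampling variability in the future patients.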