Mid-term Project of 6289 (Due: Oct 29, 2019) Name:
• You need to submit your answers before 2:00 pm on Oct 29.
• You may talk with one another about the project, but the work you turn in should be
your own.
• Use a word processor that can handle mathematics (such as LaTeX or Word) and can
include graphics. Handwritten work is not accepted.
• Finalize your code (with file name: yourname-6289.*) and email me a copy
for verification.
1. (80) Duchenne Muscular Dystrophy (DMD) is a sex-linked genetic disease. Boys with
the disease usually die at a young age, while affected girls usually do not suffer symptoms
and may unknowingly carry the disease and pass it to their offspring. It is
desirable to have some kind of test to detect whether or not a woman is a carrier of
the disease. The dataset dystrophy.txt contains information from a 1981 study attempting
to develop such a test based on two serum enzymes, creatine kinase (CK) and
hemopexin (H) for 38 known DMD carriers (Case) and 82 women who are not carriers
(Control). (Note: In the last 30 years, advances in DNA sequencing technology have
made it possible to obtain definitive answers; however, tests based on the above proteins
are still used as rapid and inexpensive alternatives.)
(a) Use logistic regression to model the way in which case/control status depends
on creatine kinase and hemopexin. Construct (using the Wald approach) a table
containing the estimated odds ratios and p-values for the two enzymes. Provide
confidence intervals for the odds ratios, and give some thought as to what would
constitute a meaningful difference (δj) for the two enzymes when calculating the
odds ratios.
(b) Can you calculate confidence intervals for the odds ratios in part (a) using the
likelihood ratio approach? If so, calculate them. If not, explain why you can’t do
so.
(c) Can you carry out the hypothesis testing in part (a) using the likelihood ratio
approach? If so, perform the tests. If not, explain why you can’t do so.
(d) Describe (quantitatively) the relationship between creatine kinase levels and the
likelihood that a woman is a carrier without using the phrase “odds ratio” (you
can use “odds”, just not “odds ratio”).
(e) Suppose a woman randomly selected from the population has a hemopexin level
of 100 and a creatine kinase level of 150. Can you estimate the probability that
she is a carrier? If so, estimate it. If not, explain why you can’t do so.
(f) It is estimated that 1 in 3,300 women are carriers. Treating this as a known
constant, calculate the sampling ratio τ1/τ0.
(g) Based on your answer to (f), calculate the probability from part (e).
(h) Compare the following three numbers: (i) the probability you calculated in (g),
and (ii) the marginal probability of being a carrier (i.e., if you don’t know a
woman’s hemopexin/creatine kinase levels).
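The quantities named in parts (a), (f), and (g) can be sketched in a few helper functions. This is a minimal sketch, not a model answer: it assumes a coefficient estimate and its standard error are already available from a fitted logistic regression, that cases and controls were each sampled independently of the enzyme levels, and that the case-control fit is moved to the population scale by shifting the linear predictor by -log(τ1/τ0). All function names and the numeric inputs in the usage lines are hypothetical.

```python
import math


def wald_odds_ratio(beta, se, delta=1.0, z=1.959964):
    """Odds ratio and Wald interval for a change of `delta` units in one predictor.

    beta and se are the estimated coefficient and its standard error;
    z defaults to the normal quantile for a 95% interval.
    """
    log_or = delta * beta
    half_width = z * abs(delta) * se
    return (math.exp(log_or),
            math.exp(log_or - half_width),
            math.exp(log_or + half_width))


def wald_p_value(beta, se):
    """Two-sided p-value for H0: beta = 0 based on the Wald z statistic."""
    z_stat = beta / se
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


def sampling_ratio(n_cases, n_controls, prevalence):
    """tau1/tau0, assuming tau1 is proportional to n_cases / prevalence
    and tau0 is proportional to n_controls / (1 - prevalence)."""
    return (n_cases / n_controls) * (1.0 - prevalence) / prevalence


def population_probability(eta_sample, ratio):
    """Shift the case-control linear predictor by -log(tau1/tau0),
    then invert the logit to get a population-scale probability."""
    eta_pop = eta_sample - math.log(ratio)
    return 1.0 / (1.0 + math.exp(-eta_pop))


# Hypothetical inputs, for illustration only (not estimates from dystrophy.txt).
odds_ratio, lower, upper = wald_odds_ratio(beta=0.03, se=0.01, delta=10.0)
p_value = wald_p_value(beta=0.03, se=0.01)
```

Scaling by `delta` before exponentiating is what lets the odds ratio refer to a meaningful difference in enzyme level rather than a one-unit change.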
2. (20) Consider a binary response variable Y and logistic regression. We focus on the
group Lasso with loss function given by the negative log-likelihood as (see equation
(3.3) in the HDDA book as well)
Write the block coordinate gradient descent algorithm (Algorithm 3 in the HDDA
book) with explicit formulae (see equations (4.20) and (4.21) in the HDDA book,
where 0 < δ < 1, 0 < σ < 1, and Δ[m] is the improvement in the objective function
Qλ(·) when using a linear approximation for the objective function, i.e.,
