讲解EECS6127编程、辅导program程序设计、讲解R、Java编程辅导R语言程序|辅导R语言编程

EECS6127–Winter 2020–2021
Assignment 1
Due: 11:59:59 PM, February 6, 2021.
A full score on this assignment is 25 marks. However there are 28 marks achievable on
this assignment (up to 3 bonus marks). To gain full marks, explain your answers carefully
and prove your claims.
1. Bayes classifier and Bayes risk
Recall that we model the data generation as a probability distribution D over X ×
{0, 1}. D can be defined in term of its marginal distribution over X , which we will
denote by DX and the conditional labeling distribution, which is defined by the
regression function
µ(x) = P
(x,y)∼D
[y = 1 | x]
Let’s consider a 2-dimensional Euclidean domain, that is X = R
2
, and the
following process of data generation: The marginal distribution over X is uniform
over two square areas [1, 2] × [1, 2] ∪ [3, 4] × [1.5, 2.5]. Points in the first
square Q1 = [1, 2] × [1, 2] are labeled 0 (blue) and points in the second square
Q2 = [3, 4] × [1.5, 2.5] are labeled 1 (red), as in the illustration below.
(a) Describe the density function of DX, and the regression function, Bayes predictor
and Bayes risk of D.
(b) Consider the two distributions D1 and D2
that we obtain by “projecting” onto
each of the axes. Formally, we are marginalizing out one of the features to
obtain D1 and D2
. Both are distributions over R× {0, 1}. Describe the density
functions of D1
X
and D2
X
, and the regression functions, Bayes predictors and
Bayes risks of D1 and D2
.
1
(c) Consider the hypothesis classes Hinit and Hdecst from Question 2 on this assignment.
Determine the approximation errors of Hinit for D1 and D2 and the
approximation error of Hdecst for D.
(d) For the classes Hinit and Hdecst consider their closures under function complements
and again determine the approximation errors on D, D1 and D2
.
3 + 3 + 3 + 3 marks
2. VC-dimension
We define hypothesis classes Hinit of initial segments over domain X = R and Hdecst
of decisions stumps over domain X = R
2 as follows:
Hinit = {ha | a ∈ R}, where ha(x) = 1[x ≤ a] ,
and
Hdecst = {h
i
a
| a ∈ R, i ∈ {1, 2}}, where h
i
a
((x1, x2)) = 1[xi ≤ a] .
Further, for a hypothesis class H ∈ {0, 1}
X , we define the class of complements of
H, denoted by Hc
, as the class where we flip all the labels of the functions in H,
that is
H
c = {h
c
| h ∈ H} where h
c
(x) = |h(x) − 1|.
Finally , we let Hcc = H ∪ Hc denote the closure of H under complements.
(a) Determine the VC-dimensions of Hinit and Hdecst.
(b) Show that for every hypothesis class H, we have VC(H) = VC(Hc
).
(c) Determine the VC-dimensions of Hcc
init and Hcc
decst.
(d) Show that, for every k, there exists a hypothesis class H with VC-dimension
k and VC(Hcc) = 2k + 1.
Hint: consider domain X = N and classes
Hk = {h ∈ {0, 1}
X
| |h
−1
(1)| ≤ k}
That is Hk is the class of functions that map at most k natural numbers to 1
and the remaining ones to 0.
3 + 3 + 3 + 3 marks
3. Empirical and true risk
Recall Claim 2, from Lecture 2. We showed that, over an uncountable domain,
there exists a learner A (the “stubborn learner”) and a distribution D, such that
for every sample size m and all samples S from Dm:
|LS(A(S)) − LD(A(S))| = 1
This is not true for countable domains (as we will show later in the course). For
this exercise, we will show a similar, but weaker statement for countable domains.
Without loss of generality, we can assume X = N. Prove that, for every sample size
m, and every > 0, there exists a learner A and a distribution D over X × {0, 1},
such that for all samples S from Dm we have
|LS(A(S)) − LD(A(S))| ≥ 1 −