
CS5487: Take-home quiz

2019 Semester A
Dec 12 to Dec 18
Rules:
1. This take-home quiz is an “open-book” quiz. You are permitted to use
the following materials during the quiz:
• Your lecture notes.
• Your cheatsheet from the Midterm Quiz.
• The textbook, Pattern Recognition and Machine Learning (PRML)
by Bishop.
• Any materials available on the CS5487 Canvas course page, including Problem Sets, Problem Set Solutions, Tutorial Solutions, and Panopto Recordings.
All other materials are NOT allowed. This includes web searches, research papers, other reference books, etc.
2. You cannot discuss the quiz with others, and the work that you turn in
must be your own work. You will follow the high standards of Academic
Honesty at CityU.
3. You have until Dec 18, 5pm to complete the quiz. Turn in your work on
Canvas.
Instructions:
1. Answer all questions on blank paper.
2. On the last page of your answer sheets, write the following statement:
“The work in these answer sheets is my own work. I have not discussed
this quiz with anyone else. I have only used the allowed materials.” Then
write your name, student number, date and put your signature.
3. Upload your answer sheets to Canvas.
Problem 1 Soft Adaptive-SVM [30 marks]
In this problem we will consider an adaptive-SVM (ASVM) for binary classification. Suppose we have used a dataset D0 to learn a binary linear classifier function f0(x) = w0ᵀx with decision rule y = sign(f0(x)). Since we have the classifier, we then discarded the data D0.
Now, suppose we receive a new set of data D = {(xi, yi)}, i = 1, …, n, where xi ∈ Rd are the feature vectors and yi ∈ {+1, −1} are the corresponding class labels. We wish to update our original classifier function f0(x). To do this, we will add a “delta classifier” ∆f(x) = wᵀx to adapt our original classifier f0(x) into a new classifier f(x),

f(x) = f0(x) + ∆f(x) = f0(x) + wᵀx,   (1)
where w is the parameter vector of the “delta classifier”. To handle cases when the data is not linearly separable, we introduce a slack variable ξi for each data point xi. The ASVM primal problem is

min_{w,ξ}  (1/2)‖w‖² + C Σi ξi
s.t.  yi(f0(xi) + wᵀxi) ≥ 1 − ξi, ∀i,
      ξi ≥ 0, ∀i.   (2)
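As a sanity check on the primal (this is not the dual derivation the questions ask for), note that (2) is equivalent to the unconstrained hinge-loss problem min_w (1/2)‖w‖² + C Σi max(0, 1 − yi(f0(xi) + wᵀxi)), which can be minimized by subgradient descent. A minimal numpy sketch; the toy data, step size, and iteration count are illustrative assumptions:

```python
import numpy as np

def asvm_subgradient(X, y, f0, C=1.0, iters=500, lr=0.01):
    """X: (n, d) inputs; y: (n,) labels in {+1, -1}; f0: (n,) old-classifier scores."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (f0 + X @ w)          # y_i (f0(x_i) + w^T x_i)
        active = margins < 1                # margin-violating points
        # subgradient of 1/2||w||^2 + C * sum_i hinge_i
        g = w - C * (y[active, None] * X[active]).sum(axis=0)
        w -= lr * g
    return w

# Toy usage: the old classifier w0 only looks at the first coordinate,
# while the true boundary also depends on the second.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])
w0 = np.array([1.0, 0.0])
w = asvm_subgradient(X, y, X @ w0)          # delta-classifier weights
```

The combined classifier is then f(x) = (w0 + w)ᵀx, i.e., the delta classifier only has to correct the old decision boundary rather than learn it from scratch.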
(a) [2 marks] Explain the role of the objective function and the constraints in the ASVM
primal problem.
(b) [5 marks] Write down the Lagrangian L(w, ξ, α, r) for the ASVM primal problem, where
α are the Lagrange multipliers for the first set of inequality constraints, and r are the
Lagrange multipliers for the second set of inequality constraints. Derive conditions for
the stationary point of L(w, ξ, α, r) w.r.t. w and ξ.
(c) [10 marks] Derive the ASVM dual problem.
(d) [3 marks] Use the KKT conditions to derive a geometric interpretation of the ASVM.
(e) [10 marks] Compare the ASVM dual in (c) with the original soft-SVM dual problem.
What is the interpretation of the ASVM dual (considering the original SVM dual)?
What is the role of the original classifier f0(x)?
. . . . . . . . .
Problem 2 Gaussian variance regression [50 marks]
Consider the regression problem where x ∈ Rd is the input vector and y ∈ R is the observation value. The training set is D = {X, y}, where X = [x1, · · · , xn] are the input vectors and y = [y1, · · · , yn]ᵀ are the output values.
In this problem, we will consider a Gaussian observation model with fixed mean µ = 0 and a variance σ² that changes as a function of x. That is, our goal is to regress the variance of the Gaussian using the inputs X and the corresponding observations y. The Gaussian observation likelihood with mean 0 is

p(y|σ²) = (1/√(2πσ²)) e^{−y²/(2σ²)}.   (3)
Since the variance should be non-negative, we define the mapping from x to the variance σ² as the exponential of a linear function

σ²(x) = e^{−wᵀx},   (4)

where w ∈ Rd is the parameter vector. Thus, the observation likelihood in terms of w and x is given by

p(y|w, x) = (1/√(2π e^{−wᵀx})) e^{−(1/2) e^{wᵀx} y²}.   (5)

We also assume a Gaussian prior on the weight vector w,

p(w) = N(w|0, Σ).   (6)
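The MAP objective in the questions below is the negative log-posterior of w. Assuming σ²(x) = e^{−wᵀx} as in (4), and dropping additive constants, −log p(y|w, X) − log p(w) = Σi [½ e^{wᵀxi} yi² − ½ wᵀxi] + ½ wᵀΣ⁻¹w. A minimal numpy sketch of this objective:

```python
import numpy as np

def neg_log_posterior(w, X, y, Sigma_inv):
    """Negative log-posterior (up to constants), assuming sigma^2(x) = exp(-w^T x).
    w: (d,) weights; X: (n, d) inputs; y: (n,) outputs; Sigma_inv: (d, d) prior precision."""
    a = X @ w                                   # a_i = w^T x_i (log-precision of y_i)
    data_term = 0.5 * np.sum(np.exp(a) * y**2 - a)
    prior_term = 0.5 * w @ Sigma_inv @ w
    return data_term + prior_term
```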
First we will consider the MAP estimate of the regression parameters w.
(a) [5 marks] Describe a real-world problem where this type of regression could be used.
(b) [5 marks] Write down the optimization problem for the MAP estimate of w.
(c) [10 marks] Derive the Newton-Raphson iterations to solve for the MAP estimate of w.
(d) [5 marks] Consider the case when the prior covariance matrix is Σ = λI. How does Σ
help to regularize the estimate of w?
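One possible shape of the Newton-Raphson iteration (this is a sketch under the assumption σ²(x) = e^{−wᵀx}, not necessarily the intended solution of (c)): with a = Xw, the gradient of the negative log-posterior is g = ½ Xᵀ(e^a ⊙ y² − 1) + Σ⁻¹w, the Hessian is H = ½ Xᵀdiag(e^a ⊙ y²)X + Σ⁻¹ (positive definite), and the update is w ← w − H⁻¹g. The synthetic data below is an illustrative assumption:

```python
import numpy as np

def map_newton(X, y, Sigma_inv, iters=20):
    """Newton-Raphson for the MAP estimate, assuming sigma^2(x) = exp(-w^T x)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        a = X @ w
        r = np.exp(a) * y**2                    # per-point curvature weights
        g = 0.5 * X.T @ (r - 1) + Sigma_inv @ w
        H = 0.5 * X.T @ (r[:, None] * X) + Sigma_inv
        w = w - np.linalg.solve(H, g)           # Newton step
    return w

# Usage on synthetic data drawn from the model itself.
rng = np.random.default_rng(1)
X_syn = rng.normal(size=(200, 3))
w_true = np.array([0.5, -0.3, 0.2])
y_syn = rng.normal(0.0, np.sqrt(np.exp(-X_syn @ w_true)))
w_hat = map_newton(X_syn, y_syn, np.eye(3))
```

Because the objective is convex in w (a sum of exponentials, linear terms, and a quadratic), the Hessian is positive definite and the Newton step is always well-defined.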
Now we will consider a non-linear version by kernelizing the regression model.
(e) [5 marks] Derive the kernel version of the regression model, i.e., let σ²∗ = e^{−α∗}, and apply the kernel trick to calculate α∗ = wᵀx∗.
(f) [10 marks] Derive the kernel version of the MAP estimation using the Newton-Raphson
iterations derived in (c).
(g) [5 marks] Discuss the role of the prior covariance Σ in the kernel regression model.
(h) [5 marks] Compare the original and kernelized algorithms in (c) and (f). What are the
advantages and disadvantages of each version?
 