首页 >
> 详细

Pennsylvania State University

College of Information Sciences and Technology

DS 310 Midterm Exam

Instructions

1. There are 4 problems each of which is worth 25 points.

2. Please consult the instructor if you have difficulty understanding any of the problems.

1. (25 pts.) Consider a M-class pattern classification problem in which each pattern

X to be classified belongs to exactly one of M mutually exclusive classes ω1 ··· ωM.

Suppose that X is represented using a vector of N binary features X = [x1, x2 ··· xN ].

Let pji = P(xi = 1|ωj ). Assume that the features are independent given the class

label. Let P(ω1)··· P(ωM) be the prior probabilities of each class.

(a) Describe in words, the meaning of pji.

(b) Show that a classifier that assigns X to class ωk gk(X) ≥ gj (X)∀j ̸= k. where

gi(X) can be written in the form !N

i=1 wjixi + wj0 is a minimum error (Bayes

optimal) classifier. Express wji, in terms of pji and P(ωj ) where (1 ≤ j ≤ M;

0 ≤ i ≤ N).

(c) Generalize the above solution for (b) above to obtain minimum loss classification

where λij is the loss incurred in assigning a pattern to class ωi when it in fact

belongs to class ωj .

2. (25 pts.) Suppose you have been hired by an AI consulting firm. Indicate which

algorithm you would choose in each of the following data-driven knowledge acquisition

scenarios. In each case, briefly justify your recommendation.

(a) Your client, has a database of patient records containing symptoms and expert

diagnosis. She would like to build a diagnosis system. The attributes can be

numeric (e.g., patient’s temperature), as well as categorical (e.g., whether the

patient is pregnant). In addition to performing accurate diagnosis of patients, your

client would like to use the system to obtain insight regarding the relationships

between attributes for different diseases.

(b) Your client is designing an email organizer to be integrated into an email program.

The SPAM detector is to be trained using information that is obtained

by observing the user’s actions on each email (read and respond, mark as junk,

ignore).

(c) Your client is a large company which has managed to gather a large database of

information about past applicants to jobs in various categories (e.g., software engineer,

data scientist, etc.) and the hiring decisions on each applicant. Experts in

the organization are convinced that they can automate the process of shortlisting

applicants for further consideration. You are told that it is important that the

Calculateprobability ofXi 18咒j

pcxniilwikRF.hu

Bayesdecisionhe

decision-making process be transparent – that is, it should be easy to understand

why an applicant was shortlisted (or not).

(d) Your client is a large hospital that is interested in reducing the workload on

radiologists working on breast cancer diagnosis from radiological images. The

hospital has a large database of radiological images labeled with expert diagnosis.

The goal here is to automate diagnosis on the cases where the AI system can

produce high confidence results, and identify a subset of cases that the AI system

is not so confident about for further examination by expert radiologists.

3. (25 pts.) Recall that the perceptron learning algorithm that was described in class

is an additive weight update algorithm - that is, we add or subtract a fraction of

the misclassified sample to the weight vector at each iteration. Consider instead,

a multiplicative weight update algorithm for an n-input neuron defined as follows:

Consider a neuron defined by two weight vectors w+ and w−. Suppose both weight

vectors are initialized with a value 1 for each of their components. Consider a training

example (xp, dp) where xp ∈ {−1, 1}n is an input pattern and dp ∈ {−1, 1} is its

class label. Let yp, the output of the classifier be 1 if w+ · xp > w− · xp and yp =

−1 otherwise. Suppose the weights are updated as follows: w+

i ← w+

i β−(dp−yp)xip

and w−

i ← w−

i β(dp−yp)xip where 0 < β < 1 is a learning rate. Comment on the

potential advantages of such a multiplicative weight update algorithm over its additive

counterpart. Prove that this algorithm is guaranteed to converge to a pair of weight

vectors (w+⋆ , w−⋆ ) that correctly classify the training data whenever such weight vectors

exist. Hint: Show that the multiplicative weight update algorithm as an instance of

the standard perceptron algorithm operating on a suitably transformed weight space.

4. (a) (12.5 pts. )

Suppose you have designed (or trained) a n-input perceptron with the weight

vector W and threshold θ to correctly classify a set S of n-dimensional patterns.

Further assume that all the weight values of the perceptron so designed have

been hard-wired but the threshold θ can be set under user control. When the

perceptron is later used on a factory floor, suppose you find that the source of

the patterns (say the camera or the digitizer) adds a fixed amount of noise (given

by the n-dimensional noise vector V) to each pattern. Assuming that you can

somehow measure (or can calculate) V, how would you change θ so that the

perceptron with the same weight vector W continues to correctly classify all the

patterns in S?

(b) (12.5 pts) Briefly explain the significance of the following properties of the error

functions used to derive gradient-descent based learning algorithms of the type

considered in part (a).

i. Continuity and Differentiablilty (with respect to the weights or other parameters

of interest)

ii. Convexity with respect to weights or other parameters of interest

联系我们

- QQ：99515681
- 邮箱：99515681@qq.com
- 工作时间：8:00-23:00
- 微信：codinghelp

- Data Visualisation And Analytics Assi... 2019-11-15
- Block Breaker Assignment Game Engine ... 2019-11-15
- Data Visualisation And Analytics 2019 2019-11-15
- Event Driven Computing 2019 Assignment... 2019-11-15
- Fit1043 Assignment 3 2019-11-15
- Event Driven Computing Assignment 3 - ... 2019-11-15
- 代做data Ming作业、代写systematic课程作业、代写r编程语言 2019-11-15
- Cs210留学生作业代做、Java编程语言作业调试、Java课程设计作业代写 2019-11-15
- 代写stat 385作业、代做r程序语言作业、代写r课程设计作业、Progr 2019-11-15
- 代写cpeg 222作业、Java，C/C++程序语言作业调试、Python 2019-11-15
- Ece 547作业代做、代写python编程设计作业、代做networks留 2019-11-15
- Csc8202作业代做、Web编程语言作业代写、代做web、Html课程设计 2019-11-15
- 代写mathematics课程作业、Matlab编程语言作业代做、代写mat 2019-11-15
- 代做pyopencl留学生作业、Python程序设计作业调试、Python实 2019-11-15
- Rtos Kernel作业代做、代写python，C++程序语言作业、代做j 2019-11-14
- Algorithm课程作业代写、代做r课程设计作业、R编程语言作业调试、代写 2019-11-14
- 代做fpu留学生作业、代写python，Java编程设计作业、代写c++语言 2019-11-14
- 代写msc/Icy课程作业、代写software留学生作业、代做java语言 2019-11-14
- Cse105留学生作业代做、Java程序语言作业调试、代做programmi 2019-11-14
- 代写fm 9528留学生作业、代做risk Analytics作业、Java 2019-11-14