FIT 3080, Assignment 2 Part 2

FIT 3080: Assignment 2 Part 2 (15%)
MDP RL, Machine Learning

1. [MDP] Implementation of the Grid World [4*10 Marks]. Consider the Grid
World environment that we saw in class, shown below as well (see Section
17.1 of the textbook for a more detailed explanation). Let the discount parameter
be 0.5 in all the following questions.

(a) Implement the value iteration algorithm for the Grid World. Your program
should be able to get the value of R (namely the instantaneous reward, equal for
all states) from the user, and then run the value iteration algorithm. Afterwards,
it should print out the optimal values of all states as well as the optimal policy
into a file.
Your program must be executable by the following command-line:
java -jar question1.jar a
or
python question1.py a

followed by two arguments: first the value of R, then the name of the
output file into which the results are printed. For example, the following
command-line:
java -jar question1.jar a 1.2 my_output.txt
or
python question1.py a 1.2 my_output.txt

runs the program with R=1.2 and writes the output into a file named
“my_output.txt”.
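A minimal Python sketch of reading this command line, assuming the part letter, the value of R, and the output file name arrive in that order. The helper name parse_args is hypothetical and is not taken from the provided skeleton code.

```python
import sys


def parse_args(argv):
    """Split an argument list shaped like: question1.py part R output_file.

    argv includes the script name, as sys.argv does. The function name and
    the returned tuple are illustrative assumptions only.
    """
    if len(argv) != 4:
        raise ValueError("expected: question1.py part R output_file")
    part, reward_text, out_path = argv[1], argv[2], argv[3]
    if part not in ("a", "b", "c"):
        raise ValueError("part must be one of a, b, c")
    return part, float(reward_text), out_path


# In the real program this would be called as parse_args(sys.argv).
# Here the example command line from the text is passed in directly.
print(parse_args(["question1.py", "a", "1.2", "my_output.txt"]))
```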
Each line in the output file must have the following format:
optimal_value optimal_action
where the first string is the optimal value and the second string is the optimal
action (one of U for up, D for down, L for left, R for right) and the two strings are
separated by one ‘space’. There must be 9 lines in the file, each
corresponding to a state in the grid world (note that the goal states and the
shaded block state do not appear in the output file). Lines 1-9 in the output
file correspond respectively to the states with the following coordinates in the
grid world: (1,1), (1,2), (1,3), (1,4), (2,1), (2,3), (3,1), (3,2), (3,3).
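The following is a rough Python sketch of value iteration for part (a), not a reference solution. Several things in it are assumptions rather than facts stated in this handout: coordinates are read as (row, column), the blocked cell is (2,2), the goal states are (3,4) with value +1 and (2,4) with value -1, a move goes in the intended direction with probability 0.8 and to each side with probability 0.1, and bumping into an edge or the block leaves the agent in place. All function and constant names are illustrative.

```python
GAMMA = 0.5  # discount from the assignment text

# Layout assumptions (see the paragraph above), states written as (row, column).
WALL = (2, 2)
TERMINALS = {(3, 4): 1.0, (2, 4): -1.0}
STATES = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)]
MOVES = {"U": (1, 0), "D": (-1, 0), "L": (0, -1), "R": (0, 1)}
SIDEWAYS = {"U": "LR", "D": "LR", "L": "UD", "R": "UD"}


def step(state, action):
    """Deterministic result of one move; hitting the block or an edge stays put."""
    row = state[0] + MOVES[action][0]
    col = state[1] + MOVES[action][1]
    if (row, col) == WALL or not (1 <= row <= 3 and 1 <= col <= 4):
        return state
    return (row, col)


def expected_value(values, state, action):
    """Expected next-state value: 0.8 intended direction, 0.1 for each side."""
    total = 0.8 * values[step(state, action)]
    for side in SIDEWAYS[action]:
        total += 0.1 * values[step(state, side)]
    return total


def value_iteration(reward, tol=1e-10):
    """Repeat the Bellman update until no value changes by more than tol."""
    values = {s: 0.0 for s in STATES}
    values.update(TERMINALS)  # goal-state values are held fixed
    while True:
        new_values = dict(values)
        delta = 0.0
        for s in STATES:
            best = max(expected_value(values, s, a) for a in MOVES)
            new_values[s] = reward + GAMMA * best
            delta = max(delta, abs(new_values[s] - values[s]))
        values = new_values
        if delta < tol:
            break
    policy = {s: max(MOVES, key=lambda a: expected_value(values, s, a)) for s in STATES}
    return values, policy


def format_lines(values, policy):
    """One 'optimal_value optimal_action' line per state, in the required order."""
    return ["%s %s" % (values[s], policy[s]) for s in STATES]


def write_output(path, values, policy):
    with open(path, "w") as handle:
        handle.write("\n".join(format_lines(values, policy)) + "\n")
```

Calling value_iteration(1.2) and passing the result to write_output would produce a 9-line file in the format above. The number of decimal places is not specified in the handout, so the sketch prints Python's default float representation.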
(b) Implement the Q-value iteration algorithm for the Grid World. Your program
should be able to get the value of R (namely the instantaneous reward, equal for
all states) from the user, and then run the Q-value iteration algorithm.
Afterwards, it should print out the optimal Q-values of all state-action pairs as
well as the optimal policy into the output file.
Your program must be executable by the following command-line:
java -jar question1.jar b
or
python question1.py b

followed by two arguments: first the value of R, then the name of the
output file into which the results are printed. There must be 9 lines in the output
file corresponding to the 9 states (in the order mentioned in part (a)). The
format of each line in the output file is as follows:
Qvalue_U Qvalue_D Qvalue_L Qvalue_R optimal_action
where the first four numbers are the Q-values corresponding to the 4 actions,
and the last string is the optimal action, i.e. one of U, D, L, or R.
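As a hedged illustration of part (b), the sketch below runs Q-value iteration on a generic transition table instead of the Grid World itself. The table layout (a dictionary from (state, action) to a list of (probability, next_state) pairs), the fixed values for goal states, and all names are assumptions made for this example. The toy problem at the end has one state and is only there to show the output line format.

```python
ACTIONS = ("U", "D", "L", "R")


def q_value_iteration(states, transitions, reward, gamma, terminal_values, tol=1e-10):
    """Iterate Q(s,a) = reward + gamma * sum_s' P(s'|s,a) * max_a' Q(s',a').

    transitions[(s, a)] is a list of (probability, next_state) pairs.
    terminal_values maps goal states to their fixed values.
    """
    q = {(s, a): 0.0 for s in states for a in ACTIONS}

    def state_value(s):
        if s in terminal_values:
            return terminal_values[s]
        return max(q[(s, a)] for a in ACTIONS)

    while True:
        new_q = {}
        for s in states:
            for a in ACTIONS:
                expected = sum(p * state_value(s2) for p, s2 in transitions[(s, a)])
                new_q[(s, a)] = reward + gamma * expected
        delta = max(abs(new_q[key] - q[key]) for key in q)
        q = new_q
        if delta < tol:
            return q


def format_q_line(q, state):
    """Build 'Qvalue_U Qvalue_D Qvalue_L Qvalue_R optimal_action' for one state."""
    best = max(ACTIONS, key=lambda a: q[(state, a)])
    return " ".join(str(q[(state, a)]) for a in ACTIONS) + " " + best


# Toy problem (made up): from state "s", action U reaches a goal worth 1,
# every other action leaves the agent in "s".
toy_transitions = {("s", a): [(1.0, "s")] for a in ACTIONS}
toy_transitions[("s", "U")] = [(1.0, "goal")]
toy_q = q_value_iteration(["s"], toy_transitions, 0.0, 0.5, {"goal": 1.0})
print(format_q_line(toy_q, "s"))  # prints: 0.5 0.25 0.25 0.25 U
```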
(c) Implement the policy iteration algorithm for the Grid World. Your program
should be able to get the value of R (namely the instantaneous reward, equal for
all states) from the user, and then run the policy iteration algorithm. Afterwards,
it should print out the optimal policy into the output file.
Your program must be executable by the following command-line:
java -jar question1.jar c
or
python question1.py c

followed by two arguments: first the value of R, then the name of the
output file into which the results are printed. There must be 9 lines in the output
file corresponding to the 9 states (in the order mentioned in part (a)). The
format of each line in the output file is as follows:
optimal_action
where the optimal_action is one of U, D, L, or R.
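Policy iteration for part (c) alternates between evaluating the current policy and improving it. The sketch below is one possible shape for that loop, shown on a made-up two-state chain and not on the Grid World. The transition-table layout, the iterative (rather than exact) policy evaluation, the tolerances, and all names are assumptions of this example. It also assumes a discount well below 1, as in this assignment.

```python
def policy_iteration(states, actions, transitions, reward, gamma,
                     terminal_values, eval_tol=1e-12, improve_eps=1e-9):
    """transitions[(s, a)] is a list of (probability, next_state) pairs."""
    policy = {s: actions[0] for s in states}
    values = {s: 0.0 for s in states}
    values.update(terminal_values)  # goal-state values are held fixed

    def backup(s, a):
        return reward + gamma * sum(p * values[s2] for p, s2 in transitions[(s, a)])

    while True:
        # Policy evaluation: sweep until the values of the fixed policy settle.
        while True:
            delta = 0.0
            for s in states:
                new_v = backup(s, policy[s])
                delta = max(delta, abs(new_v - values[s]))
                values[s] = new_v
            if delta < eval_tol:
                break
        # Policy improvement: switch only when another action is clearly better.
        stable = True
        for s in states:
            best = max(actions, key=lambda a: backup(s, a))
            if backup(s, best) > backup(s, policy[s]) + improve_eps:
                policy[s] = best
                stable = False
        if stable:
            return policy, values


# Toy chain (made up): far -R-> near -R-> goal, other moves mostly stay put.
ACTIONS = ("U", "D", "L", "R")
toy = {(s, a): [(1.0, s)] for s in ("far", "near") for a in ACTIONS}
toy[("far", "R")] = [(1.0, "near")]
toy[("near", "R")] = [(1.0, "goal")]
toy[("near", "L")] = [(1.0, "far")]
toy_policy, toy_values = policy_iteration(["far", "near"], ACTIONS, toy, 0.0, 0.5, {"goal": 1.0})
print("\n".join(toy_policy[s] for s in ["far", "near"]))  # one action per line
```

Counting the outer-loop rounds here, and the sweeps in a value-iteration loop, is one simple way to collect evidence for a convergence comparison.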
(d) Based on your implementation, which one of the algorithms you
implemented in parts (a), (b), and (c) converges fastest? Briefly explain.

2. [Machine Learning] Classification [42 Marks]
Consider the dataset tic-tac-toe.arff available on Moodle. Each example in this
dataset represents a different game of tic-tac-toe
(http://en.wikipedia.org/wiki/Tic-tac-toe), where the player writing crosses (“x”)
has the first move. Only those games that don’t end in a draw are included, with
the positive class representing the case where the first player wins and the
negative class the case where the first player loses. The features encode the
status of the game at the end, so each square contains a cross “x”, a nought “o” or
a blank “b”.
(a) Before you run the classifiers, use the Weka visualization tool to analyze
the data. (2 + 2 = 4 marks)
(i) Which attributes seem to be the most predictive of winning or losing?
(hint: if you were the “x” player, where would you put your first cross and
why?)
(ii) What can you infer about the advantage (or otherwise) of being the first
player?
(b) Run J48 (=decision tree) and Naive Bayes to learn a model that predicts
whether the “x” player will win. Perform 10-fold cross validation, and analyze
the results obtained by these algorithms as follows.
(i) J48 (=decision tree) (2 + 3 + 14 + 3 = 22 marks)
x. Examine the decision tree and indicate the main variables.
y. Trace the decision tree for the following game. What would it predict?

z. What is the first split in the decision tree? Calculate (by hand) the
Information Gain obtained from the first split in the tree. Show your
calculations.
t. What is the accuracy of the decision tree? Explain the results in the
confusion matrix.
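For the hand calculation in item z, a small Python sketch like the one below can be used to check the arithmetic. Information gain is taken here as the entropy of the class counts before the split minus the size-weighted entropy of the branches after it. The counts in the example are invented for illustration and are not taken from the tic-tac-toe dataset; the function names are likewise hypothetical.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a list of class counts, e.g. [positives, negatives]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, branch_counts):
    """Parent entropy minus the weighted average entropy of the branches.

    branch_counts holds one class-count list per attribute value.
    """
    total = sum(parent_counts)
    remainder = sum(sum(b) / total * entropy(b) for b in branch_counts)
    return entropy(parent_counts) - remainder


# Invented counts: 8 positive and 8 negative examples, split three ways
# (think of the branches as the values x, o and b of one square).
print(information_gain([8, 8], [[6, 2], [1, 5], [1, 1]]))
```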
(ii) Naive Bayes (6 + 2 + 2 = 10 marks)
x. Calculate (by hand) the predicted probability of a win for the following
game. Show your calculations.

y. What is the probability that a player with this configuration will win?
What would the Naive Bayes classifier predict for this game?
z. What is the accuracy of the Naive Bayes classifier? Explain the results in
the confusion matrix.
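The hand calculation in item x multiplies the class prior by one conditional probability per square and then normalises over the two classes. The Python sketch below follows that recipe on invented counts with only two attributes, so its numbers say nothing about the real dataset. It applies no smoothing, which is an assumption: a tool that smooths its counts may report slightly different probabilities. All names are illustrative.

```python
def naive_bayes_posterior(class_counts, cond_counts, observation):
    """Return P(label | observation) for each label.

    class_counts: {label: number of examples with that label}
    cond_counts:  {label: [ {value: count}, ... ]} with one dict per attribute
    observation:  sequence of attribute values, one per attribute
    """
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(label)
        for i, value in enumerate(observation):
            # P(attribute i = value | label), estimated from raw counts
            score *= cond_counts[label][i].get(value, 0) / n_label
        scores[label] = score
    norm = sum(scores.values())  # assumed non-zero for the observation given
    return {label: score / norm for label, score in scores.items()}


# Invented counts for two attributes and ten examples.
class_counts = {"positive": 6, "negative": 4}
cond_counts = {
    "positive": [{"x": 4, "o": 2}, {"x": 3, "o": 3}],
    "negative": [{"x": 1, "o": 3}, {"x": 2, "o": 2}],
}
print(naive_bayes_posterior(class_counts, cond_counts, ("x", "x")))
```

With these made-up counts the unnormalised scores are 0.6 * 4/6 * 3/6 = 0.2 for the positive class and 0.4 * 1/4 * 2/4 = 0.05 for the negative class, so the positive posterior is 0.2 / 0.25 = 0.8.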
(c) (4 + 2 = 6 marks) Draw a table to compare the performance of J48 and Naive
Bayes using the summary measures produced by Weka. Which algorithm does
better? Explain in terms of Weka’s summary measures. Can you speculate why?

Submission Requirements

• Remember that it is an individual assignment; you are not allowed to do
it in pairs.

• Submit on Moodle a zip file that contains (i) a PDF file containing your
solutions to the questions, and (ii) jar or Python files containing your
implementation for Question 1, and your WEKA scripts for Q2. The zip file
should be named A2P2_StudentID.zip, where StudentID is your Student ID
number. Assignments that don’t follow this naming convention will
be rejected.

• Hand in a hard copy of your report into the assignment box in the foyer of
building 63, 25 Exhibition Walk.

• You can only use Java or Python as the programming language.
Please use the skeleton code that we have provided in these languages for
you on Moodle for this assignment.

• Your Python/Java file must be executable on lab computers by the
command-line mentioned in the questions. If your program does not run
smoothly on a lab computer with the command-line mentioned in the
questions, then you may get zero marks for that implementation question.
We emphasize that you need to use exactly the name “question1.py” (if
you use Python) or “question1.jar” (if you use Java) for your program.

• Late submission policy: 10% of the maximum mark will be deducted for
every day a submission is late.