
In this project, I want us to combine several of our discussions to appreciate the advancements and challenges of machine learning, to practice scientific investigation, and to write scientific reports. We will do this in a setting that combines pretrained recognition networks with a reinforcement learning model implemented as a function approximator.
The task is to implement and evaluate a reinforcement learner that takes handwritten digits as input and counts either up or down, depending on which action is best. The reward in this environment is r=1 for state 0 and r=2 for state 9. No reward is given in the states in between, and the rewards are not provided to the learner in advance; hence this is a model-free RL task.
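The environment described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the assignment's reference environment: the names `step` and `reset` are hypothetical, the reward is assumed to be paid on entering a state, and states 0 and 9 are assumed to end the episode. None of these details is fixed by the task statement.

```python
import random

N_STATES = 10                 # the digits 0..9
ACTIONS = (-1, +1)            # index 0 = count down, index 1 = count up
REWARDS = {0: 1.0, 9: 2.0}    # from the task statement; all other states give 0


def step(state, action_index):
    """Apply one counting action and return (next_state, reward, done).

    Assumptions (not stated in the task): the reward is paid on entering
    a state, states 0 and 9 end the episode, and the state is clipped to
    the range 0..9.
    """
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    reward = REWARDS.get(next_state, 0.0)
    done = next_state in REWARDS
    return next_state, reward, done


def reset(rng=random):
    """Start an episode in a random non-terminal state (an assumption)."""
    return rng.randint(1, N_STATES - 2)
```

In the full task the learner would not see the integer state directly; it would see an MNIST image of that digit, and it would only observe rewards by interacting with `step`.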
In Assignment 8 we already started to use a recognition network that first maps an image of a digit to the corresponding integer or one-hot representation. Such a solution should be included as a baseline in your investigations. The task of the final project is now to implement a single network that takes an MNIST image as input and outputs the value function Q for counting up or down.
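As a rough illustration of the baseline idea, the sketch below runs tabular Q-learning on the integer state that a recognition network would output. It is not the Assignment 8 solution: the true digit stands in for the recognizer's prediction, the function names are hypothetical, the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`) are arbitrary choices, and the episode is assumed to end on reaching state 0 or 9.

```python
import random


def step(state, action):
    """Environment sketch: action 0 counts down, action 1 counts up.

    Assumption: reward is paid on entering state 0 (r=1) or state 9 (r=2),
    and both of those states end the episode.
    """
    next_state = state + (1 if action == 1 else -1)
    reward = {0: 1.0, 9: 2.0}.get(next_state, 0.0)
    return next_state, reward, next_state in (0, 9)


def train_tabular_q(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Model-free Q-learning over the 10 digit states and 2 actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(10)]   # q[state][action]
    for _ in range(episodes):
        state = rng.randint(1, 8)         # random non-terminal start
        done = False
        while not done:
            # Epsilon-greedy action selection; ties are broken toward "up".
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_tabular_q()
```

The single-network version asked for in the final project would replace the table lookup `q[state]` with a network that maps the MNIST image itself to the two Q-values, while the temporal-difference target stays the same.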
There should be two stages to this investigation beyond the baselines of Assignment 8. The first is to pretrain an MNIST recognition network that then becomes part of the RL network. You should test the performance with both frozen and unfrozen layers of the pretrained network. You should be able to use all of the training and test data provided in the MNIST dataset. However, please justify your choice in the paper if you must reduce this set.
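One way to set up the frozen versus unfrozen comparison is sketched below in PyTorch. Everything specific here is an assumption made for illustration: the framework choice, the class names `DigitFeatures` and `QNetwork`, the layer sizes, and the learning rate. The feature extractor is randomly initialized in this sketch; in the actual experiment it would carry weights pretrained on MNIST classification.

```python
import torch
import torch.nn as nn


class DigitFeatures(nn.Module):
    """Hypothetical feature extractor, assumed to be pretrained on MNIST."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
        )

    def forward(self, images):
        return self.net(images)


class QNetwork(nn.Module):
    """Pretrained features followed by a new head with one Q-value per action."""

    def __init__(self, features, freeze=True):
        super().__init__()
        self.features = features
        self.q_head = nn.Linear(128, 2)   # outputs Q(image, down), Q(image, up)
        # Frozen layers receive no gradient updates during RL training.
        for param in self.features.parameters():
            param.requires_grad = not freeze

    def forward(self, images):
        return self.q_head(self.features(images))


features = DigitFeatures()                # in practice: load pretrained weights
q_net = QNetwork(features, freeze=True)

# Only hand the trainable parameters to the optimizer.
optimizer = torch.optim.Adam(
    [p for p in q_net.parameters() if p.requires_grad], lr=1e-3
)

q_values = q_net(torch.zeros(4, 1, 28, 28))   # dummy batch of 4 blank images
```

Constructing `QNetwork(features, freeze=False)` gives the unfrozen variant, in which the pretrained layers are fine-tuned together with the Q head.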
The second stage is to try to train the network without pretraining. I want to caution you that this is not an easy task and that you might not be able to get it to perform well, if at all. Some discussion of this most challenging part of the project must be included in your paper.
The report for this final project will be in the form of a scientific paper, such as a typical paper for conference proceedings, with a strict limit of 4 pages. The paper is to be submitted as a PDF file that is not zipped. The font size must be 11pt or larger, including all fonts used in illustrations. A margin of 1 inch is expected on all sides. I am fully aware that reporting all your findings in this space will require some careful writing. You can assume that the readership has some machine learning background, though a brief introduction of your method is still required. Also note that readers must be able to reimplement your experiments from the paper alone. Hence, all parameters of the model and the experiments must be provided. Please also submit your programs.
This final project is meant to be an interactive research project, so I expect you to reach out to your study group, to the TAs, or to the instructor if you have questions or run into problems with your implementation. Although this is an individual project, I fully expect that you will discuss it with your peers. However, you need to write your own paper, and you must be prepared to defend it in the end.
