
STAT 3675Q - STATISTICAL COMPUTING UCONN

Fall 2019 Marcos Prates

1. Objective

The IMDB Movies Dataset (file imdb.csv) contains information about over 10,000 movies.

The names of the first twelve columns are self-explanatory (the duration is in seconds). The rest of the variables (Action, Adult, Adventure, ...) are dummy variables (0/1) indicating whether the movie has the given genre.
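As a minimal sketch of how the 0/1 genre columns might be handled in R: the toy data frame below stands in for imdb.csv, and every column name in it (title, duration, Action, Comedy, Drama) is an assumption made for illustration, not the real file's header.

```r
# Toy stand-in for imdb.csv; every column name here is hypothetical.
movies <- data.frame(
  title    = c("A", "B", "C", "D"),
  duration = c(5400, 7200, 6000, 4800),  # seconds, as the handout states
  Action   = c(1, 0, 1, 0),
  Comedy   = c(0, 1, 1, 0),
  Drama    = c(0, 0, 0, 1)
)

# With the real file, one would instead start from something like:
# movies <- read.csv("imdb.csv")

# Treat every numeric column whose values are all 0 or 1 as a genre dummy.
is_dummy <- vapply(
  movies,
  function(col) is.numeric(col) && all(col %in% c(0, 1)),
  logical(1)
)
genre_cols <- names(movies)[is_dummy]

# Number of movies carrying each genre flag.
genre_counts <- colSums(movies[genre_cols])

# Duration converted from seconds to minutes for readability.
movies$duration_min <- movies$duration / 60

print(genre_counts)
```

Note that the dummies need not be mutually exclusive: in the toy data, movie C is flagged as both Action and Comedy.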

In this project, you will use R to apply a number of statistical methods covered during the course.

• Projects are to be completed individually or with one partner.

• The project is worth 25% of the final grade.

Directions. You are asked to write a preliminary report and a report. Please follow the guidelines below carefully.

2. Preliminary report [30 points]

• Provide a single file with the format name_3675_prelim.pdf (or name1_name2_3675_prelim.pdf if you work with someone), where name is your full name.

• The preliminary report is due on Sunday, November 22, 2019 at 11:59 PM. Submit it via HuskyCT. The pdf must be generated using Rmarkdown.

Your preliminary report must contain the following elements.

(a) Provide a preliminary exploratory analysis, including summary statistics and basic graphs (4 pages max).

(b) Pose scientific questions that are interesting to you and indicate which statistical methods may help answer them (1 page).

(c) Include the R code and all outputs.
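A minimal sketch of what the summary statistics and basic graphs in (a) could look like in R. The data are simulated and the column names (rating, duration, Comedy) are hypothetical placeholders for whatever the real imdb.csv provides.

```r
set.seed(1)  # toy data only; replace with the real data frame

# Hypothetical columns standing in for the real ones in imdb.csv.
movies <- data.frame(
  rating   = round(runif(200, 1, 10), 1),
  duration = round(rnorm(200, mean = 6000, sd = 900)),  # seconds
  Comedy   = rbinom(200, 1, 0.4)
)

# Summary statistics: quartiles, extremes, and mean for each numeric column.
print(summary(movies[c("rating", "duration")]))

# Group-wise mean rating by a genre dummy.
mean_by_comedy <- tapply(movies$rating, movies$Comedy, mean)
print(mean_by_comedy)

# Basic graphs: a histogram and a side-by-side boxplot.
hist(movies$duration / 60, main = "Movie duration", xlab = "Minutes")
boxplot(rating ~ Comedy, data = movies,
        names = c("Not comedy", "Comedy"), ylab = "Rating")
```

Placed in an Rmarkdown code chunk, both the code and its output (tables and plots) appear in the rendered PDF, which is what (c) asks for.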

3. Report [70 points]

• For the report, provide a single file with the format name_3675_report.pdf (or name1_name2_3675_report.pdf), where name is your full name. The pdf must be generated using Rmarkdown.

• The report must be between 10 and 30 pages long (including the code and the graphs).

• The report is due on Sunday, December 8, 2019 at 11:59 PM. Submit it via HuskyCT.

(a) Include the preliminary report

(b) Include at least one regression method

(c) Include at least one ANOVA analysis

(d) Include at least one classification method

For each method,

• Express all statistical models using mathematical formulae, and clearly state the meaning of the notation and the assumptions.

• Insert R code and necessary comments. Your output must contain the R code (do not use the echo=FALSE option).

• Interpret extensively all outputs and graphs that you include.
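The three method families required in (b) to (d) could be sketched in R as follows. This is an illustration on simulated data, not a prescribed analysis: the column names (duration_min, decade, Comedy, rating) and the choice of logistic regression as the classifier are assumptions.

```r
set.seed(42)  # simulated data only; all column names are hypothetical
n <- 300
movies <- data.frame(
  duration_min = rnorm(n, mean = 100, sd = 15),
  decade       = factor(sample(c("1980s", "1990s", "2000s"), n, replace = TRUE)),
  Comedy       = rbinom(n, 1, 0.4)
)
movies$rating <- 5 + 0.02 * movies$duration_min - 0.5 * movies$Comedy + rnorm(n)

# (b) Regression: rating_i = b0 + b1 * duration_i + b2 * Comedy_i + e_i,
#     with e_i assumed independent N(0, sigma^2).
fit_lm <- lm(rating ~ duration_min + Comedy, data = movies)
print(summary(fit_lm))

# (c) One-way ANOVA: does the mean rating differ across decades?
fit_aov <- aov(rating ~ decade, data = movies)
print(summary(fit_aov))

# (d) Classification: logistic regression predicting the Comedy flag.
fit_glm <- glm(Comedy ~ rating + duration_min, data = movies, family = binomial)
pred <- as.integer(predict(fit_glm, type = "response") > 0.5)
accuracy <- mean(pred == movies$Comedy)  # in-sample accuracy
print(accuracy)
```

The accuracy here is computed on the same rows used to fit the model, so it is optimistic; an actual report would likely evaluate the classifier on held-out data.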

4. Important dates

• November 22, 2019: Preliminary report is due

• December 8, 2019: Report is due
