
Data analytics and visualisation (Python)

1. Assessment structure

The individual project accounts for 80% of the module grade. Please choose one data set from the list below for your project. The report should be 3000 words long, excluding references. All relevant literature and resources for your project should be properly cited in the Harvard referencing style.

Marks allocated to each criterion (each percentage applies to the criterion that follows it):

15%

1. Introduction to data and research question (~1000 words)

Please introduce the data set used and its background. The relevant literature (e.g., academic journal articles and textbooks) should be surveyed and properly cited in the Harvard referencing style. More importantly, please identify a problem to be addressed with this data set (i.e., the research question). Please note that the problem should be specific (i.e., relevant to the application domain and linked to the variables available in the data set).

15%

2. Data processing and exploration (~500 words)

Please explain: Which variables are available in the data set? Which variables have been selected for the analysis, and why? Which data transformations have been applied, and why?
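As an illustration of the kind of transformation that could be reported in this section, the sketch below derives three analysis variables from raw review records. It is only a minimal example under stated assumptions: the sample records are invented, and the field names (`rating`, `text`, `helpful_vote`, `timestamp` in milliseconds) are assumed and should be checked against the data set actually chosen.

```python
import math
from datetime import datetime, timezone

# Invented sample records for illustration only. The field names are
# assumptions and must be verified against the chosen data set.
reviews = [
    {"rating": 5.0, "text": "Great product, works well!",
     "helpful_vote": 3, "timestamp": 1588687728923},
    {"rating": 2.0, "text": "Broke after a week.",
     "helpful_vote": 0, "timestamp": 1609459200000},
]


def transform(review):
    """Return a new record holding only the selected and derived variables."""
    text = review["text"].strip()
    # Timestamps are assumed to be Unix time in milliseconds.
    posted = datetime.fromtimestamp(review["timestamp"] / 1000, tz=timezone.utc)
    return {
        "rating": review["rating"],
        # Review length in whitespace-separated words.
        "word_count": len(text.split()),
        # log(1 + votes) compresses the long right tail of vote counts.
        "log_helpful": math.log1p(review["helpful_vote"]),
        "year": posted.year,
    }


processed = [transform(r) for r in reviews]
for row in processed:
    print(row)
```

Each derived variable should be justified in the report, for example by explaining that vote counts are heavily skewed and are therefore log-transformed before plotting.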

20%

3. Data visualisation and interpretation (~800 words)

Please provide at least three data visualisations as descriptive analytical results (e.g., properties of the variables selected) and advanced analytical results (e.g., relationships between the variables selected, machine learning results). Please follow the best practices taught in the module regarding data visualisation. Importantly, please interpret the results and findings in detail. Note that the data visualisations should be nontrivial representations of information, yet easy to interpret.

15%

4. Data insights and conclusions (~700 words)

Please provide the insights drawn from the analytics and summarise the findings. In particular, is the problem (i.e., the research question) identified at the beginning addressed by the analytics? How?

15%

5. Oral Presentation (3 minutes)

Please present your work in 3 minutes. The schedule will be announced later.

2. Recommended data sets

NOTICE: Before starting the individual project, you will need to confirm your choice of data set on Moodle. It will be available from 28 October 2025. Any submission without a confirmed data set choice will have its marks reduced accordingly.

While the use of generative AI technologies such as ChatGPT could be helpful for learning Python programming, it is important to note that employing them to produce any portion of a written report is strictly prohibited.

Please find a list of recommended data sets below, which can be downloaded from https://amazon-reviews2023.github.io/ or loaded directly into Colab from Hugging Face. All of them have significant textual content (i.e., Amazon reviews). Therefore, text analytics tools should be employed. Please note that each data set includes reviews (ratings, text, helpfulness votes) as well as product metadata (descriptions, category information, price, brand, and image features), which are linked by the product's parent_asin identifier.

● All_Beauty

● Amazon_Fashion

● Appliances

● Arts_Crafts_and_Sewing

● Automotive

● Baby_Products

● Beauty_and_Personal_Care

● Books

● CDs_and_Vinyl

● Cell_Phones_and_Accessories

● Clothing_Shoes_and_Jewelry

● Digital_Music

● Electronics

● Gift_Cards

● Grocery_and_Gourmet_Food

● Handmade_Products

● Health_and_Household

● Health_and_Personal_Care

● Home_and_Kitchen

● Industrial_and_Scientific

● Kindle_Store

● Movies_and_TV

● Musical_Instruments

● Office_Products

● Patio_Lawn_and_Garden

● Pet_Supplies

● Software

● Sports_and_Outdoors

● Tools_and_Home_Improvement

● Toys_and_Games

● Video_Games

● Others (please email to confirm)
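The link between reviews and product metadata described above can be sketched as follows. This is a minimal illustration rather than a prescribed method: the records are invented, every field name other than `parent_asin` (here `rating`, `text`, `title`, `price`) is an assumption, and the simple word count merely stands in for whichever text analytics tools are used in the project. With larger data, a dataframe library would typically do the join.

```python
import re
from collections import Counter, defaultdict

# Invented sample records for illustration only. Field names other than
# parent_asin are assumptions to be checked against the real files.
reviews = [
    {"parent_asin": "B000001", "rating": 5.0, "text": "Lovely scent, lovely bottle"},
    {"parent_asin": "B000001", "rating": 3.0, "text": "Scent fades quickly"},
    {"parent_asin": "B000002", "rating": 1.0, "text": "Arrived broken"},
]
metadata = [
    {"parent_asin": "B000001", "title": "Example perfume", "price": 19.99},
    {"parent_asin": "B000002", "title": "Example mirror", "price": None},
]

# Index the metadata by parent_asin so each review can find its product.
meta_by_id = {m["parent_asin"]: m for m in metadata}

# Inner join: keep only reviews whose product appears in the metadata.
linked = [
    {
        **review,
        "product_title": meta_by_id[review["parent_asin"]]["title"],
        "price": meta_by_id[review["parent_asin"]]["price"],
    }
    for review in reviews
    if review["parent_asin"] in meta_by_id
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Word frequencies per product, a very basic form of text analytics.
words_by_product = defaultdict(Counter)
for row in linked:
    words_by_product[row["product_title"]].update(tokenize(row["text"]))

for title, counts in words_by_product.items():
    print(title, counts.most_common(3))
```

Once the two tables are linked, review-level variables (rating, text) can be analysed against product-level variables (price, category), which is usually where a specific research question comes from.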


