
Assignment-1 All Time Olympic Games Medals


The assignment data was extracted from the Wikipedia entry on All Time Olympic Games Medals, with some minor modifications to make things interesting. The dataset is split into two CSV files: one for the summer games, and one for the winter games and the combined total. The datasets are available as Summer Games and Winter Games. Use them to answer the following questions:
Question 1: (based on both datasets) 
Merge the two datasets Olympics_dataset1.csv and Olympics_dataset2.csv (do not concatenate the datasets, and ignore the first row of each dataset). Rename the columns of the resulting dataframe as follows, and remove all other columns from it: 
 
Country, summer_rubbish, summer_participation, summer_gold, summer_silver, summer_bronze, summer_total, winter_participation, winter_gold, winter_silver, winter_bronze, winter_total 
 
Remove the "Totals" row from the dataframe 
Display the first five rows of the dataframe
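The difference between merging and concatenating can be sketched on toy data. The frames, country names and numbers below are made up for illustration and are not the real datasets; with the real CSV files, something like pd.read_csv(path, skiprows=1) would be one way to ignore the first row.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (made-up countries and numbers).
summer = pd.DataFrame({
    "Country": ["Aland (ALA)", "Borduria (BOR)", "Totals"],
    "summer_gold": [3, 0, 3],
})
winter = pd.DataFrame({
    "Country": ["Aland (ALA)", "Borduria (BOR)", "Totals"],
    "winter_gold": [1, 2, 3],
})

# Join column-wise on the shared key instead of stacking rows with pd.concat.
merged = pd.merge(summer, winter, on="Country", how="inner")

# Drop the aggregate row so it is not treated as a country.
merged = merged[merged["Country"] != "Totals"].reset_index(drop=True)

print(merged.head(5).to_string())
```

An inner join keeps only countries present in both frames; whether that or an outer join is appropriate for the real data is a judgment call left open here.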
Question 2: (based on the dataframe created in Question-1) 
Rename each country so that only the country name is kept, without the abbreviation (Afghanistan (AFG) --> Afghanistan). Set the country name as the index, remove the columns below, and then display the first 5 rows of the dataframe: 
 
summer_rubbish, summer_total, winter_total 
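A minimal sketch of the name clean-up, assuming every abbreviation starts at the first opening parenthesis and that anything after it can be discarded. The sample rows are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Afghanistan (AFG)", "Great Britain (GBR) [GBR]", "Japan (JPN)"],
    "summer_rubbish": [0, 0, 0],
    "summer_gold": [0, 263, 142],
})

# Strip everything from the first " (" onwards, including trailing notes.
df["Country"] = df["Country"].str.replace(r"\s*\(.*$", "", regex=True)

# Use the cleaned name as the index and drop an unwanted column.
df = df.set_index("Country").drop(columns=["summer_rubbish"])

print(df.head(5).to_string())
```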
Question 3: (based on the dataframe created in Question-2) 
Remove the rows with NaN fields and display the last 10 rows.
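As a small illustration on invented values, dropna() removes every row that has at least one missing field, and tail(10) returns up to the last ten rows of what remains:

```python
import pandas as pd

df = pd.DataFrame(
    {"summer_gold": [1.0, None, 3.0], "winter_gold": [0.0, 2.0, None]},
    index=["Aland", "Borduria", "Cascadia"],  # made-up countries
)

# Rows with any NaN field are dropped; the original frame is left untouched.
cleaned = df.dropna()

print(cleaned.tail(10).to_string())
```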
Question 4: (based on the dataframe created in Question-3) 
Calculate and display which country has won the most gold medals in the summer games. (Just print the country name.) 
Question 5: (based on the dataframe created in Question-3) 
Calculate and display which country has the biggest difference between its summer and winter gold medal counts. (Just print the country name and the difference.) 
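Questions 4 and 5 both come down to finding the index label of a maximum. The sketch below uses made-up numbers and assumes that "difference" means the absolute gap, regardless of which season has more gold medals.

```python
import pandas as pd

df = pd.DataFrame(
    {"summer_gold": [10, 4, 7], "winter_gold": [2, 9, 7]},
    index=["Aland", "Borduria", "Cascadia"],  # made-up countries
)

# idxmax returns the index label (here the country) of the largest value.
top_summer = df["summer_gold"].idxmax()

# Absolute gap between the two seasons (an assumption about the wording).
gap = (df["summer_gold"] - df["winter_gold"]).abs()
widest = gap.idxmax()

print(top_summer)
print(widest, gap[widest])
```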
Question 6: (based on the dataframe created in Question-3) 
Sort the countries in descending order by their total number of medals (summer and winter), and display the first and last 5 rows of the dataframe (including a column showing the total number of medals). 
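Because Question 2 removes the summer_total and winter_total columns, one option is to rebuild the overall total from the individual medal columns. The sketch below does this on invented data; the column name total_medals is an assumption, not something the assignment prescribes.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "summer_gold": [1, 5, 0], "summer_silver": [0, 2, 1], "summer_bronze": [2, 1, 0],
        "winter_gold": [0, 3, 0], "winter_silver": [1, 0, 0], "winter_bronze": [0, 1, 1],
    },
    index=["Aland", "Borduria", "Cascadia"],  # made-up countries
)

# Sum every gold/silver/bronze column across both seasons.
medal_cols = [c for c in df.columns if c.endswith(("_gold", "_silver", "_bronze"))]
df["total_medals"] = df[medal_cols].sum(axis=1)

# Highest totals first.
ranked = df.sort_values("total_medals", ascending=False)

print(ranked.head(5).to_string())
print(ranked.tail(5).to_string())
```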
Question 7: (based on the dataframe created in Question-3) 
Plot a bar chart of the top 10 countries ordered by total number of medals (summer and winter). Use a stacked bar chart that shows, for each country, the total medals for the winter and summer games. 
See example chart below: 
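One way to produce this kind of stacked bar chart with the pandas plotting API is sketched here on made-up totals. The Agg backend is selected only so the sketch runs without a display; for an interactive window, that line would be dropped and plt.show() called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

totals = pd.DataFrame(
    {"summer_total": [30, 12, 20], "winter_total": [5, 25, 1]},
    index=["Aland", "Borduria", "Cascadia"],  # made-up countries
)

# Order countries by combined total and keep at most the top 10.
order = totals.sum(axis=1).sort_values(ascending=False).index
top = totals.loc[order].head(10)

# stacked=True places the winter segment on top of the summer segment.
ax = top.plot.bar(stacked=True)
ax.set_ylabel("Total medals")
plt.tight_layout()
```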
 
 
Question 8: (based on the dataframe created in Question-3) 
Plot a bar chart of the countries (United States, Australia, Great Britain, Japan, New Zealand). For each country, show the gold, silver and bronze medals for the winter games. See the example chart below: 
 
 
Question 9: (based on the dataframe created in Question-3) 
Assume that the countries are ranked using a new scheme for the summer games. In this scheme, Gold, Silver, and Bronze medals are worth 5, 3, and 1 points respectively, and countries are ranked by points earned per participation (total points divided by the total number of participations in the games). Based on this scheme, rank the countries and print the top 5 countries with the best rate (points per participation). Print both the country names and the rates. 
 
Example: Imagine that a country has 1 gold medal and 1 silver medal (5 + 3 = 8 points in total), and that it has participated in the summer games 10 times. Its rate of points per participation is 8 / 10 = 0.8. (If the number of participations is 0, the rate should be 0.)
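The scheme above, including the zero-participation rule, can be sketched on invented data as follows. The first row reproduces the worked example (1 gold, 1 silver, 10 participations).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "summer_participation": [10, 0, 4],
        "summer_gold": [1, 0, 2],
        "summer_silver": [1, 0, 0],
        "summer_bronze": [0, 0, 2],
    },
    index=["Aland", "Borduria", "Cascadia"],  # made-up countries
)

# 5 points per gold, 3 per silver, 1 per bronze.
points = 5 * df["summer_gold"] + 3 * df["summer_silver"] + df["summer_bronze"]

# Mask zero participations as NaN so the division cannot blow up,
# then define the rate for those rows as 0.
participations = df["summer_participation"].where(df["summer_participation"] != 0)
rate = (points / participations).fillna(0)

top5 = rate.sort_values(ascending=False).head(5)
print(top5.to_string())
```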
Question 10: (based on the dataframe created in Question-3) 
Based on the ranking scheme in Question 9, also calculate the points per participation for each country in the Winter Games. Next, plot a scatter chart with x = "points per participation for summer games" and y = "points per participation for winter games". Here is an example of such a chart: 
 
However, you also need to colour the bubbles by continent (e.g., Asia red, Africa blue, ...). You may use the Country-Continent dataset to colour the countries, and use a default colour (gray) for countries that are not listed in the dataset. Your chart must also have a legend, and labels showing the country names beside the points inside the chart.
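A rough sketch of such a scatter chart, using only pandas and matplotlib. The rates, the continent lookup and the palette are all invented for illustration; the real Country-Continent dataset may use different column names and continent labels. The Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

rates = pd.DataFrame(
    {"summer_rate": [0.8, 3.0, 1.5], "winter_rate": [0.2, 1.0, 0.0]},
    index=["Aland", "Borduria", "Cascadia"],  # made-up countries
)
# Hypothetical lookup; Cascadia is deliberately missing from it.
continent = pd.Series({"Aland": "Europe", "Borduria": "Asia"})
palette = {"Europe": "green", "Asia": "red"}

# Countries absent from the lookup fall into an "Unknown" group.
rates["continent"] = continent.reindex(rates.index).fillna("Unknown")

fig, ax = plt.subplots()
for name, group in rates.groupby("continent"):
    # One scatter call per continent gives one legend entry per colour.
    ax.scatter(group["summer_rate"], group["winter_rate"],
               color=palette.get(name, "gray"), label=name)

# Write each country name beside its point.
for country, row in rates.iterrows():
    ax.annotate(country, (row["summer_rate"], row["winter_rate"]))

ax.set_xlabel("Points per participation (summer games)")
ax.set_ylabel("Points per participation (winter games)")
ax.legend()
```

Plotting one group at a time is a design choice that keeps the legend simple; passing a list of colours to a single scatter call would also work but would need a hand-built legend.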
 
Submission Guideline:
Due Date: Friday the 25th of October 2019 23:59
Submit your script named "YOUR_ZID.py" (e.g., z2123232.py), which contains your code. 
Define a separate function for each question and name them question_1(), question_2(), etc. Here is an example:
import pandas as pd
... 
def question_1():
   print("----------Question 1 ---------")
   ... 
def question_2():
   print("----------Question 2 ---------")
   ...
... 
def question_10():
   print("----------Question 10 ---------")
   ...
if __name__ == "__main__": 
   question_1()
   question_2()
   ...
   question_10()
    
You can download the code template: https://raw.githubusercontent.com/mysilver/COMP9321-Data-Services/master/assignments/z1111111.py
If you do not follow this structure, you will be penalised.
 
FAQ:
Can I pass variables to functions? 
YES
Shall I skip the first row (team, summer games) in the datasets? 
YES
Can we create our own functions besides the question functions (e.g., question_1)? 
YES
Can I call another function inside the question functions? e.g., calling question_1 inside question_2 
YES
What should I do if my charts are not shown automatically? 
Look at the lab sample code; if you still need help, ask your tutor during the tutorials.
How should I print my dataframe? 
Use: 
print(df.to_string())
Is it okay that the graph for Q8 does not pop up until the graph for Q7 is closed or should they both pop up at the same time? 
This is fine
In Question 8, does the order of countries in the plot matter? i.e. should country 1 always be United States, then Australia, Great Britain, and so on or can we plot in alphabetical order of country names. So, country 1 = Australia, country 2 = GB, etc 
No, the order is not important
Do the charts in q8 and q9 need to look the same (colors, legend position, grid) as the examples shown? or would it be fine to just use the default plotting from pandas? 
The default colours/fonts are fine
How are our submissions marked? 
They are marked manually by tutors, by running the following command: python3 z{YOUR_ZID}.py
What python packages can I use in my assignment? 
You can only use pandas and matplotlib to do the assignment.
What version of python should I use? 
Python 3+
How can I submit my assignment? 
Go to the assignment page, click on the "Make Submission" tab, and pick your files, which must be named "YOUR_ZID.py". Make sure that the files are not empty, and submit the files together.
When is Assignment 1 due? 
The assignment is due on Friday the 25th of October 2019 23:59
Can I submit my file after the deadline? 
Yes, you can, but 20% of your assignment will be deducted as a late penalty per day. In other words, if you are more than 4 days late, you will not be marked.
 
Plagiarism
This is an individual assignment. The work you submit must be your own work. Submission of work partially or completely derived from any other person or jointly written with any other person is not permitted. The penalties for such an offence may include negative marks, automatic failure of the course and possibly other academic discipline. Assignment submissions will be examined manually.
Do not provide or show your assignment work to any other person, apart from the teaching staff of this course. If you knowingly provide or show your assignment work to another person for any reason, and work derived from it is submitted, you may be penalized, even if the work was submitted without your knowledge or consent. Note that it is also your duty to protect your code artifacts. If you are using any online solution to store your code artifacts (e.g., GitHub), make sure to keep the repository private and do not share access with anyone.
Reminder: Plagiarism is defined as using the words or ideas of others and presenting them as your own. UNSW and CSE treat plagiarism as academic misconduct, which means that it carries penalties as severe as being excluded from further study at UNSW. There are several on-line sources to help you understand what plagiarism is and how it is dealt with at UNSW:
Plagiarism and Academic Integrity
UNSW Plagiarism Procedure
Make sure that you read and understand these. Ignorance is not accepted as an excuse for plagiarism. In particular, you are also responsible for ensuring that your assignment files are not accessible by anyone but you by setting the correct permissions in your CSE directory and code repository, if using one (e.g., Github and similar). Note also that plagiarism includes paying or asking another person to do a piece of work for you and then submitting it as your own work.
UNSW has an ongoing commitment to fostering a culture of learning informed by academic integrity. All UNSW staff and students have a responsibility to adhere to this principle of academic integrity. Plagiarism undermines academic integrity and is not tolerated at UNSW.
 