

Sling Media Data Analysis Challenge

This analysis challenge looks at education-related data from various countries and analyzes how it
relates to the performance of participants from those countries in the International Mathematical Olympiad (IMO).
Your Task
Perform the data analysis, modeling and visualization required to answer the following questions:
1. Attempt to predict the mean IMO rank of a country using the education statistics given.
Are you getting a meaningful, generalizable fit and reasonable accuracy?

2. Instead of predicting the mean rank values directly, try predicting the range of ranks,
e.g. whether the mean rank of a country is less than 10, between 10 and 30, and so on.
How do the goodness of fit and the accuracy measures look when you predict these levels/ranges?

3. Does the prediction performance change when you change the splitting points of the mean rank variable
used to create the output levels? Is there some principled way to arrive at an optimal split?

4. Which variables have a strong association with the IMO performance of a country?
From a quick look, do these associations seem to have real-world significance or reflect causal relationships?
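Questions 2 and 3 turn the continuous mean rank into a small number of levels, and the choice of cut points decides how balanced those levels are. The following is a minimal sketch of that step only, not a prescribed method. The rank values are made up for illustration (they are not taken from imo.csv), the helper name `to_level` is hypothetical, and only the Python standard library is used so the snippet stays self-contained; in a notebook a data-frame library would be a more typical choice.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# Made-up mean ranks for illustration only (NOT taken from imo.csv).
mean_ranks = [3.5, 8.0, 12.25, 19.0, 27.5, 33.0, 41.0, 58.5, 72.0, 95.0]


def to_level(rank, cut_points):
    """Return the index of the range that `rank` falls into.

    With cut_points [10, 30]:
    rank < 10 -> 0, 10 <= rank < 30 -> 1, rank >= 30 -> 2.
    """
    return bisect_right(cut_points, rank)


# Option A: the hand-picked split points mentioned in question 2.
fixed_cuts = [10, 30]

# Option B (an assumption, one of many possible choices): tertile
# boundaries computed from the data, giving roughly equal-sized levels.
tertile_cuts = quantiles(mean_ranks, n=3)

fixed_levels = [to_level(r, fixed_cuts) for r in mean_ranks]
tertile_levels = [to_level(r, tertile_cuts) for r in mean_ranks]

# Compare how many countries land in each level under each split.
print("fixed cuts  :", fixed_cuts, sorted(Counter(fixed_levels).items()))
print("tertile cuts:", tertile_cuts, sorted(Counter(tertile_levels).items()))
```

Comparing the level counts under different cut points is a quick check before modeling: a heavily unbalanced split can make plain accuracy look good even when the classifier only predicts the largest level.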
Data
Here are the data sets included in the zip file data.zip:
1. edustats.csv
These are education statistics of countries, adapted from data publicly available from the World Bank.
Each indicator from the original data is averaged over the years 2000 to 2014.
Columns:
• Country.Name - Commonly used name of the country
• Indicator.Code - Education indicator code in the World Bank data
• Value - 15-year average value for the indicator
2. edustats_indicators.csv
This is a mapping from the indicator code in edustats.csv to its description.
Columns:
• Indicator.Code - Education indicator code in the World Bank data
• Indicator.Name - Short description of the indicator
3. imo.csv
This file contains the mean rank of each country in the Mathematics Olympiad. IMO ranks were
obtained from https://www.imo-official.org/results.aspx, and the mean rank for each country for the
period 2010 to 2017 was calculated.
Columns:
• Country.Name - Name of the country
• Mean.Rank - 8-year mean rank of the country
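Because edustats.csv stores one row per country and indicator, it has to be reshaped into one row per country before it can be joined with imo.csv on Country.Name. The sketch below shows one way to do that under stated assumptions: the inline CSV text stands in for the real files, the country names, indicator codes (`IND.A`, `IND.B`) and values are invented placeholders, and the helper names `long_to_wide` and `join_with_ranks` are hypothetical. It uses only the standard library; a data-frame pivot and merge would do the same job in a notebook.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up stand-ins for edustats.csv and imo.csv.
# Country names, indicator codes and values are placeholders, NOT real data.
EDUSTATS_CSV = """Country.Name,Indicator.Code,Value
Aland,IND.A,98.5
Aland,IND.B,4.2
Borduria,IND.A,91.0
Borduria,IND.B,3.1
Cascadia,IND.A,88.7
"""

IMO_CSV = """Country.Name,Mean.Rank
Aland,12.5
Borduria,47.0
Dorne,60.25
"""


def long_to_wide(rows):
    """Pivot (country, indicator, value) rows into {country: {indicator: value}}."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row["Country.Name"]][row["Indicator.Code"]] = float(row["Value"])
    return dict(wide)


def join_with_ranks(wide, rank_rows):
    """Inner join on Country.Name; countries missing from either file are dropped."""
    joined = {}
    for row in rank_rows:
        name = row["Country.Name"]
        if name in wide:
            joined[name] = {**wide[name], "Mean.Rank": float(row["Mean.Rank"])}
    return joined


wide = long_to_wide(csv.DictReader(io.StringIO(EDUSTATS_CSV)))
table = join_with_ranks(wide, csv.DictReader(io.StringIO(IMO_CSV)))

for country, features in sorted(table.items()):
    print(country, features)
```

Two things are worth checking after such a join: countries that silently drop out because their names are spelled differently in the two files, and indicators that are missing for some countries (here the placeholder country Cascadia has no `IND.B` value).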
Additional Instructions

• The analysis needs to be performed using either R or Python. Please share your work as a single
notebook (html export) that includes all code, outputs and descriptions.

• Include detailed descriptions of your observations and conclusions. We would like the descriptive
parts of the solution to reflect your critical thinking, understanding of the limitations of the
methods, awareness of the pitfalls involved, and possible extensions beyond what you attempted.

• Digging deep into the data used for this problem may take a considerable amount of time and effort.
We do not expect you to spend more than 4 or 5 hours on this.
Within this time, most people can do only a quick first pass on the data.
To manage your time, you may choose to stop the analysis steps at appropriate points and
describe further directions one could explore.

• When you need to make a decision, and the instructions here don’t clearly specify how, feel free
to use your judgement. Just state any assumption you are making.

How to Submit?
Send your solution as an email attachment to and
, with the subject “Sling Media Data Analysis Challenge”.
Remember to mention your full name.

Hope you will enjoy this exploration.
Good luck!
