Postgraduate coursework, Information School
1. Introduction
This part of the assessment for INF6027 Introduction to Data Science comprises a piece of individual coursework to assess your
ability to analyse data using R/RStudio and to then communicate your findings. Given a dataset (see Section 2), you should identify
a specific problem or topic you would like to investigate (e.g., where or when particular types of crime occur). You will then need to
pre-process and analyse the dataset to identify patterns and relationships that address your selected problem/topic. This should
involve using techniques learned throughout the practical sessions that will help you to demonstrate your R skills, such as statistical
modelling or data visualisation, to highlight and illustrate particular aspects of the data you want to communicate (e.g., particular
patterns or trends).
You should write a 1,500 word structured report (see Section 3) that describes the approach you have taken to explore and
analyse the data for the selected problem/topic. Your report should clearly communicate the results of your data analysis and be
written in a way that helps the reader interpret your findings. Charts, tables, and appendices are not included in the word count.
This assessment is worth 50% of the overall module mark for INF6027. A pass mark of 50 is required in all components to pass the
module as a whole. Submission deadline: 10am Monday 22nd January 2018 (Week 14) via Turnitin. See Section 4 for more
general information about Coursework Submission Requirements within the Information School.
2. The UK Police Dataset
The dataset to be used in this assessment is the UK Police Dataset. A description of the data, including an explanation of how to
download it, is available at http://data.police.uk/about/. The dataset describes crimes reported to UK
police during each month in different areas of the UK. Information in the dataset includes the following: geographical location
(longitude and latitude), date (month, year), LSOA code (i.e., the census area), and type of crime (e.g., vehicle crime, burglary). You
can select any data from the UK Police Dataset. (This may require multiple downloads.) You can also combine the dataset with
other data sources if you want (e.g., census data), although this is not mandatory as the emphasis of the coursework is on how you
carry out your analysis in R and communicate your findings.
Examples of possible analysis include, but are not restricted to, the following:
• Evolution of crimes in an area over time;
• Trends and predictions of crimes and crime rates;
• Analysis of certain types of crime (e.g., vehicle crimes);
• Comparisons of crime types in a region;
• Normalisation and integration with other datasets (e.g., LSOA census statistics);
• Focus on a certain census dimension (e.g., age of residents in the area);
• Visualisation of the data (e.g., on a map).
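As a rough illustration of the first example above (evolution of crimes over time), the following R sketch counts crimes per month and crime type. The data frame is a tiny invented sample, and the column names `Month` and `Crime.type` are assumptions about how the columns of a downloaded CSV file appear after `read.csv()`; check them against your own download.

```r
# Toy sample standing in for a downloaded street-level crime file.
# Values are invented for illustration only; column names are assumed.
crimes <- data.frame(
  Month = c("2017-01", "2017-01", "2017-02", "2017-02", "2017-02"),
  Crime.type = c("Burglary", "Vehicle crime", "Burglary", "Burglary", "Vehicle crime"),
  stringsAsFactors = FALSE
)

# Count crimes for each (month, crime type) combination.
monthly_counts <- aggregate(
  count ~ Month + Crime.type,
  data = transform(crimes, count = 1),
  FUN = sum
)

# Order by month so the table reads as a time series.
monthly_counts <- monthly_counts[order(monthly_counts$Month, monthly_counts$Crime.type), ]
print(monthly_counts)
```

A table like `monthly_counts` could then feed a line chart or a simple trend model, depending on the problem/topic you select.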
As mentioned previously, you should select a specific problem/topic related to the data. You should then investigate the dataset
using R and RStudio and write up your findings into a report. You should also provide your R code as an appendix. You will be
evaluated on your ability to use R and RStudio to process and analyse the data, and aid the communication of your findings for your
given problem/topic. The minimum requirement to pass is to perform at least one type of data analysis and include at least one
visualisation (e.g., a chart or map) in the report. To obtain a higher mark and more effectively communicate your findings, you may
decide to use more than one dataset or present more than one type of data analysis and/or use multiple visualisations.
3. Report structure
You are required to produce a structured report that includes the sections detailed in Table 1. Overall, 90 marks will be awarded
based on the content of your report. In addition, 10 marks will be awarded based on the presentation of the report and how well you
communicate your findings. You must state the word count somewhere in the report (or the coversheet). As there is a word count
limit you should aim to make your writing as concise and informative as possible. Also note that your work will be assessed taking
into account the word limit; therefore, we are not expecting detailed multiple analyses in the report; rather the emphasis should be on
the clarity, accuracy and quality in communicating your findings.
Note: you may want to perform an exploratory analysis, or play with the data to identify a particular problem/topic to focus on first. This could then be followed by
further analysis and exploration of the dataset focused on your selected problem/topic.
Table 1. Report sections and maximum allocated marks
Section | Description | Maximum allocated marks
Structured abstract | This should provide a summary of your report in a structured manner. This is not included in the word count. | Required, but 0 marks
Table of contents | This should include section titles and page numbers. This is not included in the word count. | Required, but 0 marks
Problem definition | This section should briefly describe your selected problem or topic addressed in the report and that forms the focus for your data analysis. You should state why you chose this problem/topic and why you think it is an important topic to consider in this dataset. | 5 marks
Data description | This section should provide a brief description of the datasets used in your report. You should describe the way in which you sampled/filtered the data and why this sample is relevant to the selected problem you previously introduced. You should list all the UK Police datasets used (e.g., data covering different regions or time periods). You should also list any additional external datasets used (e.g., shape files or census statistics for LSOA areas). Describe all datasets used, any pre-processing and how they were joined together (e.g., over LSOA area identifiers). | 5 marks
Chosen techniques | In this section you should provide a brief description of the techniques you used to analyse and visualise the data. Try to justify your choices. References to relevant literature should be provided where appropriate. | 15 marks
Results and discussion | In this section you should present the results of your data analysis and exploration (e.g., statistics, maps, trends, predictions). You should use the results to address the selected problem by presenting and discussing tables and charts as appropriate. You should present your findings in a way that helps the reader interpret the results. You should focus on effectively communicating the results of the analysis to the reader by highlighting the trends or patterns you have observed during your data analysis. | 55 marks
Conclusion | In this section you should summarise the main findings of your analysis and lessons learned. You should state the main message the reader should come away with from your analysis. | 10 marks
Appendix | Include your full R code as an appendix. The code will not be assessed. | Required, but 0 marks
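The Data description section asks how datasets were joined together (e.g., over LSOA area identifiers). A minimal sketch of that step in base R, followed by normalising counts by population, might look like the following. Both data frames are invented toy samples, and the column names (`LSOA.code`, `population`) and the LSOA codes are placeholders rather than real values.

```r
# Toy crime records: one row per reported crime (invented values).
crimes <- data.frame(
  LSOA.code = c("A001", "A001", "A001", "A002"),
  Crime.type = c("Burglary", "Burglary", "Vehicle crime", "Burglary"),
  stringsAsFactors = FALSE
)

# Toy census table: one row per LSOA (invented values).
census <- data.frame(
  LSOA.code = c("A001", "A002", "A003"),
  population = c(1500, 2000, 1800),
  stringsAsFactors = FALSE
)

# Count crimes per LSOA.
crime_counts <- as.data.frame(table(LSOA.code = crimes$LSOA.code),
                              responseName = "crimes",
                              stringsAsFactors = FALSE)

# Left join from the census side so areas with no recorded crime are kept.
joined <- merge(census, crime_counts, by = "LSOA.code", all.x = TRUE)
joined$crimes[is.na(joined$crimes)] <- 0

# Crimes per 1,000 residents.
joined$rate_per_1000 <- 1000 * joined$crimes / joined$population
print(joined)
```

Joining from the census side is a design choice: it keeps areas with no recorded crime in the result, so that they appear with a rate of zero instead of silently dropping out.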
4. Information School Coursework Submission Requirements
It is the student’s responsibility to ensure no aspect of their work is plagiarised or the result of other unfair means. The University’s
and Information School’s advice on unfair means can be found in your Student Handbook.
Your assignment has a word count limit. A deduction of 3 marks will be applied for coursework that is 5% or more above or below the
word count as specified above or that does not state the word count.
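To make the 5% rule concrete for the 1,500 word report: 5% of 1,500 is 75 words, so, reading "5% or more" as inclusive, word counts of 1,425 or fewer and 1,575 or more would attract the deduction. The short R sketch below only illustrates that arithmetic. The function name is hypothetical and the inclusive reading of the boundary is an assumption; the School's own guidance takes precedence.

```r
# Hypothetical helper: TRUE if a word count falls in the penalised range.
# Assumes "5% or more above or below" includes the boundary values.
word_count_penalised <- function(words, limit = 1500, tolerance_pct = 5) {
  # Integer arithmetic avoids floating-point surprises at the boundary.
  abs(words - limit) * 100 >= tolerance_pct * limit
}

word_count_penalised(1500)  # FALSE
word_count_penalised(1575)  # TRUE
```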
It is your responsibility to ensure your coursework is correctly submitted before the deadline. It is highly recommended that you
submit well before the deadline. Coursework submitted after 10am on the stated submission date will incur a deduction of 5% of
the mark awarded for each working day after the submission date/time, up to a maximum of 5 working days, where ‘working day’
includes Monday to Friday (excluding public holidays) and runs from 10am to 10am. Coursework submitted after the maximum
period will receive zero marks.
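As an illustration of the late-submission arithmetic, the R sketch below assumes that each late working day removes 5% of the originally awarded mark (so the deductions add up linearly), and that the number of working days late is counted in the 10am-to-10am periods described above. The function name is hypothetical and this reading of the rule is an assumption, not an official calculation.

```r
# Hypothetical helper: mark after the late-submission deduction.
# Assumes each late working day removes 5% of the originally awarded mark.
late_mark <- function(mark, working_days_late) {
  if (working_days_late <= 0) return(mark)   # submitted on time
  if (working_days_late > 5) return(0)       # beyond the maximum period
  mark * (100 - 5 * working_days_late) / 100
}

late_mark(60, 2)  # 54
```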
Work submitted electronically, including through Turnitin, should be reviewed to ensure it appears as you intended.
Before the submission deadline, you can submit coursework to Turnitin numerous times. Each submission will overwrite the
previous submission. Only your most recent submission will be assessed. However, after the submission deadline, the coursework
can only be submitted once.
During your first Semester at the School, when submitting a piece of work through Turnitin, you will only be able to view a ‘similarity
report’ when submitting your Test Essay. You can then edit and resubmit your Test Essay. For other coursework you will not be able
to view a Turnitin ‘similarity report’. Details about the submission of work via Turnitin can be found at: http://youtu.be/C_wO9vHHheo
If you encounter any problems during the electronic submission of your coursework, you should immediately contact the module
coordinator and one of the Information School Exams Secretaries (Julie Priestley, 0114 2222839, or
Larah Hogg, 0114 2222640). This does not negate your responsibilities to submit your coursework on
time and correctly.