CSc 110 Assignment 9:
File I/O
Learning Outcomes:
When you have completed this assignment, you should understand and have gotten practice with:
• Reading from a file.
• Populating dictionaries and looking up values in dictionaries
• Reading documentation
• Designing, calling and writing helper functions
Getting started
1. Download the starter file provided called assignment9.py and open it in your Wing editor
2. Download the provided input csv files to the same directory as your assignment9.py
3. We cannot upload an empty file to BrightSpace so you will need to create an empty file in the same directory
as your assignment9.py and name it EmptyFile.csv
4. Design the functions according to the specifications in the documentation provided in the starter file and using
the further explanation given in the Background Information section below.
Submission
1. Double check your file before submission to avoid a zero grade for your submission for issues described in
the Grading section of this document:
a. Open and run assignment9.py in your Wing editor using the green arrow.
You should see no errors and no output after the shell prompt.
If there are errors, you must fix them before submitting.
If there is output printed to the shell, find and remove any top-level print statements and/or top level
function calls.
b. At the shell prompt in Wing (>>>) type the following: import assignment9
You should see no errors and no output after the shell prompt.
If there are errors, you must fix them before submitting.
If there is output printed to the shell, find and remove any top-level print statements and/or top level
function calls.
c. At the shell prompt in Wing (>>>), make calls to the required functions to ensure you have named
them correctly. Ensure the function is exhibiting the expected behaviour when called and does not
contain any unexpected output. You should be able to make calls to the following functions from
your shell prompt:
i. to_date
ii. file_to_dicts
iii. get_youtube_ids
d. Ensure you have closed any files that you have opened!
2. Upload your assignment9.py containing the completed function designs to BrightSpace
Reminder: Your code is to be designed and written by only you and not to be shared with anyone else. See the
Course Outline for details explaining the policies on Academic Integrity. Submissions that violate the Academic
Integrity policy will be forwarded directly to the Computer Science Academic Integrity Committee.
Grading:
• Late submissions will be given a zero grade.
• The file you submit must be named assignment9.py
The filenames must be EXACT for them to work with our grading scripts. Errors in the filenames will
result in a zero grade.
Example mistakes often made by students include but are not limited to: spelling errors, different case
letters, space characters and incorrect extension (not using .py)
• Your function names and the order of your function arguments must match EXACTLY as specified in
this document or you will be given a zero grade. Use the example tests we give you to ensure your
function header is correct.
• Your submission must not contain any print statements that are not required in the specification or any
top-level calls to functions. This unexpected code can cause the automated tester to crash and will result
in a zero grade.
• We will do spot-check grading in this course. That is, all submissions are graded BUT only a subset of
your code might be graded. You will not know which portions of the code will be graded, so all of your
code must be complete and adhere to specifications to receive marks.
• Your code must run without errors with Python 3. If you tested your configuration with the setup.py file, this
would have verified you are using Python 3. Code that generates errors cannot be tested and will be given
a zero grade.
Marks will be awarded for correctness, considering:
• the function signature matches the description given (has the name and arguments EXACTLY as
specified)
• the function has the expected behaviour
and for code quality according to software engineering properties such as:
• documentation in docstring format: type hints, purpose, examples
• Test coverage – examples within the function docstring cover all boundary cases of all conditions within
your function
• readability
o use of whitespace
o splitting complex computation or long statements across multiple lines
• meaningful variable names
o lower case, starting with an alpha-character
• proper use of constants (avoid magic numbers)
o defined above all function definitions
o in UPPERCASE
• use of code constructs (functions, variables) to:
o eliminate redundant code and redundant computation
o make complex code easier to read
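As an illustration of the code quality points above, here is a minimal sketch of a documented function. The constant SECONDS_PER_MINUTE and the function to_seconds are hypothetical and are not part of this assignment; they only show one way a constant, type hints, a purpose statement and docstring examples might be laid out.

```python
# Hypothetical constant: defined above all function definitions, in UPPERCASE.
SECONDS_PER_MINUTE = 60


def to_seconds(minutes: int, seconds: int) -> int:
    """Return the total number of seconds in the given minutes and seconds.

    to_seconds is a hypothetical function used only for illustration.

    >>> to_seconds(0, 0)
    0
    >>> to_seconds(1, 30)
    90
    """
    # Using the named constant avoids a magic number in the computation.
    total_seconds = minutes * SECONDS_PER_MINUTE + seconds
    return total_seconds
```

Note that the sketch contains no print statements and no top-level function calls, so importing it produces no output.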
Background Information:
In this assignment you will be writing functions that read data from comma-separated data files and create
dictionaries to hold this data. You will then be writing functions that work with these dictionaries to answer
questions about the data.
This is the same data used in Assignment 8: Trending YouTube Video Statistics from Canada taken from
https://www.kaggle.com/datasnaek/youtube-new
Again, the downloaded data was cleaned (removed additional commas and newlines) so that you can assume the
only commas are between columns in the data and the only newlines are at the end of a row.
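Because of this cleaning, a row can be broken into its column values by removing the trailing newline and splitting on commas. The sketch below assumes a made-up row with three columns; split_row is a hypothetical helper, and the real files have more columns than shown here.

```python
def split_row(row: str) -> list[str]:
    """Return the list of column values in row, a single line of
    comma-separated text that may end with a newline.

    split_row is a hypothetical helper used only for illustration.

    >>> split_row('abc123,Some Title,1000\\n')
    ['abc123', 'Some Title', '1000']
    """
    # Remove the newline at the end of the row, then split on the commas
    # that separate the columns.
    columns = row.rstrip('\n').split(',')
    return columns
```

Every value in the returned list is a string, so numeric columns would still need to be converted (for example with int) before being used in calculations.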
We have provided you with 2 input files:
One is small (SevenLines.csv) and is used in the testcases provided in the docstrings of both the
file_to_dicts and get_youtube_ids functions.
The other is very large (CAvideos.csv) and is used in the testcases provided in the docstrings of the
get_youtube_ids function.
These files are saved in what is called UTF-8 format. On some systems you will have to explicitly state this
encoding in your call to Python’s open function as follows:
file_handle = open(filename, 'r', encoding='utf-8')
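As a minimal sketch of that call in context, the hypothetical helper below (count_lines is not part of the assignment) opens a file with an explicit encoding, reads through it once, and closes it. It uses a with statement, which closes the file automatically when the block ends; calling file_handle.close() yourself after reading works equally well.

```python
def count_lines(filename: str) -> int:
    """Return the number of lines in the file named filename.

    count_lines is a hypothetical helper used only for illustration.
    """
    line_count = 0
    # The with statement closes file_handle when the block ends,
    # even if an error occurs while reading.
    with open(filename, 'r', encoding='utf-8') as file_handle:
        for line in file_handle:
            line_count += 1
    return line_count
```

For an empty file such as EmptyFile.csv, the loop body never runs and the function returns 0.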
Examples have been provided to allow you to test your functions and understand the problem.
Doctest has its limitations, so long lines wrap around the screen, making them challenging to read, but
we felt it important to give you tests that would work with doctest.
NOTE: when grading we will not be using these exact tests. That is, we will vary the inputs when grading.
TIPS for working on this assignment:
• open the small file and have a look at its contents in relation to the GLOBAL CONSTANTS you are
given in the Python file
• ensure you understand the 3 dictionaries you will be populating and using in this program
• ensure you understand the type alias provided (VideoStats)
• when working on the file_to_dicts function
o Recognize that this function takes a file handle (TextIO), so the file is opened by the caller of the
function; therefore file_to_dicts should NOT open the file.
We have done it this way to ensure you are opening the file once and reading through it ONE
time to populate all three dictionaries.
o Work on populating one dictionary at a time, don’t try to think about all 3 at once!
• When working on the get_youtube_ids function
o Think about the steps you need to perform before writing any code
o Write the function incrementally, one step at a time
o Where the step is complex and requires multiple lines of code, consider designing and calling a
helper function. Recall how this was done in Assignment 8 for you.
o Solutions that do not use the lookup capabilities of a dictionary but instead loop through all
key:value pairs will be significantly penalized.
o Solutions that do not break the problem down and use helper functions will be significantly
penalized.
• You are free to use the sort method of Python lists. Reminder: this method mutates the list into sorted
order. If the list is a list of tuples, the sort compares the first element of each tuple and uses later elements only to break ties.
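Several of these tips can be sketched on a much simpler, made-up problem. The example below is not a solution to this assignment: the "name,score" data format and the functions read_scores and sorted_by_score are hypothetical. It shows a function that receives an already-open file handle, a direct dictionary lookup in place of a loop over all key:value pairs, and a sort of a list of tuples.

```python
from typing import TextIO


def read_scores(file_handle: TextIO) -> dict[str, int]:
    """Return a dictionary mapping each name to its score, read from
    file_handle, where every line has the hypothetical form 'name,score'.

    The caller opens and closes the file; this function only reads from it.
    """
    name_to_score = {}
    for line in file_handle:
        columns = line.rstrip('\n').split(',')
        name_to_score[columns[0]] = int(columns[1])
    return name_to_score


def sorted_by_score(name_to_score: dict[str, int],
                    names: list[str]) -> list[tuple[int, str]]:
    """Return a sorted list of (score, name) tuples for each name in names
    that is a key in name_to_score.
    """
    pairs = []
    for name in names:
        # Direct lookup by key: no loop over every key:value pair.
        if name in name_to_score:
            pairs.append((name_to_score[name], name))
    # sort mutates pairs in place; tuples are ordered by score first,
    # and equal scores are ordered by name.
    pairs.sort()
    return pairs
```

Putting the score first in each tuple is a design choice that lets the sort method order the list by score without any extra code.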
Function Specifications:
The function specifications can be found in the documentation provided in the assignment9.py starter file.