COMP 202 - Foundations of Programming

Assignment 4: The McGill Career Fair

Fall 2025

Due: December 5, 2025 at 11:59 pm on Ed Lessons

Late Penalty: 10% per day and up to 2 late days

November 2025

Important Notice

Make sure that all file names and function names are spelled exactly as described in this document. Otherwise, a 50% penalty per question will be applied. You may make as many submissions as you like prior to the deadline, but we will only grade your final submission (all prior ones are automatically deleted). The following instructions are important:

• Please read the entire assignment guidelines and this PDF before starting. You must do this assignment individually.

• The work submitted for this assessment is expected to be your own. The use of technologies such as ChatGPT is prohibited and will be considered a violation of the Code of Student Conduct.

• You must respect the prescribed return types. For example, do not return a list if the instructions say to return None. Otherwise, no credit will be awarded for that question.

• Do not use functions that we didn’t mention in class.

• Do not use break or continue statements.

Directions:

To get full marks, you must follow all directions below:

• Make sure that all file names and function names are spelled exactly as described in this document. Otherwise, a 50% penalty per question will be applied.

• Make sure that your code runs without errors. Code with errors will receive a very low mark.

• Write your name and student ID in a comment at the top of your program.

• Name your variables appropriately. The purpose of each variable should be obvious from the name.

• Comment your code. A comment on every line is not needed, but there should be enough comments to fully understand your program.

• Where possible, declare constants as global variables. Any fixed numeric value (int or float) that can be named should be a global constant. Additionally, avoid making strings global.

• Avoid writing repetitive code, but rather call helper functions! You are welcome to add additional functions if you think this can increase the readability of your code.

• Lines of code should NOT require the TA to scroll horizontally to read the whole thing.

• Vertical spacing is also important when writing code. Separate each block of code (also within a function) with an empty line.

• Up to 30% can be removed for bad indentation of your code, omission of comments, and/or poor coding style (as discussed in class).

• You can lose up to 30% for not respecting the following requirements:

– Calling functions outside the code (it is ok to call a function inside another function but not outside)

– Not following the prescribed function names

– Using material not seen in class or using break or continue

Hints & tips

• Start early. Programming projects always take more time than you estimate!

• Do not wait until the last minute to submit your code. Submit early and often—a good rule of thumb is to submit every time you finish writing and testing a function.

• Write your code incrementally. Don’t try to write everything at once. That never works well. Start off with something small and make sure that it works, then add to it gradually, making sure that it works every step of the way.

• Read these instructions and make sure you understand them thoroughly before you start. Ask questions if anything is unclear!

• Seek help when you get stuck! Check our discussion board first to see if your question has already been asked and answered. Ask your question on the discussion board if it hasn’t been asked already. Talk to your TA during office hours if you are having difficulties with programming. Go to an instructor’s office hours if you need extra help with understanding a part of the course content.

At the same time, beware not to post anything on the discussion board that might give away any part of your solution—this would constitute plagiarism, and the consequences would be unpleasant for everyone involved. If you cannot think of a way to ask your question without giving away part of your solution, then please drop by our office hours.

• If you come to see us in office hours, please do not ask “Here is my program. What’s wrong with it?” We expect you to at least make an effort to start to debug your own code, a skill which you are meant to learn as part of this course. And as you will discover for yourself, reading through someone else’s code is a difficult process—we just don’t have the time to read through and understand even a fraction of everyone’s code in detail.

However, if you show us the work that you’ve done to narrow down the problem to a specific section of the code, why you think it doesn’t work, and what you’ve tried to fix it, it will be much easier to provide you with the specific help you require and we will be happy to do so.

Learning Objectives

Upon successful completion of this assignment, you will have demonstrated your ability to:

• Model Real-World Concepts: Represent entities like employees, jobs, and companies using Python classes, implementing essential methods like __init__ and __str__ to manage their state and behavior.

• Process and Normalize Text: Implement robust functions for string parsing and normalization, including splitting text on delimiters, trimming whitespace, and standardizing case to prepare raw data for analysis.

• Implement Data Structures: Confidently use lists and nested lists to store, traverse, filter, and aggregate collections of data, such as skills, vocabularies, and job histories.

• Translate Data for Analysis: Convert qualitative data (like lists of skills) into quantitative representations (numeric count vectors) to enable mathematical comparisons.

• Apply Mathematical Concepts in Code: Implement numerical algorithms from scratch using loops, including the dot product, Euclidean norm, and cosine similarity to measure relationships between vectors.

• Practice Modular Design: Decompose a larger problem into smaller, reusable functions and classes, organizing them into separate modules (.py files) to create a clean and maintainable codebase.

• Write Resilient Code: Build a robust CSV parser that can handle file I/O and gracefully manage errors in data formatting without crashing.

• Develop Good Programming Habits: Write clean, readable, and well-documented code that follows established style guidelines for naming, indentation, and commenting.

Recommended Order

To make this assignment more manageable, we strongly recommend you build the components in the following order. This approach is incremental, allowing you to test each piece of your system as you build it.

1. The Foundation (utils.py): Start by implementing all the functions in utils.py. These are self-contained, general-purpose tools that you will need for almost every other part of the assignment. Completing them first will give you a solid foundation and a set of reliable building blocks.

2. The Data Models (employee.py and job.py): Next, create the Employee and Job classes. These classes represent the core entities in our career fair. They are relatively simple and will depend on the helper functions you wrote in utils.py. You can test them individually to make sure they store and format data correctly.

3. The Data Loader (network.py): With your Job class complete, you can now write the CSV parser in network.py. This function will be responsible for reading the jobs.csv file and creating a list of Job objects, bridging the gap between raw data and your program’s object model.

4. The Orchestrator (company.py): Finally, implement the Company class. This is the most complex piece, as it brings everything together. It will manage lists of Employee and Job objects and use the methods and functions you’ve already built to perform complex operations like calculating similarities and making hiring decisions.

Introduction

Welcome to the McGill Career Fair. The hall is busy with students in line, recruiters answering questions, and screens showing open roles. Each booth asks visitors to scan in, list a few skills, and try short prompts that show what they can do. Your job is to keep the matching clear, fair, and easy to explain. With a small toolkit you will clean dash separated text into tokens, build a shared vocabulary, turn text into aligned count vectors, and compare those vectors with cosine similarity. You will also set up a tiny data model for Employee, Job, and Company, load a few CSVs, compute match scores, and help booth leads make consistent end of day decisions. Every step should be auditable: the same inputs produce the same outputs, and each score can be traced back to the tokens and vectors that created it.

1 Utilities first: utils.py [20 points]

Create a file named utils.py. Unless otherwise stated, your functions must not print anything. Return the requested values and round only when the specification says so. The utilities mimic the small backstage scripts used at each booth to clean lists and build vectors before any hiring decision is made.

1.1 get_list [5 points]

Inputs. s (string). A dash separated string of items.

Output. List of strings. Items lowercased, trimmed, and with empty items removed.

This function turns a dash separated string into a normalized token list used throughout the assignment. It splits on '-', strips surrounding spaces from each piece (hereafter referred to as a token), converts tokens to lowercase, and discards any empty entries created by repeated dashes or whitespace only parts.

>>> get_list('password management-IT maintenance- computer repair')

['password management', 'it maintenance', 'computer repair']

>>> get_list(' python---SQL - etl ')

['python', 'sql', 'etl']
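The normalization steps described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `split_dash_tokens` is hypothetical (it is not a name required by this assignment), and the sketch relies only on `str.split`, `str.strip`, and `str.lower`.

```python
def split_dash_tokens(text):
    """Hypothetical helper: split on '-', trim, lowercase, drop empty pieces."""
    tokens = []

    for piece in text.split('-'):
        cleaned = piece.strip().lower()

        # Repeated dashes or whitespace-only parts leave empty strings behind
        if cleaned != '':
            tokens.append(cleaned)

    return tokens


print(split_dash_tokens(' python---SQL - etl '))  # ['python', 'sql', 'etl']
```

Splitting first and cleaning each piece afterwards keeps multi-word items such as 'computer repair' intact, because only the dash acts as a delimiter.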

1.2 make_vocabulary [5 points]

Inputs. items (list of list of strings). Each inner list is a token list.

Output. vocab (list of strings). Unique tokens in lowercase.

This function builds a shared alphabet of tokens for vectorization across multiple lists. It collects all tokens from the inner lists, removes duplicates, and returns the final vocabulary in lowercase. If the input list is empty, return an empty list.

>>> make_vocabulary([['Python','SQL'],['python','etl', 'sql']])

['python', 'sql', 'etl']

>>> make_vocabulary([[], ['a', 'a'], ['b']])

['a', 'b']

>>> make_vocabulary([])

[]

1.3 vectorize [5 points]

Inputs. tokens (list of strings), vocab (list of strings).

Output. counts (list of integers). Same length as vocab, where each entry is the frequency of the corresponding token in tokens.

This function converts a token list into a bag of words count vector aligned with the given vocabulary. For each position in vocab, count how many times that vocabulary token appears in tokens and write that number at the same index in the output list. Tokens that are not in vocab are ignored, and vocabulary items that do not appear receive a zero.

Concrete example. Suppose the vocabulary is ['etl', 'excel', 'python', 'sql'] and the token list is ['sql', 'python', 'sql', 'ETL', 'python', 'python']. We align counts by the vocabulary order: etl → 1, excel → 0, python → 3, sql → 2.

The resulting count vector is [1, 0, 3, 2] because etl appears once, excel does not appear, python appears three times, and sql appears twice. Note: the count is case insensitive, so an 'ETL' token should match the vocabulary entry 'etl'. The vocabulary is guaranteed to be all lowercase.

>>> vocab = ['etl','python','sql']

>>> vectorize(['python','etl','python'], vocab)

[1, 2, 0]

>>> vectorize(['pyThon','ETl','pandas','sql','sql'], vocab)

[1, 1, 2]

>>> vectorize([], vocab)

[0, 0, 0]
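The alignment rule above can be sketched with two nested loops. This is an illustrative sketch, not a prescribed solution: the name `count_vector` is hypothetical, and it assumes, as stated above, that the vocabulary is already lowercase.

```python
def count_vector(tokens, vocab):
    """Hypothetical helper: one count per vocabulary entry, in vocab order."""
    counts = []

    for word in vocab:
        occurrences = 0

        for token in tokens:
            # Lowercase the token so that 'ETL' matches 'etl'
            if token.lower() == word:
                occurrences += 1

        counts.append(occurrences)

    return counts


tokens = ['sql', 'python', 'sql', 'ETL', 'python', 'python']
print(count_vector(tokens, ['etl', 'excel', 'python', 'sql']))  # [1, 0, 3, 2]
```

Looping over the vocabulary on the outside guarantees the output has exactly one entry per vocabulary item, so tokens outside the vocabulary are ignored without any special case.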

1.4 cosine_similarity [5 points]

Inputs. v1 (list of floats), v2 (list of floats).

Output. sim (float). Cosine similarity rounded to two decimals. Return 0.0 if either vector is all zeros.

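Definitions. Let $a = [a_1, \dots, a_n]$ and $b = [b_1, \dots, b_n]$ be lists of the same length $n$. The dot product is $a \cdot b = \sum_{i=1}^{n} a_i b_i$, the Euclidean norm is $\lVert a \rVert = \sqrt{\sum_{i=1}^{n} a_i^2}$, and the cosine similarity is $\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$.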

This function measures how aligned two vectors are, independent of scale. You are required to implement the computations using loops (no external modules except math.sqrt). Raise a ValueError if the vectors have different lengths. If either vector has a Euclidean norm of 0, return 0.0. Otherwise, return the dot product divided by the product of the Euclidean norms, rounded to two decimals.

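Concrete example. Let v1 = [1, 2, 0] and v2 = [0, 2, 2]. First compute the dot product: $1 \cdot 0 + 2 \cdot 2 + 0 \cdot 2 = 4$.

Next compute the norms with sums of squares: $\lVert v_1 \rVert = \sqrt{1^2 + 2^2 + 0^2} = \sqrt{5}$ and $\lVert v_2 \rVert = \sqrt{0^2 + 2^2 + 2^2} = \sqrt{8}$.

Then compute the cosine: $\frac{4}{\sqrt{5}\,\sqrt{8}} = \frac{4}{\sqrt{40}} \approx 0.632$,

and round to two decimals to return 0.63.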

>>> cosine_similarity([1, 2, 0], [0, 2, 1])

0.8

>>> cosine_similarity([1, 2, 0, 4], [0, 2, 1])

Traceback (most recent call last):

...

ValueError: v1 and v2 have different lengths

>>> cosine_similarity([0,0], [5,7])

0.0

>>> cosine_similarity([3,4], [3,4])

1.0
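The computation described in this section can be sketched with a single loop that accumulates the dot product and both sums of squares. This is a minimal sketch under assumptions: the name `cosine_sketch` is hypothetical, and the error message simply mirrors the one shown in the traceback above.

```python
import math


def cosine_sketch(v1, v2):
    """Hypothetical helper: cosine of two equal-length vectors, 2 decimals."""
    if len(v1) != len(v2):
        raise ValueError('v1 and v2 have different lengths')

    dot = 0
    squares_1 = 0
    squares_2 = 0

    for i in range(len(v1)):
        dot += v1[i] * v2[i]
        squares_1 += v1[i] ** 2
        squares_2 += v2[i] ** 2

    norm_1 = math.sqrt(squares_1)
    norm_2 = math.sqrt(squares_2)

    # An all-zero vector has no direction, so the similarity is defined as 0.0
    if norm_1 == 0 or norm_2 == 0:
        return 0.0

    return round(dot / (norm_1 * norm_2), 2)


print(cosine_sketch([1, 2, 0], [0, 2, 2]))  # 0.63
```

Checking the norms before dividing avoids a ZeroDivisionError for all-zero vectors.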

2 The data model: employee.py, job.py, company.py [70 points]

Write each class in its own file so that responsibilities are clear, imports are explicit, and testing is modular.

File layout and imports

Write each class in a separate file:

• employee.py contains class Employee.

• job.py contains class Job.

• company.py contains class Company.

2.1 Class Employee (in employee.py) [15 points]

An Employee represents a candidate walking the fair with a resume and a list of skills they type into a kiosk.

2.1.1 Attributes.

• name (str).

• age (int).

• gender (str) in {’m’,’f’,’x’}.

• education (str) in {"none","high school","bachelors","masters","phd"}.

• prev_jobs (str). A dash separated string of prior titles.

• skills (str). A dash separated string of skills.

• cur_job (str or None).

2.1.2 Required methods

2.1.2.1 __init__ [5 points]

Inputs. name (string), age (int), gender (string in {'m','f','x'} case insensitive), education (string in {"none","high school","bachelors","masters","phd"} case insensitive), prev_jobs (string, dash separated titles), skills (string, dash separated skills), cur_job (string or None, optional).

Output. None.

This constructor validates and normalizes a candidate profile so that downstream code can rely on consistent values. It checks that age lies in [0, 120], verifies that gender (case insensitive) is one of 'm', 'f', or 'x', and ensures that education (case insensitive) is one of the five allowed strings. It raises ValueError (with an appropriate message) when any of these constraints is violated, then sets the instance attributes. Note: You do not need to validate the type of inputs; assume they will be correct. Additionally, the order of the input parameters has to be the same as listed above (in Inputs).
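The validation and normalization checks described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the function name `check_profile`, the constant names, and the error message wording are hypothetical, and a real constructor would store the normalized values on `self` rather than return them.

```python
MIN_AGE = 0
MAX_AGE = 120


def check_profile(age, gender, education):
    """Hypothetical helper: validate, then return normalized (gender, education)."""
    genders = ['m', 'f', 'x']
    levels = ['none', 'high school', 'bachelors', 'masters', 'phd']

    if age < MIN_AGE or age > MAX_AGE:
        raise ValueError('age must be between 0 and 120')

    # Normalize case before comparing so that 'M' and 'm' are treated alike
    gender = gender.lower()
    education = education.lower()

    if gender not in genders:
        raise ValueError('gender must be one of m, f, or x')

    if education not in levels:
        raise ValueError('education level is not recognized')

    return gender, education


print(check_profile(35, 'M', 'High School'))  # ('m', 'high school')
```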

2.1.2.2 __str__ [5 points]

Inputs. self (Employee).

Output. String (str).

This method returns a human readable multi line summary of the employee that prints normalized fields and lists. You must use the get_list() function from utils to convert the dash separated strings to lists.

>>> c1 = Employee('James Bond', 35, 'M', 'high school', 'Spy',

... 'Martial artist-Knife mastery-spy-driver-card player')

>>> print(c1)

Name: James Bond

Age: 35

Gender: m

Highest education: high school

Skills: ['martial artist', 'knife mastery', 'spy', 'driver', 'card player']

Previous jobs: ['spy']

Current job: None

2.1.2.3 add_prev_job [5 points]

Inputs. self (Employee), job (string).

Output. None.

This method appends a new past job title to the employee’s dash separated history in lowercase. If the provided job is not a string, it raises a TypeError with the exact message 'Error in Class Employee: The input job should be a string'.

>>> # Continuing the example from the previous method

>>> c1.add_prev_job('Security Analyst')

>>> print(c1)

Name: James Bond

Age: 35

Gender: m

Highest education: high school

Skills: ['martial artist', 'knife mastery', 'spy', 'driver', 'card player']

Previous jobs: ['spy', 'security analyst']

Current job: None

2.2 Class Job (in job.py) [10 points]

A Job is a posting at a booth with a short description and an internal reference used on leaderboards.

Attributes.

• nb_jobs (class int) starts at 0.

• ref (int) assigned as Job.nb_jobs after increment.

• title (str).

• keywords (str) interpreted as dash separated skills required for the job.

• salary (float).

• employee (Employee or None).

2.2.1 Required methods

2.2.1.1 __init__ [5 points]

Inputs. title (string), keywords (string, dash separated skills), salary (float), employee (Employee or None, optional).

Output. None.

The constructor of the Job class takes four inputs (in addition to self) to initialize the attributes of the class, in this order: the title (string), keywords (string), salary (float), and an optional employee (Employee). The constructor will also increase the class attribute nb_jobs by 1 and then assign that new value to the instance attribute ref. If no Employee is given as input, the value of the employee attribute is None. The constructor raises a ValueError if the input salary is a negative number. Note: You do not need to validate the type of inputs; assume they will be correct. Additionally, the order of the input parameters has to be the same as listed above (in Inputs).

2.2.1.2 __str__ [5 points]

Inputs. self (Job).

Output. String (str).

This method returns a string for the object of the class Job.

>>> description = 'optimize - fraud - detection - software - poker'

>>> job1 = Job('Fraud Analytics Manager', description, 120000)

>>> print(job1)

Reference: 1

Title: Fraud Analytics Manager

Keywords: ['optimize', 'fraud', 'detection', 'software', 'poker']

Salary: 120000

Employee: None

>>> empl = Employee('James', 0, 'M', 'high school', 'Spy', 'Martial artist')

>>> job2 = Job('Poker Player', description, 5000, empl)

>>> print(job2)

Reference: 2

Title: Poker Player

Keywords: ['optimize', 'fraud', 'detection', 'software', 'poker']

Salary: 5000

Employee: James
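The reference numbering shown above (1, then 2) follows from a counter stored on the class rather than on each instance. The sketch below illustrates that pattern on a deliberately unrelated, hypothetical class named `Ticket`; the attribute names `nb_tickets` and `label` are assumptions chosen for this example only.

```python
class Ticket:
    """Hypothetical class showing a shared counter used to number instances."""

    # Class attribute: one value shared by every Ticket
    nb_tickets = 0

    def __init__(self, label):
        # Update the counter through the class name, then copy it to the instance
        Ticket.nb_tickets += 1
        self.ref = Ticket.nb_tickets
        self.label = label


first = Ticket('coat check')
second = Ticket('parking')
print(first.ref, second.ref)
```

Writing `Ticket.nb_tickets += 1` rather than `self.nb_tickets += 1` matters: the latter would create a separate instance attribute and leave the shared counter unchanged.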

2.3 Class Company (in company.py) [45 points]

Note. We recommend completing Section 3 before implementing this class. The jobs.csv used in the examples below is the same file listed in Section 3.

A Company owns a booth and keeps track of people hired during the fair and roles that are still open.

Attributes.

• name (str).

• location (str).

• employees (list[Employee]).

• job_csv (str).

• jobs (list[Job]), assigned after parsing job_csv using create_job_list from network.py.

2.3.1 Required methods

2.3.1.1 __init__ [5 points]

Inputs. name (string), location (string), employees (list of Employee), jobs_csv (string or None, optional).

Output. None.

This constructor creates a company, stores its location and initial employees, and optionally loads open jobs from a CSV by calling create_job_list from network.py. The loaded jobs should be stored in an instance attribute (self.jobs) as a list. Note: You do not need to validate the type of inputs; assume they will be correct. Additionally, the order of the input parameters has to be the same as listed above (in Inputs).

>>> comp = Company("nova cafe", "toronto", [], "jobs.csv")

2.3.1.2 __str__ [5 points]

Inputs. self (Company).

Output. String (str).

This method returns a multi line summary that includes the company name, location, and counts of employees and open jobs for quick booth status boards.

>>> comp = Company("Happy Tails Academy", "Montreal", [], "jobs.csv")

>>> print(comp)

Name: Happy Tails Academy

Location: Montreal

Number of employees: 0

Number of available jobs: 3

2.3.1.3 skills_similarity [7 points]

Inputs. self (Company), job (Job), employee (Employee).

Output. skill_sim (float).

Tokenize (convert to list) the employee’s skills and the job’s keywords as lowercase items with empty tokens removed. Build a shared vocabulary as the sorted union of both token sets. Turn each skill list (employee and job) into a count vector whose i-th entry is how many times the i-th vocabulary token appears on that side, then take the cosine between these two count vectors.

Example. If the employee’s skills are python-sql-etl-sql and the job description is sql-excel-dashboard, the cleaned tokens are [python, sql, etl, sql] and [sql, excel, dashboard]. The shared vocabulary is V = [dashboard, etl, excel, python, sql]. The aligned vectors are $x_{emp} = [0, 1, 0, 1, 2]$ and $x_{job} = [1, 0, 1, 0, 1]$. The similarity is the cosine of these two vectors, here $\frac{2}{\sqrt{6}\,\sqrt{3}} \approx 0.47$.

>>> comp = Company("Happy Tails Academy", "Montreal", [], "jobs.csv")

>>> job = comp.jobs[1]

>>> empl = Employee("James", 0, "M", "high school", "Spy", "pet-trainer")

>>> comp.jobs[1].employee = empl

>>> empl2 = Employee("Martin", 0, "M", "bachelors", "vet", "pet-veterenarian")

>>> print(comp.skills_similarity(job, empl2))

0.35

2.3.1.4 education_similarity [8 points]

Inputs. self (Company), employee (Employee).

Output. education_sim (float).

Use a one-hot domain (see note below): none, high school, bachelors, masters, phd. Represent the employee as a one-hot vector in this domain. Represent the company as a vector of counts of pre-existing employees over the same five levels. To reflect “at least the same education,” set to zero all entries in the company’s profile that are below the employee’s level, then take the cosine between the two vectors.

Example. If the employee’s level is masters then the employee vector is [0, 0, 0, 1, 0], and assuming the company’s employee education counts are [1, 4, 6, 3, 1] for [none, high school, bachelors, masters, phd], then the adjusted company vector is [0, 0, 0, 3, 1]. The similarity can therefore be computed as $\frac{3}{\sqrt{1}\,\sqrt{3^2 + 1^2}} = \frac{3}{\sqrt{10}} \approx 0.95$.

>>> comp = Company("Happy Tails Academy", "Montreal", [], "jobs.csv")

>>> job = comp.jobs[0]

>>> empl = Employee("James", 0, "M", "high school", "Spy", "pet-trainer")

>>> comp.jobs[1].employee = empl

>>> empl = Employee("Jamy", 45, "M", "phd", "Spy", "pet-trainer")

>>> comp.jobs[0].employee = empl

>>> empl2 = Employee("Martin", 0, "M", "high school", "vet", "pet-veterenarian")

>>> print(comp.education_similarity(empl2))

0.71

One-hot vectors (quick note). A one-hot vector encodes a single category from a fixed, ordered domain as a vector of zeros with exactly one entry equal to 1. Its length equals the size of the domain, and the position of the 1 is the index of the category in that domain. Keep the domain order fixed and identical wherever you compare vectors.

Examples.

• Domain [m, f, x]: category f → [0, 1, 0].

• Domain [none, high school, bachelors, masters, phd]: category bachelors → [0, 0, 1, 0, 0].
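The one-hot encoding and the "at least the same education" adjustment can be sketched as below, reusing the numbers from the masters example earlier in this section. The names `LEVELS`, `one_hot`, and `mask_below` are hypothetical, and `list.index` is used only for brevity in this sketch.

```python
import math

LEVELS = ['none', 'high school', 'bachelors', 'masters', 'phd']


def one_hot(level):
    """Hypothetical helper: 1 at the position of level in LEVELS, 0 elsewhere."""
    vector = []

    for name in LEVELS:
        if name == level:
            vector.append(1)
        else:
            vector.append(0)

    return vector


def mask_below(counts, level):
    """Hypothetical helper: zero every count below the given level."""
    cutoff = LEVELS.index(level)
    masked = []

    for i in range(len(counts)):
        if i < cutoff:
            masked.append(0)
        else:
            masked.append(counts[i])

    return masked


employee_vector = one_hot('masters')                   # [0, 0, 0, 1, 0]
company_vector = mask_below([1, 4, 6, 3, 1], 'masters')  # [0, 0, 0, 3, 1]

# Cosine of the two vectors: 3 / (sqrt(1) * sqrt(10))
similarity = 3 / (math.sqrt(1) * math.sqrt(3 ** 2 + 1 ** 2))
print(round(similarity, 2))  # 0.95
```

Keeping the domain in one shared list means both vectors are always built in the same order, which is what makes comparing them position by position meaningful.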

2.3.1.5 estimate_hire_success [3 points]

Inputs. self (Company), job (Job), employee (Employee).

Output. score (float).

If the job already has an assigned employee, return 0.0. Otherwise, use the previous two methods to compute $s_{skills}$ and $s_{education}$. Combine them using a weighted average (shown below) and return the result rounded to two decimals.

$\text{score} = 0.8 \times s_{skills} + 0.2 \times s_{education}$

>>> comp = Company("Happy Tails Academy", "Montreal", [], "jobs.csv")

>>> job = comp.jobs[1]

>>> empl = Employee("James", 0, "M", "high school", "Spy", "pet-trainer")

>>> comp.jobs[0].employee = empl

>>> empl2 = Employee("Martin", 0, "M", "bachelors", "vet", "pet-veterenarian")

>>> print(comp.estimate_hire_success(job, empl2))

0.28

2.3.1.6 hire [12 points]

Inputs. self (Company), job_candidates ({int: list[Employee]}).

Output. {int: (Employee or None)}.

This method hires the best eligible candidate for each job reference provided in job_candidates. The input is a dictionary mapping a job reference (int) to a list of candidate Employee objects. Jobs are processed in the insertion order of the dictionary so that results are deterministic (see note below). For each job reference, the method searches self.jobs to locate the matching Job. If the job reference is unknown, the result for that key is None. If the job already has an assigned employee, the result for that key is the existing employee and no changes are made. Otherwise, it scans the provided candidates, ignores any who already hold a current job, computes estimate_hire_success(job, candidate) for each, and keeps the strictly highest score with ties broken by the candidate’s input order for that job. If a best candidate exists, it sets the candidate’s cur_job to the job title, assigns the job’s employee to that candidate, appends the candidate to self.employees, and stores that candidate in the result for the job reference (key). A candidate hired for one job during this call becomes ineligible for any later jobs in the same call because cur_job is updated as soon as they are hired. The function returns a dictionary with the same set of job references as the input where each value is the hired Employee or None if no hire was made.

Note. As of Python 3.7, dictionaries preserve insertion order (you are using Python 3.10 in Thonny), so iterating over one with a for loop gives you the entries in the order they were inserted.
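Two ideas from the description above, insertion-order iteration and keeping the strictly highest score, can be illustrated in isolation. This is a sketch under assumptions: the helper `pick_first_best` is hypothetical, plain strings stand in for Employee objects, the scores are made up, and the sketch assumes scores are never negative, so a candidate scoring 0.0 would still be selected.

```python
def pick_first_best(candidates, scores):
    """Hypothetical helper: candidate with the strictly highest score."""
    best = None
    best_score = -1.0  # assumption: real scores are never negative

    for i in range(len(candidates)):
        # Strict '>' keeps the earlier candidate when two scores tie
        if scores[i] > best_score:
            best = candidates[i]
            best_score = scores[i]

    return best


# Dictionaries are visited in insertion order, not in sorted key order
requests = {3: ['casey', 'jordan'], 1: ['mike', 'sam']}
visited = []

for ref in requests:
    visited.append(ref)

print(visited)                                                   # [3, 1]
print(pick_first_best(['mike', 'sam', 'lee'], [0.2, 0.9, 0.9]))  # sam
```

Because the comparison is strict, no break or continue is needed: the loop always runs to the end and a later tie never replaces the earlier winner.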

>>> comp = Company("Happy Tails Academy", "Montreal", [], "jobs.csv")

>>> # Grab job references (by CSV order)

>>> r_train, r_recep, r_handle = (

... comp.jobs[0].ref,

... comp.jobs[1].ref,

... comp.jobs[2].ref,

... )

>>> # Competition per job

>>> e_tr1 = Employee(

... "sam", 25, "f", "bachelors",

... "trainer-writer", "entry-level-dog-trainer",

... )

>>> e_tr2 = Employee(

... "mike", 28, "m", "masters", "trainer-vet", "sql-excel",

... )

>>> e_rec1 = Employee(

... "alex", 22, "m", "high school",

... "receptionist-Secretary", "front-desk--pet-clinic",

... )

>>> e_rec2 = Employee(

... "taylor", 23, "f", "bachelors",

... "receptionist", "scheduling-front-desk",

... )

>>> e_hand1 = Employee(

... "jordan", 21, "m", "high school",

... "writer", "daycare-playroom-handler",

... )

>>> e_hand2 = Employee(

... "casey", 27, "f", "bachelors",

... "handler-bouncer", "kennel-maintenance",

... )

>>> result = comp.hire({

... r_train: [e_tr2, e_tr1],

... r_recep: [e_rec2, e_rec1],

... r_handle: [e_hand2, e_hand1],

... })

>>> print(result[r_train])

Name: sam

Age: 25

Gender: f

Highest education: bachelors

Skills: ['entry', 'level', 'dog', 'trainer']

Previous jobs: ['trainer', 'writer']

Current job: trainer

>>> print(result[r_recep])

Name: alex

Age: 22

Gender: m

Highest education: high school

Skills: ['front', 'desk', 'pet', 'clinic']

Previous jobs: ['receptionist', 'secretary']

Current job: receptionist

>>> print(result[r_handle])

Name: jordan

Age: 21

Gender: m

Highest education: high school

Skills: ['daycare', 'playroom', 'handler']

Previous jobs: ['writer']

Current job: handler

2.3.1.7 fire [5 points]

Inputs. self (Company), employee (Employee).

Output. None.

This method removes the given employee from their current job at the company. It first checks that the employee is currently employed by this company. If not, it raises an AssertionError with a message stating that the employee is not part of the organization. On success, it clears the employee attribute of the job, clears the cur_job attribute of the employee, and removes the employee from the company’s internal list of employees.

>>> # Assume the same setup and hires as in the previous example

>>> # Fire the receptionist and the handler, then verify the vacancy

>>> comp.fire(e_rec1)

>>> comp.fire(e_hand1)

>>> print(comp.jobs[2].employee)

None

>>> # Try to fire a candidate who was never hired

>>> comp.fire(e_hand2)

Traceback (most recent call last):

...

AssertionError: Employee isn't a part of the organization

3 CSV loader: network.py [10 points]

Create a file named network.py that contains the module level CSV loader(s) used by the other modules. These functions do not print. They return data structures that downstream code consumes.

3.1 create_job_list [10 points]

Inputs. csv_filename (string) [Path to a CSV with rows title, keywords, salary].

Output. list[Job].

This function loads open roles for a booth from a CSV and returns a list of Job objects with employee=None. It reads the file line by line and splits each line on commas. Only rows with exactly three fields [title, keywords, salary] are considered. Rows with fewer or more than three comma-separated values are ignored. For each valid row, the function attempts to convert salary to a float and construct Job(title, keywords, float(salary)). If conversion or object construction raises an exception, the exception is caught, an error message is printed, and that row is skipped. The returned list preserves the order of successfully parsed rows.
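The row filtering and error handling described above can be sketched without touching the file system. The sketch below makes several assumptions for the sake of being self-contained: the helper name `parse_rows` is hypothetical, it works on a list of already-read lines, and it builds plain tuples where the real loader would construct Job objects.

```python
def parse_rows(lines):
    """Hypothetical helper: keep well-formed (title, keywords, salary) rows."""
    rows = []

    for line in lines:
        fields = line.strip().split(',')

        # Rows without exactly three fields are skipped silently
        if len(fields) == 3:
            try:
                salary = float(fields[2])
                rows.append((fields[0], fields[1], salary))
            except Exception as error:
                # A bad salary is reported and the row is dropped
                print('Exception caught:', error)

    return rows


sample = [
    'trainer,entry-level-dog-trainer,43000\n',
    'receptionist\n',
    'receptionist,front-desk--pet-clinic,forty-thousand\n',
    'handler,daycare- playroom-handler,36000\n',
]
print(len(parse_rows(sample)))  # 2
```

Placing the try/except inside the loop limits the damage of a malformed row to that row alone, so the remaining lines are still processed in order.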

jobs.csv:

trainer,entry-level-dog-trainer,43000

receptionist,front-desk--pet-clinic,38000

handler,daycare- playroom-handler,36000

jobs_test.csv:

trainer,entry-level-dog-trainer,43000

receptionist,front-desk--pet-clinic,38000

receptionist

receptionist,front-desk--pet-clinic,forty-thousand

handler,daycare- playroom-handler,36000

>>> jobs = create_job_list("jobs.csv")

>>> print(jobs[2])

Reference: 3

Title: handler

Keywords: ['daycare', 'playroom', 'handler']

Salary: 36000.0

Employee: None

>>> jobs_test = create_job_list("jobs_test.csv")

Exception caught: could not convert string to float: 'forty-thousand'

>>> print(len(jobs_test))

3

4 Reflection [Bonus 5 points]

Having an algorithm (the one you just wrote or even a Large Language Model, such as ChatGPT!) decide which candidates to hire can save a lot of time in the hiring process and has some practical benefits. However, there are also some ethical issues in letting an algorithm decide who is the best candidate to hire for a job.

1. Play around with the examples we gave you for this assignment. Do you notice anything interesting about the candidates that are hired when you call the hire method on different lists of candidates?

2. In the method used for evaluating educational similarity, what biases are encouraged, and how can we address them?

3. Do you have any suggestions or modifications you can make to the overall evaluation method to address these issues? You do not need to write any code for this. Just think about what you could do differently in the code if you were to redesign these classes. What does your suggestion improve and which parts does it make worse?

Note: Write 2-3 sentences per point explaining your perspective.
