Introduction
INSTNM                               STABBR  ACTCMMID  PCIP14  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP  C150_4_POOLED_SUPP
Alabama A & M University             AL      18        0.1%    $29,900.00       $35,000.00          0.3%
University of Alabama at Birmingham  AL      25        0.1%    $40,200.00       $21,500.00          0.6%
Amridge University                   AL                0.0%    $40,100.00       $23,000.00          PrivacySuppressed
University of Alabama in Huntsville  AL      27        0.3%    $45,600.00       $23,500.00          0.5%
Alabama State University             AL      18        0.0%    $26,700.00       $32,091.00          0.3%
The University of Alabama            AL      27        0.1%    $42,700.00       $23,750.00          0.7%
Central Alabama Community College    AL                0.0%    $27,200.00       $9,388.00
Athens State University              AL                0.0%    $38,500.00       $18,534.00
Auburn University at Montgomery      AL      21        0.0%    $33,500.00       $22,192.50          0.2%
Requirement
CSE231 Spring 2018
Project 11: Classes
This assignment is worth 55 points (5.5% of the course grade) and must be completed and turned
in before 11:59pm on 4/23/2018.
Overview
In this assignment you will practice classes and file handling.
Background
In this project, you design two classes to help with reading and writing CSV files. A CSV file is a
comma-separated values file that is used to store data. We provide College_Scorecard.csv, which
includes data from 1996 through 2016 for all undergraduate degree-granting institutions of higher
education. The data about each institution, such as student completion, debt and repayment, and
earnings, helps students decide where to pursue their higher education.
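As a small illustration of the format, the sketch below parses a few CSV lines with Python's standard csv module. The sample text and the name SAMPLE are made up for this example and are not taken from College_Scorecard.csv.

```python
import csv
import io

# Made-up sample text standing in for a CSV file; the first field of the
# first data row contains a comma, so it is quoted.
SAMPLE = 'name,state,score\n"Smith, Jane",AL,18\nDoe,MI,NULL\n'

# csv.reader accepts any iterable of lines, including a file-like object.
rows = list(csv.reader(io.StringIO(SAMPLE)))

for row in rows:
    print(row)
```

Each row comes back as a list of strings, so numeric columns still need an explicit conversion such as float().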
Project Specifications
There are two classes: a CsvWorker class that represents a whole csv spreadsheet, and a Cell class
that represents one cell in that spreadsheet. That is, the CsvWorker class contains a bunch of Cell
objects.
Your solution must include the following classes with their methods:
1. Cell(object): This class defines a cell component of a CSV and has four attributes:
a reference to a CsvWorker instance, a column number, a value, and its alignment for
printing. These 4 attributes should be private. The CsvWorker instance describes the csv
spreadsheet that this cell is in. You should implement the following methods:
a. __init__(self, csv_worker, value, column, alignment):
This method initializes the object and its 4 private attributes. The default values of
the parameters should be None, the empty string, 0, and '^', respectively.
b. __str__(self): This method builds and returns a formatted string (to be used
for printing the value). To build the string we need to find the cell width from
the csv worker instance by calling the worker’s get_width method and passing
this cell’s column number attribute. (The CsvWorker stores the width of each
column, as explained in the next section.) See Notes below for information
on how to use width and alignment as parameters in formatting a string. Some
values may need to be formatted specially (Hint: skip the special formatting for
your first version). For example, a column of fractions could be formatted as a
percentage. Special functions are also stored in CsvWorker for the formatting of
each column. If a function is defined for the column of a given cell, __str__ should
call that function to get a pre-formatted string. The function should not crash when
trying to format a title cell or an empty cell that cannot be converted to a floating-point
number; it should return the original value unmodified.
c. __repr__(self): This method should return a shell representation of a Cell
object. No need to do anything special, just call __str__ and return that value.
d. set_align(self, align): This method takes in an alignment string for
left/right/center (left: '<', right: '>', center: '^'). This method sets the
cell’s alignment attribute from the parameter.
e. get_align(self): This method returns the object’s alignment.
f. set_value(self, value): This method takes in a new value and redefines
its value attribute.
g. get_value(self): This method returns the object’s value.
2. CsvWorker(object): This class is the actual CSV Worker class that takes a csv
file as input, processes it, and stores its data for output.
a. __init__(self, fp=None): This initializes the worker class. The object
can be instantiated with a file pointer to process it, or a file can be processed after
instantiating the object—in both cases using the read_file method described
next. The attributes of this class are columns, rows, data, widths, and special
functions assigned to format each column. Columns and rows are just integer values
representing the size of the csv table. The data attribute will be a list of lists
containing all the data in the CSV. How you organize it is up to you, but a format
of list of rows is recommended, such that data[row][column] will give an
individual cell. The attribute widths is a list of integers defining the formatted
width of each column. The attribute special is a list of functions defining the
special formatting functions for each column. Initialize ints to zero and lists to be
empty.
b. read_file(self, fp): This method takes a file object, fp, which is for a
CSV file and iterates over it, filling in all the attributes along the way. The data
should store the data in all the cells. If the value in the file is “Null”, the value in
the cell should be the empty string. The widths should store the width of the
widest element in each column. Determine rows and columns values from
the number of rows and columns in the file. Note that not all rows in the csv file
have the same number of values, i.e. the same number of columns. Be sure
that you account for this. (Hint: for your first version assume that all rows have the
same number of columns and add logic to account for different columns later.) All
values in special should be set to None. Be sure to close your file once you’re
done with it.
c. __getitem__(self, index): This method overloads the [] operator to
allow you to access values in data. We provide it for you.
d. __setitem__(self, index, value): This method overloads the []
operator to allow you to set values in data. We provide it for you.
e. __str__(self): Overload the __str__ method to allow the class to convert
its data to a string (so it can be called by print). Just convert the cells in each row
in data to strings, append a newline ('\n') to each row, and return the
resulting string. Note that for an item in data, str(item) will call
__str__ for that cell so you get a formatted string for that cell.
f. __repr__(self): This method should return a shell representation of a
CsvWorker object. No need to do anything special, just call __str__ and return
that value.
g. limited_str(self, limit): This method does a similar job to
__str__, but it should return at most limit lines. CSV files can
get quite large, so this allows you to see just a small portion of your data.
h. remove_row(self, index): This method allows you to remove a row from
the data at the particular index. (Hint: you can pop a whole row.)
i. set_width(self, index, width): This method sets the width of the column
of data at index. The width of every column is stored in the
widths attribute, so you can simply modify the value at that particular index.
j. get_width(self, index): This method returns the width
of any column, given its index in widths.
k. set_special(self, column, special): This method allows you to
link a special formatting function to the column. In this project, there are two
special functions: percentage and currency that convert values to those particular
formats. We provide this method.
l. get_special(self, column): This function returns the special formatting
function assigned to a column. We provide this method.
m. set_alignment(self, column, align): This method sets the
alignment of the specified column from a string value for
left/right/center. This method calls the set_align(align) method of the
Cell object (note that you need to set the alignment for every cell in the specified
column). If the alignment is not one of the 3 valid alignments ('<', '>', '^'), this method
should raise TypeError.
n. get_columns(self): This method returns the number of columns in the
CsvWorker object (i.e. returns that attribute).
o. get_rows(self): This method returns the number of rows in the CsvWorker
object (i.e. returns that attribute).
p. minimize_table(self, columns_list): This method returns a new
CsvWorker object which is a minimized version of the original CsvWorker. The
new object only contains the columns specified in the columns_list parameter
(which is a list of column indices). Hint: create a new CsvWorker instance, and for
each row append new cell instances whose values are initialized with values from
the original CsvWorker instance.
q. write_csv(self, filename, limit = None): This method writes the
data into a CSV file named by the filename parameter. Only the values are
written, i.e. no formatting. The limit parameter optionally limits the number of
rows to be written in the CSV file. Remember to close the file.
r. write_table(self, filename, limit = None): This method writes
the data into a tabular formatted text file named filename. The number of rows
to be written is controlled by the limit parameter. Hint: for this method you can
simply open a file, write using the limited_str method described above, and
close the file.
s. minimum(self, column): This method returns the cell with the minimum value
in a column. You can only find the minimum of cells that are numbers, so skip
any values that are not numbers. Hint: try to convert the value to a float and if a
ValueError is raised, simply continue to the next cell.
t. maximum(self, column): This method returns the cell with the maximum value
in a column. You can only find the maximum of cells that are numbers, so skip
any values that are not numbers. Hint: try to convert the value to a float and if a
ValueError is raised, simply continue to the next cell.
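The read_file hint about rows of different lengths can be handled in more than one way. The sketch below shows one possible approach on plain lists of strings rather than Cell objects: pad every short row with empty strings, then measure the widest value in each column. The helper names pad_rows and column_widths are hypothetical and are not part of the required interface.

```python
def pad_rows(rows):
    """Return a copy of rows in which every row has the same length.

    Short rows are padded on the right with empty strings.
    """
    n_columns = max((len(row) for row in rows), default=0)
    return [row + [''] * (n_columns - len(row)) for row in rows]


def column_widths(rows):
    """Return the length of the widest value in each column of padded rows."""
    if not rows:
        return []
    return [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]


# Made-up ragged data: the second row is missing its last value.
table = [['INSTNM', 'STABBR', 'PCIP14'],
         ['Amridge University', 'AL'],
         ['Athens State University', 'AL', '0.0']]

padded = pad_rows(table)
widths = column_widths(padded)
print(padded[1])
print(widths)
```

Padding first keeps the later steps simple, because data[row][column] is then valid for every row and column.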
You need to implement the following functions:
1. open_file(): This function prompts the user to enter a filename. The program will try
to open a CSV file. An error message should be printed if the file cannot be opened. This
function will loop until it receives proper input and successfully opens the file. It returns a
file pointer.
2. percentage(value): The value parameter is a float. This method will convert the
value to string and convert it into percentage format with one decimal place. Ex: 3.443 to
3.4%. If value is not a float, simply return the value unchanged (Hint: try to convert
value to a float; if you get a ValueError, simply return the value.) Hint: there is a
percentage string format.
3. currency(value): The value parameter is a float. This method will convert value to
a string and convert it into currency format with two decimal places. Ex: 3.443 to $3.44.
If value is not a float, simply return the value unchanged (Hint: try to convert value to a
float; if you get a ValueError, simply return the value.)
4. main(): We provide this function. There are several tasks to complete for the main
implementation of the function.
a. Instantiate a CsvWorker with the input file and minimize it to contain columns
INSTNM, STABBR, ACTCMMID, PCIP14, MD_EARN_WNE_P10,
GRAD_DEBT_MDN_SUPP, and C150_4_POOLED_SUPP. These are the
column numbers: 3, 5, 40, 55, 116, 118, and 122 (zero indexed). The new
CsvWorker instantiated will have 7 columns.
b. Percentage columns should be formatted as percentages, and currency columns
should be formatted as currency (USD).
c. After minimizing and formatting, the data should be written to “output.txt” and
“output.csv” using appropriate methods.
d. Here are the titles for each column, to clarify the data:
INSTNM               Institution name
STABBR               State code
ACTCMMID             Median ACT composite score
PCIP14               Percentage of graduates receiving engineering degrees
MD_EARN_WNE_P10      Median earnings 10 years after entry
GRAD_DEBT_MDN_SUPP   Median debt after graduation
C150_4_POOLED_SUPP   Completion rate
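The try/except pattern shared by percentage and currency can be sketched with one small helper. The name format_or_keep is hypothetical and not part of the assignment. The example also contrasts a fixed-point format followed by a literal % with the built-in % presentation type, which multiplies by 100 and so does not reproduce the 3.443 to 3.4% example. The thousands separator in the dollar template is an assumption based on the sample output in the Introduction ($29,900.00).

```python
def format_or_keep(value, template):
    """Format value as a float using template, or return it unchanged.

    format_or_keep is a hypothetical helper used only for illustration.
    """
    try:
        return template.format(float(value))
    except ValueError:
        # Titles, empty strings, and text such as 'PrivacySuppressed'
        # cannot be converted, so they pass through untouched.
        return value


fixed = format_or_keep('3.443', '{:.1f}%')     # literal percent sign
scaled = format_or_keep('3.443', '{:.1%}')     # % type multiplies by 100
dollars = format_or_keep('29900', '${:,.2f}')  # comma as thousands separator
title = format_or_keep('PCIP14', '{:.1f}%')    # not a number

print(fixed, scaled, dollars, title)
```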
Notes:
In the Strings section we provided a link to a string formatting summary site which has a
section on named placeholders. That shows
how you can use parameters for width and alignment. For example, if some_value = 4,
AAA = '>' and WWW = 10, you can print a formatted string with parameterized alignment
and width. Try this in the Python shell.
S = "{:{align}{width}}".format(some_value, align=AAA, width=WWW)
print(S)
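To see the three alignment characters side by side, the loop below formats the same value with each one. The pipe characters are added only to make the padding visible, and the names results, label, and padded are chosen for this illustration.

```python
value = 'AL'
width = 6

results = {}
for label, align in (('left', '<'), ('right', '>'), ('center', '^')):
    # The inner {align} and {width} fields are filled in first, producing
    # a format spec such as '<6' that is then applied to value.
    padded = '{:{align}{width}}'.format(value, align=align, width=width)
    results[label] = padded
    print('{:>6}: |{}|'.format(label, padded))
```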
Deliverables:
The deliverable for this assignment is the following file:
proj11.py – the source code for your Python classes and functions.
Be sure to use the specified file name and to submit it for grading via Mimir before the project
deadline.
Grading Rubrics
General Requirements:
0 (5 pts) Following all the coding standards
Implementation:
0 (3 pts) open_file function (no Mimir test)
0 (3 pts) percentage function
0 (3 pts) currency function
0 (10 pts) Testing Cell class
0 (15 pts) Testing CsvWorker class
0 (8 pts) Test 1
0 (8 pts) Test 2 (hidden test)
import csv


class Cell(object):
    """One cell of a CSV spreadsheet."""

    def __init__(self, csv_worker=None, value='', column=0, alignment='^'):
        self.__csv_worker = csv_worker
        self.__value = value
        self.__column = column
        self.__alignment = alignment

    def __str__(self):
        value = self.__value
        width = 0
        if self.__csv_worker is not None:
            # Apply the column's special formatter, if any, then look up the width.
            special = self.__csv_worker.get_special(self.__column)
            if special is not None:
                value = special(value)
            width = self.__csv_worker.get_width(self.__column)
        return '{:{align}{width}}'.format(value, align=self.__alignment, width=width)

    def __repr__(self):
        return self.__str__()

    def set_align(self, align):
        self.__alignment = align

    def get_align(self):
        return self.__alignment

    def set_value(self, value):
        self.__value = value

    def get_value(self):
        return self.__value


class CsvWorker(object):
    """A whole CSV spreadsheet, stored as a list of rows of Cell objects."""

    def __init__(self, fp=None):
        self.columns = 0
        self.rows = 0
        self.data = []
        self.widths = []
        self.special = []
        if fp:
            self.read_file(fp)

    def read_file(self, fp):
        for row in csv.reader(fp):
            cells = []
            for column_idx, value in enumerate(row):
                if value == 'NULL':
                    value = ''
                # Grow the per-column lists when a row is longer than any seen so far.
                if column_idx == len(self.widths):
                    self.widths.append(0)
                    self.special.append(None)
                self.widths[column_idx] = max(self.widths[column_idx], len(value))
                cells.append(Cell(self, value, column_idx))
            self.data.append(cells)
        fp.close()
        self.rows = len(self.data)
        self.columns = len(self.widths)
        # Pad short rows with empty cells so every row has the same length.
        for cells in self.data:
            for column_idx in range(len(cells), self.columns):
                cells.append(Cell(self, '', column_idx))

    def __getitem__(self, index):
        return self.data[index]

    def __setitem__(self, index, value):
        self.data[index] = value

    def __str__(self):
        return self.limited_str(None)

    def __repr__(self):
        return self.__str__()

    def limited_str(self, limit):
        text = ''
        for row_no, row in enumerate(self.data):
            # A limit of None means "no limit".
            if limit is not None and row_no >= limit:
                break
            text += ''.join(str(cell) for cell in row) + '\n'
        return text

    def remove_row(self, index):
        self.data.pop(index)
        self.rows -= 1

    def set_width(self, index, width):
        self.widths[index] = width

    def get_width(self, index):
        return self.widths[index]

    def set_special(self, column, special):
        self.special[column] = special

    def get_special(self, column):
        return self.special[column]

    def set_alignment(self, column, align):
        if align not in ('<', '>', '^'):
            raise TypeError('alignment must be one of <, >, ^')
        for row in self.data:
            row[column].set_align(align)

    def get_columns(self):
        return self.columns

    def get_rows(self):
        return self.rows

    def minimize_table(self, columns_list):
        new_worker = CsvWorker()
        for row in self.data:
            # Build fresh cells that point at the new worker and the new column index.
            new_worker.data.append(
                [Cell(new_worker, row[old_idx].get_value(), new_idx, row[old_idx].get_align())
                 for new_idx, old_idx in enumerate(columns_list)])
        new_worker.rows = len(new_worker.data)
        new_worker.columns = len(columns_list)
        new_worker.special = [self.special[idx] for idx in columns_list]
        new_worker.widths = [self.widths[idx] for idx in columns_list]
        return new_worker

    def write_csv(self, filename, limit=None):
        # newline='' lets the csv module control line endings itself.
        with open(filename, 'w', newline='') as fp:
            writer = csv.writer(fp)
            for row_no, row in enumerate(self.data):
                if limit is not None and row_no >= limit:
                    break
                writer.writerow([cell.get_value() for cell in row])

    def write_table(self, filename, limit=None):
        with open(filename, 'w') as fp:
            fp.write(self.limited_str(limit))

    def minimum(self, column):
        min_cell, min_value = None, None
        for row in self.data:
            try:
                f_value = float(row[column].get_value())
            except ValueError:
                continue  # skip titles, empty cells, and other non-numbers
            if min_value is None or f_value < min_value:
                min_cell, min_value = row[column], f_value
        return min_cell

    def maximum(self, column):
        max_cell, max_value = None, None
        for row in self.data:
            try:
                f_value = float(row[column].get_value())
            except ValueError:
                continue  # skip titles, empty cells, and other non-numbers
            if max_value is None or f_value > max_value:
                max_cell, max_value = row[column], f_value
        return max_cell


def open_file():
    # Keep prompting until a file can be opened.
    while True:
        filename = input("Input a file name: ")
        try:
            return open(filename, encoding="utf-8")
        except FileNotFoundError:
            print("File not found. Try again")


def percentage(value):
    try:
        return '{:.1f}%'.format(float(value))
    except ValueError:
        return value


def currency(value):
    try:
        # The comma adds thousands separators, matching the sample output ($29,900.00).
        return '${:,.2f}'.format(float(value))
    except ValueError:
        return value


def main():
    fp = open_file()
    master = CsvWorker(fp)
    csv_worker = master.minimize_table([3, 5, 40, 55, 116, 118, 122])
    csv_worker.set_special(3, percentage)
    csv_worker.set_special(6, percentage)
    csv_worker.set_special(4, currency)
    csv_worker.set_special(5, currency)
    # Add a little padding between columns.
    for i in range(len(csv_worker[0])):
        csv_worker.set_width(i, csv_worker.get_width(i) + 4)
    csv_worker.write_table("output.txt", 10)
    csv_worker.write_csv("output.csv", 10)
    max_act = csv_worker.maximum(2)
    min_act = csv_worker.minimum(2)
    max_earn = csv_worker.maximum(4)
    min_earn = csv_worker.minimum(4)
    print("Maximum ACT:", str(max_act).strip())
    print("Minimum ACT:", str(min_act).strip())
    print("Maximum Earnings:", str(max_earn).strip())
    print("Minimum Earnings:", str(min_earn).strip())


if __name__ == "__main__":
    main()