Part 1: Reading and Writing Files

The Story So Far

You are back teaching at Hogwarts, and the old LMS has finally kicked the bucket. During the years-long search for a new one, instructors like you are left in the lurch. You can still grade student work - the magical computers can't tell the precise pronunciation differences in leviosa - but you and your peers are hopeless at adding up the grades.

You are going to use the same collection of data from way-back-when (you are between semesters - no summer classes at Hogwarts!). However, now we are going to parse the lot of it.

Your goal today:

  • To find all the .txt files within a provided directory.
  • To compute every student's grade from those files, dropping the two lowest quizzes and the two lowest homeworks, and rounding the result to two decimal places (e.g. 97.85). Do not drop any exams.
  • To assign those grades a letter according to the standard Purdue grading scheme, and a 40% Exams/30% HW/30% Quiz system.
  • To write the computed grades to a csv file whose name is the course id plus the .csv extension, within a provided output directory.
    1. The first line should be the class name.
    2. The second line gives column names: student_id,letter_grade,number_grade
    3. The remainder of the document should include accurately parsed data.

Let's get started.

Finding the Files

Step one is finding all the .txt files.

We have already briefly discussed os. Import it into your project, and read the help. Right now, you need a function that can list the files in a given directory.

Once you find one that works, try it out on the studentdata folder.

We need to ignore students.csv. There are tools in os for checking extension, but for something this simple, you can filter - for example, with a list comprehension - for strings that end with a given value. A great function for that is in help(str).
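
As a rough sketch of that filtering step: the helper name list_txt_files and the file names below are made up for illustration, and the demo runs in a throwaway temporary directory rather than the real studentdata folder.

```python
import os
import tempfile

def list_txt_files(directory):
    """Return the sorted names of the .txt files directly inside `directory`."""
    names = [name for name in os.listdir(directory) if name.endswith('.txt')]
    return sorted(names)

# Build a throwaway directory so the example does not depend on studentdata.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('potions.txt', 'charms.txt', 'students.csv'):
        open(os.path.join(demo_dir, name), 'w').close()
    found = list_txt_files(demo_dir)

print(found)  # ['charms.txt', 'potions.txt']
```

Sorting is optional; os.listdir makes no promise about order, so sorting just keeps the output predictable.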

Now that we have the name of a file in studentdata, we can get a path to that file with os.path.join('studentdata',filename).

Open a file, read it to a string, and play around.
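
One way this might look, assuming a made-up file name and contents (the real gradebook files will differ). The with statement closes the file as soon as the block ends.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    # os.path.join picks the right separator for the current operating system.
    path = os.path.join(demo_dir, 'charms.txt')

    # Write a small stand-in file so there is something to read.
    with open(path, 'w') as handle:
        handle.write('Charms\nfirst line of data\n')

    # Read the whole file into one string.
    with open(path) as handle:
        contents = handle.read()

lines = contents.splitlines()
print(lines)  # ['Charms', 'first line of data']
```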

Parsing The File

Take your example file, and read it with the csv module. You should now have a list of lists of strings.

The first line holds the class name - which we'll need later when saving the file.

The second line is the csv headers - a line of csv telling us what the rest of the file should be.

After that come the data lines.
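
A minimal sketch of splitting a file into those three pieces. The class name, column headers, and student rows below are invented stand-ins, and io.StringIO plays the role of an opened file so the example runs on its own.

```python
import csv
import io

# Hypothetical file contents; the real column names may differ.
sample = io.StringIO(
    "Charms 101\n"
    "student_id,hw1,hw2,quiz1,exam1\n"
    "hgranger,10,9,5,98\n"
    "rweasley,7,8,4,71\n"
)

rows = list(csv.reader(sample))  # a list of lists of strings
class_name = rows[0][0]          # the first line has a single field
headers = rows[1]                # the csv headers
data_lines = rows[2:]            # everything after the headers

print(class_name)     # Charms 101
print(data_lines[0])  # ['hgranger', '10', '9', '5', '98']
```

With a real file, pass the handle from open(path, newline='') to csv.reader in place of sample.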

Parsing The Data

Looking at just one line in one file, figure out how to get all the homework, quiz, and exam grades for a student as separate lists of integers. (Figure out the maximum grade on HWs, quizzes, and exams by looking around.) Using the .sort list method, figure out how to drop the lowest $n$ grades from a list.
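
Here is one possible shape for those two steps. The header names (hw1, quiz1, exam1, ...) are an assumption made for this sketch, as are the helper names grades_for and drop_lowest; check the real headers before relying on the prefixes.

```python
# Hypothetical header and data line; the real column names may differ.
headers = ['student_id', 'hw1', 'hw2', 'hw3', 'hw4', 'quiz1', 'quiz2', 'quiz3', 'exam1']
line = ['hgranger', '10', '3', '9', '7', '5', '2', '4', '98']

def grades_for(prefix, headers, line):
    """Collect, as integers, the values whose header starts with `prefix`."""
    return [int(value) for name, value in zip(headers, line) if name.startswith(prefix)]

def drop_lowest(grades, n):
    """Return a sorted copy of `grades` without its n lowest entries."""
    ordered = list(grades)  # copy, so the caller's list is left alone
    ordered.sort()          # ascending: the lowest grades come first
    return ordered[n:]

homework = grades_for('hw', headers, line)
kept = drop_lowest(homework, 2)
print(homework)  # [10, 3, 9, 7]
print(kept)      # [9, 10]
```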

Write a function which converts one of these data lines into the numeric grade - with 40% of the grade being exams, 30% being homework, and 30% quizzes.
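
The weighting itself can be sketched as below. This version takes already-separated lists of scores and takes the maximum scores as parameters, since the real maximums are something you need to find in the data; the function names and the sample numbers are illustrative only.

```python
def percentage(scores, max_score):
    """Average of `scores` as a percentage of `max_score` per item."""
    return 100 * sum(scores) / (max_score * len(scores))

def numeric_grade(homework, quizzes, exams, hw_max, quiz_max, exam_max):
    """Weighted grade: 40% exams, 30% homework, 30% quizzes."""
    return (0.4 * percentage(exams, exam_max)
            + 0.3 * percentage(homework, hw_max)
            + 0.3 * percentage(quizzes, quiz_max))

# Homework averages 90%, quizzes 80%, exams 80%:
# 0.4 * 80 + 0.3 * 90 + 0.3 * 80 = 32 + 27 + 24 = 83
grade = numeric_grade([8, 10], [4, 4], [70, 90], hw_max=10, quiz_max=5, exam_max=100)
```

Drop the lowest homeworks and quizzes before calling a function like this, not after.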

Now you can convert to letter grades. Remember the grading scale:

GPA    Letter Grade   Percentage
4.33   A+             97%-100%
4.0    A              93%-96%
3.6    A-             90%-92%
3.3    B+             87%-89%
3.0    B              83%-86%
2.6    B-             80%-82%
2.3    C+             77%-79%
2.0    C              73%-76%
1.6    C-             70%-72%
1.3    D+             67%-69%
1.0    D              63%-66%
0.6    D-             60%-62%
0      F              0%-59%
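
A sketch of one way to turn the scale into code. The table lists whole-number ranges, so this version assumes that a fractional grade such as 96.5 belongs to the band whose lower bound it has reached (here, A); if your course rounds first, adjust accordingly. The names GRADE_FLOORS and letter_grade are made up for this example.

```python
# (lower bound, letter) pairs, highest band first.
GRADE_FLOORS = [
    (97, 'A+'), (93, 'A'), (90, 'A-'),
    (87, 'B+'), (83, 'B'), (80, 'B-'),
    (77, 'C+'), (73, 'C'), (70, 'C-'),
    (67, 'D+'), (63, 'D'), (60, 'D-'),
]

def letter_grade(percent):
    """Return the letter for the first band whose lower bound is reached."""
    for floor, letter in GRADE_FLOORS:
        if percent >= floor:
            return letter
    return 'F'  # anything below 60

print(letter_grade(97.85))  # A+
print(letter_grade(96.5))   # A  (assumption: fractions stay in the lower band)
print(letter_grade(59.99))  # F
```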

Write a function which converts a data line to the corresponding list with the user id, letter grade, and percentage grade rounded to two decimal places. An excellent tool for rounding is the round function.
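
For the rounding step alone, a small sketch (the helper name output_row and the sample values are hypothetical, and the letter is passed in rather than computed here):

```python
def output_row(student_id, letter, percent):
    """Build one output row, rounding the percentage to two decimal places."""
    return [student_id, letter, round(percent, 2)]

row = output_row('hgranger', 'A+', 97.8549)
print(row)  # ['hgranger', 'A+', 97.85]
```

Note that round returns a float, so a grade of exactly 90 comes out as 90.0 rather than 90.00. If you need trailing zeros, format the number as a string when writing instead.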

Creating Directories

Next, we need an output directory - default /grades/. You can find the command to make one in os: calling os.makedirs('grades', exist_ok=True) creates the directory if it doesn't exist yet, raises an error if it can't be created, and passes cleanly if it already exists.
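
To see the effect of exist_ok, here is a small demonstration inside a temporary directory, so nothing is left behind on your machine:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'grades')

    os.makedirs(target, exist_ok=True)  # creates the directory
    os.makedirs(target, exist_ok=True)  # already there: passes cleanly
    created = os.path.isdir(target)

    # Without exist_ok=True, an existing directory is an error.
    try:
        os.makedirs(target)
        raised = False
    except FileExistsError:
        raised = True

print(created, raised)  # True True
```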

Then open a file for writing in the target directory. Convert the input file extension to .csv - we are, after all, making comma separated values files - and write the described lines to the file.
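
A possible sketch of both steps, using os.path.splitext to swap the extension and csv.writer to write the lines. The function names, file names, and the sample row are invented for illustration.

```python
import csv
import os
import tempfile

def output_path(input_path, output_directory):
    """Swap the input file's extension for .csv and place it in the output directory."""
    stem, _ = os.path.splitext(os.path.basename(input_path))
    return os.path.join(output_directory, stem + '.csv')

def write_grades(path, class_name, rows):
    """Write the class name, the column names, then one line per student."""
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow([class_name])
        writer.writerow(['student_id', 'letter_grade', 'number_grade'])
        writer.writerows(rows)

with tempfile.TemporaryDirectory() as out_dir:
    out = output_path(os.path.join('studentdata', 'charms.txt'), out_dir)
    write_grades(out, 'Charms 101', [['hgranger', 'A+', 97.85]])
    with open(out) as handle:
        written = handle.read().splitlines()

print(os.path.basename(out))  # charms.csv
print(written[2])             # hgranger,A+,97.85
```

Opening with newline='' is what the csv module's documentation recommends, so the writer controls line endings itself.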

Look over it and make sure it is OK.

Finishing Up

Now that you have the tools you need to convert one file, you can:

Write a function which takes a given input file in the given gradebook format, an output directory - and writes the appropriate output file of tabulated grades. Return True if the process completes successfully.

Don't worry for the moment about creating tests for this function - file i/o is relatively tricky to test. So long as your component functions are well tested, you can continue.

Lastly,

Write a function CalculateGrades(input_directory,output_directory) which takes a given input folder path and output folder path, and calculates grades for all the .txt files as .csv files in the output. Return True if the process completes successfully.

You know the drill:

Upload a module containing your documented functions, with no other executed code, to Brightspace.

Testing File I/O with unittest.mock

You can create tests that create or destroy files, but that is rarely a good idea - it can have unintended behaviors and side effects.

Instead, you can create fake files with the unittest.mock testing utility. This utility allows for the faking of a variety of system behaviors, allowing you to safely test things that would otherwise create strange side-effects.

Read the documentation for a fuller picture, but basically, unittest.mock provides context managers, within whose context functions of your choice behave differently.

This example - adapted from the docs - covers the basics of testing file writes:

from unittest.mock import patch, mock_open

m = mock_open()
with patch('__main__.open', m):
    with open('foo', 'w') as h:
        h.write('some stuff')
m.assert_called_once_with('foo','w')
handle = m()
handle.write.assert_called_once_with('some stuff')
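
Reads can be faked in a similar way: mock_open accepts a read_data argument holding the pretend file contents. The sketch below patches builtins.open rather than '__main__.open', so it works no matter which module the code lives in; the function read_class_name and the file contents are made up for the example.

```python
from unittest.mock import mock_open, patch

def read_class_name(path):
    """Return the first line of the file, without the trailing newline."""
    with open(path) as handle:
        return handle.readline().strip()

# The fake file never touches the disk.
fake_file = mock_open(read_data='Charms 101\nstudent_id,hw1\n')
with patch('builtins.open', fake_file):
    name = read_class_name('studentdata/charms.txt')

fake_file.assert_called_once_with('studentdata/charms.txt')
print(name)  # Charms 101
```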

Part 2: Persistent Data - stopping and resuming computation

The Story So Far

You are a mathematician, and you are searching for a number.

You have a (black box) function check_number which checks for the target number:

from math import gcd
from time import sleep

def check_number(n):
    sleep(60)  # Represents time-consuming computations
    return n != 1 and gcd(n, 2305567963945518424753102147331756070) == 1

You could check all the integers with a simple loop:

n = 2
while True:
    if check_number(n):
        break
    n += 1  # move on to the next candidate
print("Found n!")

That would work for small values, but this is a long computation which you will run over multiple sessions.

We also want to keep track of how long the computation is running.

Define a function, search_integers(check_function, filename, bound), which:

  • accepts a check function as an object parameter
  • Iterates over all positive integers less than a reasonable bound - use a default of $10^5$
  • When it checks a number, it appends the result to a provided file path (default checked_numbers.csv), creating the file if it doesn't exist. Be careful to have the file open only while you are using it - not during computation - and not to erase any previous lines.
    • Lines in this file should be csv with, in order:
      • the checked n
      • the boolean value True or False, indicating whether the check succeeded
      • A timestamp of the completed time.
  • If called, and the target file exists, reads the last line of the file and starts computation at that point instead.
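
The two file operations in this list, appending a line and finding where to resume, might be sketched as follows. This is not the full search_integers; the helper names append_result and last_checked are hypothetical, and the ISO timestamp format is one reasonable choice among several.

```python
import csv
import os
import tempfile
from datetime import datetime

def append_result(filename, n, success):
    """Append one csv line; mode 'a' creates the file and never erases old lines."""
    with open(filename, 'a', newline='') as handle:
        csv.writer(handle).writerow([n, success, datetime.now().isoformat()])

def last_checked(filename):
    """Return the n on the last line of the file, or None if there is nothing yet."""
    if not os.path.exists(filename):
        return None
    with open(filename, newline='') as handle:
        rows = [row for row in csv.reader(handle) if row]
    return int(rows[-1][0]) if rows else None

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, 'checked_numbers.csv')
    before = last_checked(path)  # None: no file yet
    append_result(path, 1, False)
    append_result(path, 2, False)
    after = last_checked(path)   # 2: resume from here

print(before, after)  # None 2
```

Each helper opens the file, does its work, and closes it again, so the file is never held open while check_function runs. Whether to restart at the last recorded n or at the one after it is a decision for your search_integers.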

You know the drill:

Upload your .py module implementing this function to Brightspace.

Department of Mathematics, Purdue University
150 N. University Street, West Lafayette, IN 47907-2067
Copyright© 2018, Purdue University, all rights reserved.