ICS 31 Lab 9

ICS 31 • DAVID G. KAY • UC IRVINE • FALL 2017

Lab Assignment 9

This assignment is due by 10:00 p.m. on Wednesday, December 6. This will be our last lab assignment this quarter. Having it due Wednesday of the tenth week will give you an extra lab day and a weekend, plus a couple of extra days after that to go back and review what you need for the final.

Preparation (Do this part individually): Read sections 6.1, 6.2, and 6.4, doing the practice problems as you go (but you can skip problem 6.10).

Lab Work (Do this part with your partner in lab)

(a) Choose a partner for this assignment, someone you haven't worked with before, and register your partnership using the partner app, ideally by Wednesday. Make sure you know your partner's name (first and last) and contact information (email or cellphone or whatever) in case one of you can't make it to lab.

We said this last week, but it bears repeating: Do not engage in any of the following practices; they are not acceptable and may result in reduced scores or worse: (i) working solo (this lab, like all labs in this course, is a pair programming assignment); (ii) splitting the lab with your partner ("You do (c), I'll do (d) and (e)") and just pasting the parts together and turning them in (this is not pair programming, and neither partner will learn all the concepts, which of course may show up on exams or in later courses); (iii) working with someone other than your official partner. Points (ii) and (iii) could land you in academic honesty trouble, too: If you don't participate in the development of everything you turn in, you don't know whether your partner might have gotten code from an illegitimate source, and since you're responsible for code submitted by your partner, you could get into hot water for what your partner did. Too many people landed in trouble just this way in last quarter's class. You and your official partner should collaborate on your own joint work. If collaborating means you don't get quite as far as you might have gotten by one of these other impermissible strategies, your score will still be higher than if you're detected doing something impermissible.

(b) Prepare your lab9.py file as in previous labs, including a line like this:

#  Paula Programmer 11223344 and Andrew Anteater 44332211.  ICS 31 Lab sec 7.  Lab asst 9.

(c) Suppose a class takes a multiple-choice test. We're going to experiment with alternative scoring mechanisms. For this problem you'll want to say from random import * (and use the methods randrange and choice, which you can look up in the text or by using help(randrange) and help(choice)).

Let's say you have these three global constants defined (a complete program might determine these values from reading a file; we're just doing it this way for convenience):

NUMBER_OF_STUDENTS = 200
NUMBER_OF_QUESTIONS = 20
NUMBER_OF_CHOICES = 4  # 3 choices is A/B/C, 4 choices is A/B/C/D, 5 is A/B/C/D/E

Use the identifiers NUMBER_OF_STUDENTS, NUMBER_OF_QUESTIONS, and NUMBER_OF_CHOICES in your code rather than the hard-coded constants 200, 20, and 4. And of course your code should work correctly when different values are assigned to these identifiers.

(c.1) Write a function called generate_answers that generates and returns a string of letters representing the correct answers to the test. (Of course answers to real tests aren't chosen randomly! We're just doing it this way to produce some test data to use when we score students' answers.) The length of the string should be the number of questions; each character in the string should be chosen randomly from the first n letters of the alphabet (where n is the number of choices). [Use the choice() method.]

Call generate_answers to generate the answers we'll use; assign the result to another global constant called ANSWERS.
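As a minimal sketch of the letter-picking step only (not a complete solution), the first n letters of the alphabet can be sliced from a string of uppercase letters and handed to choice(). The helper name random_letters and the fixed seed are assumptions made for this illustration; your generate_answers should rely on the global constants described above.

```python
from random import choice, seed
from string import ascii_uppercase

def random_letters(length: int, number_of_choices: int) -> str:
    """Return `length` letters, each chosen at random from the first
    `number_of_choices` uppercase letters (hypothetical helper)."""
    alternatives = ascii_uppercase[:number_of_choices]   # 'ABCD' when there are 4 choices
    result = ''
    for _ in range(length):
        result += choice(alternatives)
    return result

seed(31)   # fixed seed only so that repeated runs print the same letters
sample = random_letters(20, 4)
print(sample)
```

Slicing the alphabet keeps the code working when the number of choices changes from 4 to 3 or 5.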

(c.2) Ideally, we'd read the students and their exam answers from a file. But to save you some time, we'll skip the file-reading part and just generate a random list of students and their answers. To start with, let's say that each student is represented by a string for the student's name or ID and a string representing the student's answers to each question. [Are you thinking of a namedtuple with two fields? You should be.]

from collections import namedtuple

Student = namedtuple('Student', 'name answers')
s1 = Student('Jones, Jane', 'ABCCDAABAABCBBCACCAD')
s2 = Student('Smith, Sam',  'BADACCABADCCABDDCBAB')

Write the function random_students that uses the global constants above to generate and return a list of Student namedtuples. The size of the list is the number of students. The names can be randomly generated as you did in an earlier lab or, to save time, you can just generate a random ID number using randrange(). The string representing the student's answers should be generated precisely the same way as you generated the correct answers (so don't duplicate any code!).

(c.3) Modify the Student namedtuple to add two fields, one containing a list of scores on each question (1 if the student's answer matches the correct answer and 0 otherwise) and the other a number representing the sum of the list of question scores:

Student = namedtuple('Student', 'name answers scores total')
s1 = Student('Jones, Jane', 'ABCCDAABAABCBBCACCAD', [1, 0, 1, 1, 1, 0, ...], 10)
s2 = Student('Smith, Sam',  'BADACCABADCCABDDCBAB', [0, 1, 0, 0, 0, 1, ...], 5)

Then modify your random_students function to generate these student records with scores.

Generate your list of random students and then sort it by total, highest total to lowest, and print the top 10 students' names. (You can print them all; we're just trying to save paper and screen space here.) Also print the mean (average) score for all the students.
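The sorting and averaging steps might look something like the sketch below. It uses three made-up records scored against a made-up answer key of 'ABD', so that the result is easy to check by hand; the names student_total, roster, and TOP_N are assumptions for this illustration.

```python
from collections import namedtuple

Student = namedtuple('Student', 'name answers scores total')

def student_total(s: Student) -> int:
    """Sort key: the field we want to order the records by."""
    return s.total

# Three made-up records, scored against a made-up answer key of 'ABD'.
roster = [
    Student('Jones, Jane', 'ABC', [1, 1, 0], 2),
    Student('Smith, Sam', 'BAD', [0, 0, 1], 1),
    Student('Lee, Lou', 'ABD', [1, 1, 1], 3),
]

roster.sort(key=student_total, reverse=True)   # highest total first

TOP_N = 2   # the lab asks for 10; 2 keeps this toy example short
for s in roster[:TOP_N]:
    print(s.name)

mean_total = sum(s.total for s in roster) / len(roster)
print('Mean score:', mean_total)
```

Slicing with roster[:TOP_N] is safe even when the list has fewer than TOP_N students.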

(c.4) The previous part used a conventional way to score multiple-choice exams. But you should expect the scores on this exam to be lower than on a typical real exam: On a real exam, students on the average are likelier to choose the correct answers than the wrong ones, but we generated our data completely at random. So let's think about how to generate more realistically distributed random data.

We chose each student's answer to each question above by choosing randomly from (let's say) A, B, C, and D. If the correct answer is C, we can bias the selection towards C by choosing randomly from A, B, C, D, and C—adding another copy of the correct answer to the possible choices will increase the likelihood that a random choice from that group will be the correct answer. A group of A, B, C, D, C, and C should produce the correct answer about half the time, since half the choices in the group are the correct answer. So every time we generate a student's answer to a question, we can add to the group of answer choices a few extra copies of the correct answer—for each question let's choose randomly between 0 and twice the number of choices, so that with four choices we'd add from 0 to 8 copies of the correct choice—and choose the student's answer randomly from that enhanced group of answer choices.

We can do this by defining a function called generate_weighted_student_answer that takes a string (one character, the correct answer) and returns a string (one character, the student answer chosen randomly from the enhanced group of alternatives as described above). Write a new function called random_students2 that's based on your random_students function but that generates each student's answer to each question by calling generate_weighted_student_answer.
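Here is one possible sketch of the enhanced-group idea, under two assumptions: that "between 0 and twice the number of choices" includes both endpoints (0 through 8 with four choices), and that the group of alternatives can be represented as a string. The seed and the 1000-answer trial at the end are only there to show the bias; they are not part of the assignment.

```python
from random import choice, randrange, seed
from string import ascii_uppercase

NUMBER_OF_CHOICES = 4

def generate_weighted_student_answer(correct: str) -> str:
    """Pick one answer from the usual alternatives plus a random
    number of extra copies of the correct answer."""
    alternatives = ascii_uppercase[:NUMBER_OF_CHOICES]        # 'ABCD'
    # Assumed inclusive range: 0 through 2 * NUMBER_OF_CHOICES extra copies
    extra_copies = randrange(2 * NUMBER_OF_CHOICES + 1)
    enhanced_group = alternatives + correct * extra_copies    # e.g. 'ABCDCC'
    return choice(enhanced_group)

seed(31)   # fixed seed only so that repeated runs behave the same way
answers = [generate_weighted_student_answer('C') for _ in range(1000)]
print('Fraction correct:', answers.count('C') / len(answers))
```

With purely random answers you'd expect about a quarter of them to be correct; the fraction printed here should be noticeably higher.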

Generate a new list of students using random_students2 and then sort it, highest total to lowest, and print the top 10 students' names. Also print the mean (average) score; it should be higher than in part (c.3).

(c.5) An unconventional way to score this exam would be to assign different weights to different questions. The instructor might assign those weights in advance, based on his or her judgement of which questions are harder or more important. But an even more unconventional way to assign the weights would be to derive them from the students' answers: The questions that were harder (i.e., that fewer people answered correctly) are worth more points than the easier ones.

One way to implement this would be to assign a number of points to each problem, that number being equal to the number of students who missed the problem. Write a function called question_weights that takes a list of Student records and returns a list of numbers, one number for each question on the test, where each number is the number of students who answered that question incorrectly. [Hint: It's helpful to attack complex data structures layer by layer. Try writing code that counts the number of wrong answers to a single question; then apply that in turn to all the questions.] Create another global constant that consists of the results of calling question_weights on your list of students from part (c.4).

Then write a function called Student_weighted_score that takes a Student record and the list of question weights and returns that Student record with its total field changed to reflect that student's score based on his or her correct answers and the corresponding question weights. Then apply Student_weighted_score to each student on your list of students from part (c.4). Finally, sort the result, highest total to lowest, and print the top 10 students' names along with the mean (average) score for all the students. These scores are likely to be much higher than the standard 0–100: In a 350-person class, a single question could be worth 350 points. We could normalize the scores by dividing each by the number of students. Do that, then re-sort and re-print the top 10 as before.
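The layer-by-layer approach in the hint could be sketched as follows, again with three made-up students so the numbers can be checked by hand. The helper wrong_count and the sample data are assumptions for this illustration; the namedtuple method _replace returns a copy of a record with the named field changed, which is one way (not the only way) to produce the updated Student.

```python
from collections import namedtuple

Student = namedtuple('Student', 'name answers scores total')

def wrong_count(students: list, q: int) -> int:
    """Count the students who missed question number q (hypothetical helper)."""
    count = 0
    for s in students:
        if s.scores[q] == 0:
            count += 1
    return count

def question_weights(students: list) -> list:
    """One weight per question: the number of students who missed it."""
    number_of_questions = len(students[0].scores)
    return [wrong_count(students, q) for q in range(number_of_questions)]

def Student_weighted_score(s: Student, weights: list) -> Student:
    """Return a copy of s whose total reflects the question weights."""
    total = 0
    for score, weight in zip(s.scores, weights):
        total += score * weight
    return s._replace(total=total)

# Made-up data, scored against a made-up answer key of 'ABD'.
students = [
    Student('Jones, Jane', 'ABC', [1, 1, 0], 2),
    Student('Smith, Sam', 'BAD', [0, 0, 1], 1),
    Student('Lee, Lou', 'ACD', [1, 0, 1], 2),
]
weights = question_weights(students)
rescored = [Student_weighted_score(s, weights) for s in students]
# Normalize by dividing each total by the number of students
normalized = [s._replace(total=s.total / len(students)) for s in rescored]
```

In this toy data the second question was missed by two students, so it carries twice the weight of the others, and Jane (who got it right) moves ahead of Lou.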

(d) Do these Python exercises:

(d.1a) Implement the function calculate_GPA that takes as input a list of strings representing letter grades and returns the grade point average (out of 4 with A=4, B=3, C=2, D=1, and F=0) computed from the list. Assume there are no plus or minus grades.

(d.1b) Implement a new function calculate_GPA2 that also computes a GPA from a list of grades. But where you (probably) used a series of if-statements the first time, this time you should use a dictionary (and no if-statements). Your dictionary should map each letter grade to the number of grade points it's worth, including plus and minus grades that are 0.3 points above and below the base letter grade.


assert calculate_GPA(['A', 'C', 'A', 'B', 'A', 'F', 'D']) == 2.5714285714285716
assert calculate_GPA2(['A', 'C', 'A', 'B', 'A', 'F', 'D']) == 2.5714285714285716
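To illustrate how a dictionary lookup can stand in for a chain of if-statements, here is a small sketch with a deliberately partial table of four grades. The names GRADE_POINTS and average_points are assumptions for this illustration; your own dictionary needs to cover every grade.

```python
# Partial table for illustration only; a full solution needs every grade.
GRADE_POINTS = {'A': 4.0, 'A-': 3.7, 'B+': 3.3, 'B': 3.0}

def average_points(grades: list) -> float:
    """Average the grade points of a non-empty list of letter grades."""
    total = 0
    for g in grades:
        total += GRADE_POINTS[g]   # the lookup replaces "if g == 'A': ..."
    return total / len(grades)

print(average_points(['A', 'B']))
```

Adding a new grade then means adding one dictionary entry rather than another branch.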

(d.2) Implement the function flatten_2D_list that takes as input a two-dimensional table (a list containing lists) and returns the input as a single list containing, in order, all the elements on the original nested list.


assert (flatten_2D_list([[1, 3, 2], [3, 5, 1], [7, 5, 1], [3, 2], [9, 4]]) ==
        [1, 3, 2, 3, 5, 1, 7, 5, 1, 3, 2, 9, 4])

(d.3a) Implement the function skip_every_third_item that takes as input a list and prints out each item on the list, except that it skips every third item (so it prints L[0] and L[1], skips L[2], prints L[3] and L[4], skips L[5], and so on).


>>> L = ['If', 'you', '432234', 'did', 'the', '9834234', 'exercise', 'correctly', '534523423', 
		 'this', 'should', '1044323', 'be', 'readable']
>>> skip_every_third_item(L)
If
you
did
the
exercise
correctly
this
should
be
readable

(d.3b) Now write skip_every_nth_item that takes as input a list and an int (call it n) and prints out each item on the list, except that it skips every nth item. Thus a call to skip_every_nth_item(L, 3) would produce the same result as skip_every_third_item(L).
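The index arithmetic behind "every nth item" can be sketched with the remainder operator. This variant returns the kept items as a list instead of printing them, which makes it easier to test; the name kept_items is an assumption for this illustration.

```python
def kept_items(items: list, n: int) -> list:
    """Return the items that survive when every nth item is skipped
    (hypothetical helper; the lab's version prints instead)."""
    kept = []
    for i in range(len(items)):
        # Counting from 1, positions n, 2n, 3n, ... are the ones to skip,
        # and those are exactly the positions where (i + 1) % n is 0.
        if (i + 1) % n != 0:
            kept.append(items[i])
    return kept

print(kept_items(['a', 'b', 'c', 'd', 'e', 'f', 'g'], 3))
```

The "+ 1" is there because list indexes start at 0 while "every third item" counts from 1.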

(d.4) We are writing an application to process a company's weekly payroll. Every time an employee "clocks out" (leaves after a day's work), the employee's name is added to a list. Thus, at the end of a week, the name of an employee who worked five days will appear five times in this list.

(d.4a) Implement the function tally_days_worked that takes as input the list described above and returns a dictionary where every key is a name of an employee and the value is the number of days that employee worked in the given week, according to the list. (Do this by processing the list, item by item; don't use the count() method.) You may use the following list for testing your code; remember that your dictionary may display its items in a different order from the one shown here:


work_week = ['Bob', 'Jane', 'Kyle', 'Larry', 'Brenda', 'Samantha', 'Bob', 
             'Kyle', 'Larry', 'Jane', 'Samantha', 'Jane', 'Jane', 'Kyle', 
             'Larry', 'Brenda', 'Samantha']

>>> workers = tally_days_worked(work_week)
>>> workers
{'Kyle': 3, 'Larry': 3, 'Bob': 2, 'Brenda': 2, 'Samantha': 3, 'Jane': 4}
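The general pattern for tallying items one at a time is sketched below on a tiny made-up list. The name tally_items is an assumption for this illustration. Comparing two dictionaries with == ignores the order of their items, which makes it a convenient way to check your own results.

```python
def tally_items(items: list) -> dict:
    """Count how many times each item appears, one item at a time."""
    counts = {}
    for item in items:
        if item in counts:
            counts[item] += 1    # seen before: add one to its count
        else:
            counts[item] = 1     # first appearance: start its count at 1
    return counts

print(tally_items(['x', 'y', 'x']))
```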

(d.4b) We can determine how much each employee earned this week if we start with: (i) the dictionary produced by tally_days_worked(), (ii) an assumption that each employee who works, works an 8-hour day [we could relax this assumption by collecting clock-in and clock-out times for each employee each day, but we'll skip that for now], and (iii) a dictionary giving each employee's hourly rate:

hourly_wages = {'Kyle': 13.50, 'Brenda': 8.50, 'Jane': 15.50, 'Bob': 30.00,
                'Samantha': 8.50, 'Larry': 8.50, 'Huey': 18.00}

Implement the function pay_employees that takes as input the two dictionaries described above and prints out how much each employee will be paid, in the format shown below.


>>> pay_employees(workers, hourly_wages)
Kyle will be paid $324.00 for 24 hours of work at $13.50 per hour.
Brenda will be paid $136.00 for 16 hours of work at $8.50 per hour.
Larry will be paid $204.00 for 24 hours of work at $8.50 per hour.
Bob will be paid $480.00 for 16 hours of work at $30.00 per hour.
Samantha will be paid $204.00 for 24 hours of work at $8.50 per hour.
Jane will be paid $496.00 for 32 hours of work at $15.50 per hour.
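One way to produce a line in this format is the string method format() with the {:.2f} specification for dollars and cents, sketched below. The names HOURS_PER_DAY and pay_line are assumptions for this illustration. Notice that the loop runs over the days-worked dictionary, so Huey (who has a wage but worked no days) produces no line, just as in the sample output.

```python
HOURS_PER_DAY = 8   # the assignment's 8-hour-day assumption

def pay_line(name: str, days: int, rate: float) -> str:
    """Build one line of payroll output (hypothetical helper)."""
    hours = days * HOURS_PER_DAY
    return '{} will be paid ${:.2f} for {} hours of work at ${:.2f} per hour.'.format(
        name, hours * rate, hours, rate)

# A subset of the data shown above
days_worked = {'Kyle': 3, 'Brenda': 2}
wages = {'Kyle': 13.50, 'Brenda': 8.50, 'Huey': 18.00}

for name in days_worked:
    print(pay_line(name, days_worked[name], wages[name]))
```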

(d.5) Implement the function reverse_dict that takes as a parameter a dictionary and returns a new dictionary with the keys and values of the original dictionary reversed:


>>> reverse_dict({'a': 'one', 'b': 'two', 'c': 'three', 'd': 'four', 'e': 'five', 'f': 'six'})
{'one': 'a', 'three': 'c', 'five': 'e', 'six': 'f', 'two': 'b', 'four': 'd'}

You may assume that both the keys and the values are unique and are immutable types (so it's possible for them to serve as either keys or values in a dictionary).
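The uniqueness assumption matters because two keys that share a value would collide in the reversed dictionary. A minimal sketch of the swap, using the items() method to visit each key-value pair, might look like this; the name swap_keys_and_values is an assumption for this illustration.

```python
def swap_keys_and_values(d: dict) -> dict:
    """Return a new dictionary mapping each value of d back to its key.
    Assumes the values are unique and immutable."""
    swapped = {}
    for key, value in d.items():
        swapped[value] = key
    return swapped

print(swap_keys_and_values({'a': 'one', 'b': 'two'}))
```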

(e) Develop a program to solve the Anteater Bed and Breakfast problem. You may start with either partner's Stage I solution from the previous lab, or you may decide to start from the beginning this week.

Develop this code in its own Python file, separate from the rest of this assignment. Pay very close attention to the instructions, especially about developing the program in incremental stages.

[If you find yourself printing menus or calling input(), for example, you are doing it wrong: You did not read and understand the problem description. This is serious; this is the big time, relatively speaking. You have to follow the methodology we've been teaching all quarter long.]

And have some perspective here: This is one part of one assignment in the course. People will probably get full credit on this assignment without completing the entire B&B program. And even one slightly lower score on one lab assignment isn't likely to have a major effect on your ultimate grade in the course. But if you turn in code with bugs, or even worse, if you turn in code that was developed through impermissible collaboration, that could be a serious problem.

(f) (optional) If you have some extra time after completing the previous parts of the lab, try one or more of the following. In fact, if you're continuing to ICS 32 you might try some of these over the break to keep your skills up.

Add a menu-style user interface to ICStunes, similar to the interface we used for the restaurants program.
Add external files to ICStunes, along the same lines as we did with the restaurants program. We didn't require this or the user interface because you've done similar things before, but they're bread-and-butter everyday programming skills, so it wouldn't hurt to have more practice.
Implement a weighted scoring scheme that's less extreme than the one in (c.5): A problem is worth 1 point if over 75% of the class answer it correctly; it's worth 2 points if over 50% (but not over 75%) answer it correctly; it's worth 3 points if over 25% (but not over 50%) answer it correctly; it's worth 4 points otherwise. Score the same list of students using this scheme.
Devise some kind of visualization (perhaps a two-dimensional plot) to show how the same student scores using the three different scoring schemes. Try to produce a graphical answer to the question, Do the same students score highly under all three schemes?
Develop an interactive interface for the Anteater BandB or other enhancements discussed in the problem writeup.
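For the less extreme weighting scheme in the list above, the thresholds translate into a short chain of comparisons. This sketch assumes the fraction of the class answering correctly has already been computed as a number from 0 to 1; the name tiered_points is an assumption for this illustration.

```python
def tiered_points(fraction_correct: float) -> int:
    """Points for a question, given the fraction of the class (0 to 1)
    that answered it correctly."""
    if fraction_correct > 0.75:
        return 1
    elif fraction_correct > 0.50:    # over 50% but not over 75%
        return 2
    elif fraction_correct > 0.25:    # over 25% but not over 50%
        return 3
    else:
        return 4

print(tiered_points(0.80), tiered_points(0.60), tiered_points(0.10))
```

Because each test uses "over", a question answered correctly by exactly 75% of the class falls into the 2-point tier.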

(g) Remember that each partner must complete a partner evaluation form and submit it individually. Do this using the partner app. Make sure you know your partner's name, first and last, so you can evaluate the right person. Please complete your evaluation by the end of the day on Friday, or Saturday morning at the latest. It only takes a couple of minutes and not doing it hurts your participation score.

What to turn in: Submit via Checkmate your lab9.py file containing your solutions to parts (c), (d), and (f), and a separate BandBX.py file with your code from part (e). Remember what we've said in previous labs about rereading the assignment and rerunning your Python files.

Also remember that each student must complete a partner evaluation form; these evaluations contribute to your class participation score.

Written by David G. Kay in Fall 2012 for ICS 31, based in part on assignments from ICS H21 and Informatics 41. Modified by David G. Kay, Fall 2013, Winter 2014, Fall 2014, Winter 2015, Fall 2015, Spring 2017. Python exercises by David Lepe, edited by David G. Kay.

David G. Kay, kay@uci.edu
Friday, December 8, 2017 7:59 AM