ICS 32 Winter 2022
Notes and Examples: Course Introduction


Course background and goals

This course is the second in a three-quarter introductory sequence, which focuses on introducing you to computer science and to programming using Python. It builds directly on your prerequisite work — our ICS 31 course or something equivalent — and emphasizes techniques used to write larger, more complex programs than you may have written previously, each in a different, realistic problem domain. At the end of the course, you should be able to write programs much larger than you could before, and you should feel like you're empowered to work on your own programs in whatever problem domains interest you, even if you haven't learned about them in class yet. While it's not as though programming will suddenly become easy for everyone, we do intend to demystify "real-world" programming to the point where, after this course, you're able to make positive progress on realistic problems of your own choosing.


Where we've been

Before we delve too far into what's on the menu for this quarter, we should probably first take a look back at where we've been previously. If you took ICS 31, this will be a good way to quickly refresh your memory on what topics were covered; if you didn't, but you took comparable coursework elsewhere, this will help you to be sure that the knowledge you've built in that prior coursework isn't missing details that will be important for you in this course. There are a number of things that I believe you will have been exposed to, and should have gained reasonably good facility with in your prior coursework.

Information vs. operation

When you write programs, you understand that you have to build a mental model of two things: the information and the operation.

The information is the data that your program will work with. What are its inputs? What intermediate results will it remember while the program runs? What are its outputs? For the kinds of information that Python knows how to represent in a built-in way, such as integers, this is simpler; for more complex information, you'll have to understand how to take the real-world information and map it into a form that Python can use.

The operation is what your program does, which you'll generally have specified by writing your own Python functions, or by calling functions that are built into Python already. As you've seen, it becomes necessary to specify the operation very precisely: The order in which you do things matters, the structure of how you organize your program matters, and, aside from syntax errors and errors raised while the program runs, Python will not tell you that your program is wrong; the only sign will be an output other than the one you expected for a given input.

Types

You understand that data has a type and that its type governs what you can and cannot do with it, as well as what you get back as a result when you do something legal.

(Throughout these notes, when demonstrating interactions with the Python shell, I'll show the shell's output using normal text, while showing what you might type using boldfaced text.)

For example, you might start with the following interaction with the Python shell, which is perfectly legal and generates a result:

>>> a = 5
>>> b = 3
>>> a - b
2

Whereas this interaction is not legal and generates an error message instead:

>>> a = 'Alex'
>>> b = 'Boo'
>>> a - b
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    a - b
TypeError: unsupported operand type(s) for -: 'str' and 'str'

These two interactions are quite similar, but they have a key difference: the types of values stored in the variables a and b. In the first case, the values both have type int, and it's legal to subtract integers from one another, in which case you get their difference as the result. In the second case, the values both have type str, and it simply doesn't make sense to subtract one string from another, so Python disallows it.

Data structures

When organizing your data, you've seen that there are a few kinds of data structures built into Python, the four most common of which are lists, tuples, sets, and dictionaries. You know how to choose among these, and that the "shape" of the problem dictates the data structure that you choose.

Some problems are "dictionary-shaped", in the sense that you have data identified by unique keys, like student information with a student ID number associated with each one. Other problems are "list-shaped", in the sense that you have a sequence of values where the order of the elements in the sequence is considered relevant. And so on; each data structure has a "shape" and problems that have the same "shape" can make good use of that data structure.
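As a rough illustration of these "shapes", here is a minimal sketch. The names, ID numbers, and variable names are invented for the example.

```python
# Dictionary-shaped: each student is identified by a unique ID number.
# (The IDs and names are made up for illustration.)
students_by_id = {
    12345678: 'Alex',
    87654321: 'Boo',
}

# List-shaped: the order of the elements is relevant, and repeats are allowed.
arrival_order = ['Boo', 'Alex', 'Boo']

# Set-shaped: only membership matters, so duplicates collapse into one.
distinct_arrivals = set(arrival_order)

# Tuple-shaped: a fixed-size grouping of related values.
point = (3, 4)

print(students_by_id[87654321])      # look up by key
print(arrival_order[0])              # look up by position
print('Alex' in distinct_arrivals)   # test for membership
print(len(distinct_arrivals))
```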

Control structures

You've built your skills with using Python's control structures, which govern how your program flows from one part to the next. You can achieve conditionality using if statements. You can achieve repetition using both while and for loops, and you know how to recognize when you might best choose one as opposed to the other. (You might also have seen that you can achieve repetition using a different technique called recursion, but I'll reintroduce that technique in this course from first principles.)
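The three control structures mentioned above might look like this in practice. This is only a sketch, and the function names are hypothetical ones chosen for the example.

```python
def describe(n):
    # Conditionality: choose one of several paths with an if statement.
    if n < 0:
        return 'negative'
    elif n == 0:
        return 'zero'
    else:
        return 'positive'


def sum_of_squares(numbers):
    # A for loop fits when there is a collection to iterate over.
    total = 0
    for n in numbers:
        total += n * n
    return total


def halvings_until_one(n):
    # A while loop fits when we repeat until a condition changes and
    # can't easily say in advance how many iterations that will take.
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count


print(describe(-3))
print(sum_of_squares([1, 2, 3]))
print(halvings_until_one(10))
```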

Functions

You can write functions, which let you organize your program into small, relatively independent units of work. You know that functions accept parameters that control how they behave, and that they always return a single result, though that result might be the built-in value None if it's a function that is primarily intended to provide some kind of side effect (e.g., printing output, writing to a file).

You've also seen that you can take more complex functions and break them into smaller, simpler ones. The main goal, one that we'll constantly aim for in this course, is for functions to have a single responsibility, and to be relatively short and simple; if you find yourself writing a function that's tens of lines long, there's a good chance that it should really be more than one function.
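Here is one way the ideas above could look in a small program: three short functions, each with a single responsibility, one of which exists only for its side effect. The function names and the data are assumptions made for the sake of the example.

```python
def average(values):
    # Computes and returns a result; has no side effects.
    return sum(values) / len(values)


def format_average(name, values):
    # Builds one line of text; relies on average() for the arithmetic.
    return '{}: {:.2f}'.format(name, average(values))


def print_report(scores_by_name):
    # Exists for its side effect (printing), so it has no return
    # statement, which means it returns None.
    for name, scores in scores_by_name.items():
        print(format_average(name, scores))


result = print_report({'Alex': [90, 85], 'Boo': [100, 99, 98]})
print(result)
```

Because the formatting lives in its own function, it can be checked without printing anything, which is part of the appeal of keeping functions small.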

Abstraction

Most importantly, you're forming an understanding of the foremost, fundamental principle that underlies computing: abstraction. Abstraction is taking complexity and hiding it beneath a veil of simplicity, allowing you to focus on the details of how something is used, while being able to ignore the details of how it's built inside.

For example, you've probably seen something like this in Python before:

>>> '{} {}'.format('Boo', 999)
'Boo 999'
>>> '{:.2f}'.format(3)
'3.00'

You've probably never seen the actual code that implements the format method, and it's not a problem that you haven't. If you know the contract between the format method and you, the user of that method — what inputs it accepts and what outputs and effects it gives as a result — that's all you need to know. How the method is written simply isn't your concern, because your only goal is to use it.

Abstraction is what allows you to write programs that are 50,000 lines of code instead of just 50, to write programs that take months or years to complete instead of just hours or days, and to write programs together with ten other people instead of only working by yourself. It's abstraction that allows you to effectively use code written by others, even if you've never looked at it and wouldn't necessarily understand it if you did. In all of these cases, the primary benefit is that no one needs to carry every detail of the entire program in their head at any one time; you can do that with a 50-line program you wrote yourself, but not with a 50,000-line program you wrote with ten other people. But, thanks to abstraction, you don't need to; as long as the contracts between each part of your program and the others are clear, you're in business.

In short, abstraction is the only reason we've been able to build the incredible collection of hardware and software that runs so many things in our world.


Where we're going

Now that we've agreed on where we've been, we should focus our attention on where we're going this quarter.

Software libraries

This course has kind of a peculiar-sounding title: Programming with Software Libraries. A software library is a collection of functions that are already implemented to solve a particular kind of problem. It's a set of operations (and maybe even some information). Ideally, a library also includes documentation that explains how to use it, e.g., what the names of the functions are, what inputs they accept, what outputs and effects they give as a result, and in what ways they might fail.

Many software libraries are provided free of charge, perhaps contingent on you following certain licensing rules (e.g., having to distribute the other party's copyright message with your program, or a prohibition on selling the program you built using the library). Others are available for purchase. But, in either case, they're about access to functionality that might otherwise be hard or costly for you to build on your own.

Even if they don't cost money, software libraries come at an intellectual price. If you want to use a library, you have to understand something about the problem domain — the kind of problem that it solves, the terminology that's used to describe aspects of that kind of problem, and the basic concepts involved in its solution. You have to be able to understand the contract between you and those libraries: what inputs you need to provide and what outputs and effects you can expect in return. Even if there's well-written documentation, you'll still have to spend time reading it, understanding it, and determining whether the library really is a good fit for solving some part of the problem you have.

Quite often, though, that price is vastly lower than the price of building the same code from scratch. You already know from your prior coursework that writing programs can be tricky. There are small-picture and big-picture details to get right. There are plenty of opportunities to make mistakes, to write code that doesn't work as you intended, to write individual pieces that work but that don't fit together the way you expected, and so on. There's a lot of value in code that's already written, especially if it's been heavily used by a wide audience, if it's been in use for a long period of time, and if it's well-documented.

So libraries can be very useful indeed. There are tradeoffs, though, when you use libraries built by others. You don't get to do everything your way. You sometimes have to adjust your thinking and your design to match what's provided by the library. You may discover that the library doesn't behave the way it's been documented, that the documentation is inadequate and doesn't tell you what you need to know, or that the code is simply buggy in places. You may find that there are vital things you need that the library doesn't provide, even if it provides some of what you need.

Python's standard library

Most programming languages include some kind of standard library, which provides a range of features that are available to any program written in that language. Python is no exception to this; it has a very large standard library, containing all kinds of interesting and useful tools:

Python's standard library provides the tools to do all of those things (and many more). We'll spend a fair amount of time in this course learning about some parts of the Python standard library, each time with a particular problem domain in mind, with a desire to solve a particular kind of problem. We'll see how to recognize which parts of a problem have solutions in the standard library, as opposed to the parts we'll have to write ourselves. And you'll build confidence that you can figure out these kinds of details on your own when you have new needs that you've never had before.
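To give a small taste of what that looks like, here is a sketch that uses three standard library modules. The modules shown (math, datetime, and json) were chosen only for illustration and are not necessarily the ones covered in this course; the dates and data are arbitrary.

```python
import json
import math
from datetime import date

# math: common mathematical functions.
hypotenuse = math.hypot(3, 4)

# datetime: calendar arithmetic, such as counting the days between two
# dates. (These two dates are arbitrary.)
days_between = (date(2022, 3, 18) - date(2022, 1, 3)).days

# json: converting between Python data and text in JSON format.
text = json.dumps({'name': 'Boo', 'age': 999})
data = json.loads(text)

print(hypotenuse)
print(days_between)
print(data['name'])
```

In each case, the contract is all we need: we import the module, call the function, and use the result, without ever reading the code behind it.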

By the end of the course, I want you to be empowered; I want "real" problems to seem accessible to you. I want you to feel that you can dream up new applications that you might like to have, find libraries that are appropriate and learn how to use them effectively, even if they're things we didn't learn about in this course. After the quarter is over, I want you to be able to write something cool that you think today is beyond your reach.

That's the goal. The sky's the limit.

Extending your Python skills

As we work on solving some interesting "real-world" problems, we're also going to build new Python programming skills. There's still a lot of Python we haven't gotten to share with you yet, additional skills to develop that will make you better Python programmers — and more capable programmers, in general.

You've seen, in previous coursework, that Python programs are built around interaction with objects. Objects have types, and their types determine what operations can be performed on them. But how do we introduce new types, new kinds of objects with new sets of operations that are specific to a particular problem we're trying to solve? You do so by writing a class. We'll do plenty of that, especially in the latter half of this course, as it's the essence of one kind of programming you can do in Python (and many other languages): object-oriented programming.
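As a preview, a class that introduces a new type might look something like the following minimal sketch. The class name TallyCounter and its methods are hypothetical, invented for this example.

```python
class TallyCounter:
    # A new type of object: a counter that can only go up by one at a time.

    def __init__(self, start=0):
        # Called when a new TallyCounter object is created.
        self._count = start

    def increment(self):
        # An operation specific to this type of object.
        self._count += 1

    def count(self):
        return self._count


c = TallyCounter()
c.increment()
c.increment()
print(c.count())
print(type(c).__name__)
```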

Functions can fail, not just because they've been written incorrectly, but because their success depends on a factor beyond their control: getting input that's in a particular format or in a particular range, needing a hard drive to be operational, requiring a wireless network connection to be active, depending on a computer in another part of the world being up and running, and so on. In many scenarios, we should be able to anticipate and plan for cases like these, rather than just letting our programs crash as soon as something goes wrong. To account for these kinds of scenarios, we'll learn more about exceptions in Python: how to handle them when we can do something sensible about them, and how to raise them when a function we've written encounters a failure.
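Both halves of that idea, raising an exception when a function fails and handling one when something sensible can be done, can be sketched as follows. The function names are assumptions for the example; int and ValueError are built into Python.

```python
def parse_age(text):
    # int() raises a ValueError on its own if text isn't a whole number.
    age = int(text)

    # We raise an exception ourselves when the input is a number but
    # falls outside the range we're willing to accept.
    if age < 0:
        raise ValueError('age cannot be negative: {}'.format(age))

    return age


def parse_age_or_default(text, default):
    # Handle the failure here, because there is something sensible
    # to do about it: fall back to a default value.
    try:
        return parse_age(text)
    except ValueError:
        return default


print(parse_age_or_default('42', 0))
print(parse_age_or_default('abc', 0))
print(parse_age_or_default('-5', 0))
```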

As the programs you write get larger, it becomes more important to consider how to organize not just individual functions, but also how to organize collections of related functions into separate modules. When a program reaches a certain length, it no longer makes sense for it to be written in one giant module (i.e., one giant .py file). We'll explore techniques for managing multiple modules, and how to decide what belongs in each one.

Another thing that becomes more difficult as programs get larger is testing. How do you know if your program is correct? How do you ensure that changes to one part of your program don't render other parts incorrect, even if they used to work fine? Of course, there are ad hoc ways to try to solve this problem — repeatedly running your program and trying out various aspects of its functionality — but this is boring, repetitive work. People don't excel at boring, repetitive work, but computers do; we should seek to automate this kind of thing. We'll explore how to do that a little later in the quarter.
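One way to automate that kind of checking is the unittest module in Python's standard library; whether it is the tool used later in this course is an assumption on my part. The function is_leap_year below is a hypothetical example of something we might want to test.

```python
import unittest


def is_leap_year(year):
    # Divisible by 4, except century years, unless divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class IsLeapYearTests(unittest.TestCase):
    def test_years_divisible_by_4_are_leap_years(self):
        self.assertTrue(is_leap_year(2020))

    def test_most_century_years_are_not_leap_years(self):
        self.assertFalse(is_leap_year(1900))

    def test_years_divisible_by_400_are_leap_years(self):
        self.assertTrue(is_leap_year(2000))


# Run the tests from within the program. (More commonly, tests are run
# from the command line with: python -m unittest)
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsLeapYearTests)
result = unittest.TextTestRunner().run(suite)
```

Once tests like these exist, rerunning all of them after every change costs almost nothing, which is exactly the boring, repetitive work a computer is good at.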


A word of warning

Before we delve into the details, there are a couple of things I should warn you about.

First of all, the problems I'll be asking you to solve this quarter are quite a lot larger than the ones you solved in previous coursework. There are techniques for managing that complexity, and we'll talk about them early and often. But you'll need to be aware that there is a new set of skills, namely design skills, that you'll need to begin developing in order to succeed in this course. While we'll certainly guide you along the way, we'll gradually expect that you're able to decide on your own what components you need, how they should interact with one another, and what each one should do. As we'll see, taking the time to divide a solution into smaller parts, then to isolate those parts from one another as much as possible — so changes to one don't require a cascading set of changes to others — is paramount.

As we delve into libraries, I should warn you that there is almost no one who has every detail of every library call memorized. In fact, I'd argue that you shouldn't even want that level of memorization, as it serves little purpose. With an Internet connection, you can look up the details you need as you need them. The important thing is to have a good understanding of what you're looking for, so you can formulate a search query that will help you find it, and so you can recognize when you have found it. You'll get stumped along the way, your TAs will get stumped, your tutors will get stumped, and I will, too. Getting started early, getting questions asked when you have them, and giving yourself time to put your work aside and let yourself turn it over in the back of your mind a few times are going to be paramount.


Course organization and logistics

This course web site describes the logistical details of how this course is going to be run. Particularly, be sure that you read through the Course Reference and the front page of the Project Guide, so you will know how this course operates, and how you'll be doing and submitting your work.