ICS 45C Spring 2022, Notes and Examples: Contracts and Exceptions

Background

When we write functions or classes in C++, there are two things we need to think about: the actual code we're writing and the assumptions that underlie that code. Some programming languages allow or even require us to explicitly state more of these assumptions in our code than others. For example, C++ requires us to specify types in a lot of places and checks them at compile time, while Python lets us leave them out and fails at run time when we don't meet our own implicit assumptions.

But the assumptions are there whether we've stated them explicitly or not, and not all of the assumptions can be stated formally, even in a language like C++ where types are specified. A function that calculates a square root can probably only be given numeric input, and perhaps that numeric input would need to be non-negative in order to get a real result. In Python, you might start writing the function this way:

def sqrt(n):
    '''Returns the square root of n, which must be a non-negative number'''

with no type information specified formally, but a docstring that makes the unstated assumptions clear. Meanwhile, in C++, we might start it this way instead:

// Returns the square root of n, which must be non-negative
double sqrt(double n)

with a slightly shorter comment, since the signature of the function makes explicit the assumption that the input and output have type double, but nonetheless we still have to document the assumption that n is non-negative, since the type system does not allow us to encode that assumption directly. (Note that it would be possible to state this assumption for an integer type, since we could use unsigned int, but there is no analogous unsigned floating-point type.)

One of the things that separates better programmers from lesser ones is that better programmers think about the things they're not writing explicitly in code, in addition to the things that they are. They also document the unwritten assumptions when they aren't obvious or, better still, make it impossible for users of a function to do the wrong thing by finding ways to make the assumptions explicit instead. There's value in deciding on good names for the functions you write, but a good name is not necessarily enough; if you can't explain to yourself what kinds of input a function can take, what kinds of outputs it gives for various inputs, and in what ways it might fail, you haven't thought the function's design all the way through.

Below are details about what kinds of things you ought to be thinking about when you design a function or a class, focusing at least partly on what to do about the fact that functions can fail.

Contracts

To formalize what it means to "think a function's design all the way through," we can say that functions (including member functions of classes) have a contract associated with them. Part of that contract is specified formally as the function's signature; in C++, that signature lists the name of the function, the names and types of its parameters, and the type of its result. But a contract is more than a signature; to understand a function's contract, you have to finish your thought about its design. A function's contract consists, additionally, of (at least) the following:

Preconditions, which are the things that must be true before you can call the function and see it succeed. Some of these preconditions might constrain the values of its parameters (e.g., the sqrt function might require its parameter to be non-negative). Some of them might require other things to be true (e.g., other functions have been called successfully or, in the case of member functions, the object's member variables have to have certain values already).
Postconditions, which are the things that will be true once the function has completed successfully, assuming that the preconditions were all true. In the case of the sqrt function, we would expect to get a return value of type double if the preconditions pass; we would also expect that if we squared that value, we would get a result very close to the parameter's value (allowing some tolerance for the imprecision of doubles).
Side effects, which are the things that happen other than the function computing and returning a result. Examples include dynamically allocating or deallocating memory, writing output to the console or to a file, establishing a connection across a network, popping up a GUI window, and so on.

Classes, too, have a contract associated with them. As with functions' contracts, the contract of a class is partly made up of what's written in the class declaration, as well as the contract associated with each member function. But, additionally, there is one other thing that is included:

Class invariants, which are the things that must always be true about the members of a class after each member function has completed successfully, above and beyond what's specified in the class declaration.

The ArrayList class that I wrote in the Well-Behaved Classes example has a couple of invariants, beyond just the types specified in the declaration:

An ArrayList's capacity will always be at least as large as its size.
The capacity of an ArrayList will always match the number of elements in its underlying array.

Note that there's one other thing we could say: An ArrayList's size and capacity must always be non-negative. But since we declared them as unsigned ints, this is clear already.
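One way to keep invariants like these honest is to check them at run time while developing. The sketch below is only an illustration under stated assumptions: SimpleArrayList is a hypothetical, simplified stand-in (not the ArrayList from the Well-Behaved Classes example), and the invariantsHold helper is a name invented here. Note that only some invariants can be checked this way; a raw pointer doesn't record how many elements were allocated, so the "capacity matches the underlying array" invariant still has to be maintained by careful reasoning.

```cpp
#include <cassert>
#include <string>
#include <utility>

// A hypothetical, simplified stand-in for ArrayList. The member names
// follow these notes, but everything else here is an assumption.
class SimpleArrayList
{
public:
    SimpleArrayList()
        : items{new std::string[initialCapacity]}, sz{0}, cap{initialCapacity}
    {
        assert(invariantsHold());
    }

    ~SimpleArrayList()
    {
        delete[] items;
    }

    // Copying is disabled only to keep the sketch short.
    SimpleArrayList(const SimpleArrayList&) = delete;
    SimpleArrayList& operator=(const SimpleArrayList&) = delete;

    void add(const std::string& s)
    {
        if (sz == cap)
        {
            grow();
        }

        items[sz] = s;
        ++sz;

        // Re-check the invariants just before returning successfully.
        assert(invariantsHold());
    }

    unsigned int size() const noexcept { return sz; }
    unsigned int capacity() const noexcept { return cap; }

private:
    static constexpr unsigned int initialCapacity = 2;

    // Only the invariants that can be checked at run time appear here.
    bool invariantsHold() const noexcept
    {
        return items != nullptr && cap >= sz;
    }

    void grow()
    {
        unsigned int newCap = cap * 2;
        std::string* newItems = new std::string[newCap];

        // Moving a std::string cannot throw, so nothing can fail between
        // allocating the new array and releasing the old one.
        for (unsigned int i = 0; i < sz; ++i)
        {
            newItems[i] = std::move(items[i]);
        }

        delete[] items;
        items = newItems;
        cap = newCap;
    }

    std::string* items;
    unsigned int sz;
    unsigned int cap;
};
```

Because assert is compiled out when NDEBUG is defined, checks like these cost nothing in a release build, but they also offer no protection there.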

A proposed syntax for implementing contracts

Sadly, while we can reason about preconditions, postconditions, and class invariants, there is no syntax in C++ that allows you to write them as part of your program. It should be noted, though, that there have been proposals to add such a syntax to the language, and it's entirely possible that this ability will be added to C++ someday in the future. There would be two benefits if we could encode this information in C++, rather than in comments:

Specifying things like preconditions, postconditions, and class invariants in code instead of in a natural language like English would leave them less open to the ambiguity of natural language, which would improve a human reader's understanding of a program.
Being able to write actual C++ code that describes preconditions, postconditions, and class invariants would allow a compiler to perform some checking at compile time — though, in reality, there would be relatively few such violations that a compiler could reasonably catch — and also to (optionally) generate code that performs the checks at run-time. For example, a call to a function where the preconditions aren't met could simply cause the function to fail with a meaningful error message, rather than trying to muddle through with bad input.

One recent proposal that was considered for C++20 (but ultimately rejected) was numbered P0542; proposals to change the C++ standard are numbered. It suggests a syntax roughly like this, though I've taken some liberties with it. Our sqrt function might be declared this way:

double sqrt(double n)
    [[expects: n >= 0.0]]
    [[ensures: sqrt(n) >= 0.0 && std::abs(sqrt(n) * sqrt(n) - n) < 0.0001]];

Notice that we've explicitly specified that the parameter n has to be non-negative; the expects attribute is suggested as a way to specify preconditions. Meanwhile, we've specified that the result of calling the function, assuming that the preconditions are met, will be non-negative, and that the square of the result will be very nearly equal to the parameter n; the ensures attribute is intended as a way to specify postconditions.
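Until a syntax like that exists, one common approximation is to write the same conditions as assertions inside the function body. The following is a minimal sketch, not a substitute for the proposal: the name squareRoot is hypothetical (chosen to avoid clashing with std::sqrt), and the absolute tolerance of 0.0001 is carried over from the declaration above, even though a relative tolerance would be more appropriate for very large values of n.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical name, chosen to avoid clashing with std::sqrt.
double squareRoot(double n)
{
    // Precondition (the role played by "expects" in the proposal).
    assert(n >= 0.0);

    double result = std::sqrt(n);

    // Postconditions (the role played by "ensures" in the proposal).
    assert(result >= 0.0);
    assert(std::abs(result * result - n) < 0.0001);

    return result;
}
```

Unlike the proposed attributes, these checks are invisible in the function's declaration, so callers still need the comment that documents the precondition.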

The proposal includes nothing specifically about class invariants, but you could certainly imagine such a syntax. For example, our ArrayList class might be declared this way:

class ArrayList
{
    // ...

private:
    std::string* items;
    unsigned int sz;
    unsigned int cap;
}
[[ensures: items != nullptr && cap >= sz]];

where an ensures attribute on the class could be interpreted as a class invariant. (I chose the ensures syntax because a class invariant can be thought of as a postcondition of all member functions.)

Of course, it's not possible to write these kinds of things today in C++, but even though there is no syntax to support it, we have to be thinking about these things either way. Whether your language lets you encode these things explicitly or not, functions have preconditions, postconditions, and side effects; classes have invariants. Part of how we understand our own designs (and each other's) is to understand these preconditions, postconditions, side effects, and invariants.

Exceptions

In any programming language, when you call a function (or the equivalent), you're asking that function to do a job for you, given a set of parameters that configure that job. There are two possible outcomes:

The function succeeds and returns some kind of result and/or has some kind of side effect.
The function fails to complete its job, in which case it needs to inform you of the failure in some way.

There are different mechanisms for reporting failure in different programming languages, and there is not steadfast agreement in the programming language design community about how best to approach this problem, but one common approach — which appears not just in C++, but in a number of other programming languages, as well — is called an exception.

In C++, the notion of failure is handled separately from success; functions that fail don't return a value at all, but instead throw an exception. (If you've previously programmed in Python, this idea will sound familiar, as it is more or less the same as raising an exception in Python.) How you handle an exception is entirely different from how you handle the return value of a function. The basic model works this way:

Functions either return successfully (and give back a return value, unless their return type is void) or they fail. When they fail, they do so by throwing an exception. The function has no return value in this case; failure is a completely separate mechanism.
C++ guarantees that any local variables in a function that have been initialized will be destroyed when a function ends by throwing an exception. So one large part of ensuring that exceptions don't cause memory or other resource leaks is to favor static allocation over dynamic allocation. If the local variables are pointers, the pointers themselves will be destroyed like any other local variables, but the objects they point to will not be deleted.
When a function fails with an exception, its caller will either handle the exception by catching it, or it will fail, too, in which case its caller's caller will have the same choice, and so on. If every function on the call stack, including main(), fails to catch the exception, the program will terminate. (We don't generally want programs to crash, but note that, as a practical matter, there are many fewer places where you'll catch exceptions than throw them; it's much more often the case that the failure of a function also implies the failure of its caller, though there are usually some places in a program where handling the exception can be done appropriately.)
Exceptions are objects and, in C++, they do not have to be particularly special. Any kind of object, including those of the built-in types like int or a pointer, can be thrown. By and large, we don't end up throwing built-in values like that very often, though; better to throw objects that capture the notion of what kind of failure happened and, if necessary, carry more information about the failure with them. The C++ Standard Library even provides base classes such as std::exception that you can inherit from, which can give all exceptions a handful of shared characteristics, such as a (C-style) string containing an error message.
Exceptions are caught on the basis of their type. A function doesn't just say "I know how to handle any error!" Instead, it says "I know how to handle this particular kind of error!" This is why it's important to give an exception a meaningful type; the type is how we distinguish one kind of failure from another.
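The model above can be sketched in a few lines. Everything named here is hypothetical and chosen only for illustration: NegativeInputError is an exception type derived from std::runtime_error, Tracer is a small class that counts how many of its objects have been destroyed, and checkedHalf, middle, and describe form a three-function call chain in which only the outermost function catches anything.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type. Inheriting from std::runtime_error gives
// it a what() member function that returns the error message.
class NegativeInputError : public std::runtime_error
{
public:
    NegativeInputError()
        : std::runtime_error{"input must be non-negative"}
    {
    }
};

// Counts destructions, so we can observe local variables being destroyed.
class Tracer
{
public:
    explicit Tracer(int& counter) : destroyedCount(counter) {}
    ~Tracer() { ++destroyedCount; }

private:
    int& destroyedCount;
};

double checkedHalf(double n, int& destroyed)
{
    Tracer t{destroyed};

    if (n < 0.0)
    {
        // No value is returned; t is destroyed as the function fails.
        throw NegativeInputError{};
    }

    return n / 2.0;
}

// Catches nothing, so a failure in checkedHalf() is a failure here, too.
double middle(double n, int& destroyed)
{
    Tracer t{destroyed};
    return checkedHalf(n, destroyed) + 1.0;
}

// Handles one particular kind of failure, selected by its type.
std::string describe(double n, int& destroyed)
{
    try
    {
        return std::to_string(middle(n, destroyed));
    }
    catch (NegativeInputError& e)
    {
        return std::string{"failed: "} + e.what();
    }
}
```

Calling describe with a negative number leaves the counter at 2, the same as a successful call: both Tracer objects are destroyed whether the functions return normally or fail.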

Exception syntax

Syntactically, there are two things you will need to be able to do to use exceptions in a C++ program: throw them and catch them.

Throwing an exception is best done by simply creating an object of the exception's type and using it as the argument in a throw statement. For example, if you had a class called MyException that had a default constructor, we could throw a MyException this way:

throw MyException{};

Note that we're best off allocating the object statically. If we allocated it dynamically using the new operator, what we'd actually be throwing is a pointer to that object, and, even worse, we'd also be obligating the code that handles the exception to delete it.

Catching an exception is done by specifying a block of code in which you expect an exception may occur, along with an indication of what should happen if it does. This is done with a try block, which, structurally, looks like this:

try
{
    functionThatThrowsMyExceptionSometimes();
}
catch (MyException&)
{
    std::cout << "Doh!" << std::endl;
}

(It should be noted that "try block" is technically the name used to describe the whole construct, including the catch handlers, though some people also sometimes use that to refer only to the area within the try; for clarity, I'll call that the "try part".)
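For reference, here is one way the fragments above might fit together into something that compiles. This is a sketch under assumptions: the empty MyException class, the shouldFail parameter, and the attempt function are all invented here, and the handler returns its message instead of printing it so that the result is easy to inspect.

```cpp
#include <string>

// Hypothetical exception type; any class type can be thrown.
class MyException
{
};

// The shouldFail parameter is an assumption that makes the failure
// controllable for the sake of the example.
void functionThatThrowsMyExceptionSometimes(bool shouldFail)
{
    if (shouldFail)
    {
        throw MyException{};
    }
}

std::string attempt(bool shouldFail)
{
    try
    {
        functionThatThrowsMyExceptionSometimes(shouldFail);
        return "ok";
    }
    catch (MyException&)
    {
        return "Doh!";
    }
}
```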

A try block is executed as follows:

The try part is executed first. If no exceptions are thrown, that's it; it completes normally, and the catch handlers are skipped.
If an exception is thrown within the try part, the try part is abandoned immediately. Any of its local variables that have been initialized will be destroyed, just like when we exit any other scope. The catch handlers are then interrogated one by one, in the order specified, until a match is found; there can be as many handlers as you'd like, differentiated by type. A match is made on the basis of type compatibility, much like the arguments to a function, though without most implicit conversions (e.g., a thrown int will not match a handler for double, but a thrown object of a derived class will match a handler for a reference to its base class). The first match found is executed, after which point the exception is considered handled and control moves just beyond the end of the entire try block.
- Note that exceptions are objects, which means you can interact with them (e.g., by calling member functions or using operators on them). If you need to interact with an exception object, you can give it a name in your catch declaration, then use that name within the catch block whenever you want to refer to the exception that was caught. For example, if the MyException class had a reason() member function that returns a std::string, you could do something like this:
```
catch (MyException& e)
{
    std::cout << e.reason() << std::endl;
}
```
If an exception is thrown in a catch, it's just like any other exception being thrown. Unless there is a try block inside the catch, the exception will be handled by a surrounding try block, or the call stack will be unwound as usual.
Exceptions can be re-thrown in a catch by simply issuing the statement throw;, which is a way to say "I did some work, but this exception isn't completely handled yet; I just wanted to clean some things up before I failed, too." This is actually not all that uncommon, especially in code that has to do a lot of manual clean-up; this technique is sometimes called catch-and-rethrow. (Simplifying this is one of the reasons we should want to do less of that kind of clean-up in the first place; we'll have more to say about that a little later in this course.)
It is possible to write a catch handler that is capable of catching anything. You would write it this way:
```
catch (...)
{
    // Do anything you'd like here
}
```
It's important to note, though, that you will not have access to the exception object itself in the catch handler, because it hasn't been given a name, and because there is no single type that encompasses all possible exceptions that might be thrown. This kind of handler is primarily of value when using the "catch-and-rethrow" technique:

catch (...)
{
    // Do cleanup because of the exception, then rethrow it to be handled by the caller

    throw;
}
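A fuller sketch of catch-and-rethrow follows, using a dynamically allocated array as the thing that needs cleaning up. The functions fill, process, and run are hypothetical, as is the cleanedUp flag, which exists only so we can observe that the handler ran.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper that fails when asked to.
void fill(int* buffer, unsigned int size, bool shouldFail)
{
    if (shouldFail)
    {
        throw std::runtime_error{"fill failed"};
    }

    for (unsigned int i = 0; i < size; ++i)
    {
        buffer[i] = 0;
    }
}

// Sets cleanedUp to true when the clean-up code in the handler runs.
void process(bool shouldFail, bool& cleanedUp)
{
    int* buffer = new int[10];

    try
    {
        fill(buffer, 10, shouldFail);
    }
    catch (...)
    {
        // We don't know (or care) what was thrown. We only need to
        // release the array, then let the same exception continue on.
        delete[] buffer;
        cleanedUp = true;
        throw;
    }

    delete[] buffer;
}

std::string run(bool shouldFail)
{
    bool cleanedUp = false;

    try
    {
        process(shouldFail, cleanedUp);
        return "succeeded";
    }
    catch (std::runtime_error& e)
    {
        return std::string{e.what()}
            + (cleanedUp ? " (after cleanup)" : " (no cleanup)");
    }
}
```

The exception that reaches run is the original std::runtime_error, message intact, even though the handler in process never knew its type.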

You may have noticed that, in the examples above, exceptions are being caught by reference. This is the typical practice in C++, as it addresses two problems:

It avoids copies of the exception being made, which can be a performance drag if those copies are expensive to make.
More importantly, it allows us to catch exceptions of derived types without them being "sliced." It's not uncommon for exception types to be specified as an inheritance hierarchy, with less specific types of exceptions acting as the base classes for more specific ones (e.g., something less specific like IOException being the base class for a more specific FileNotFoundException) so the ability to catch a broader type of exception, but still have it behave polymorphically if you call a virtual function on it, is very useful indeed.
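The slicing problem is easiest to see side by side. In this sketch, the two exception classes borrow the names used above, but their definitions, the describe member function, and the openFile function are assumptions made for the example. (Some compilers will warn about the by-value handler, which is rather the point.)

```cpp
#include <string>

class IOException
{
public:
    virtual ~IOException() = default;
    virtual std::string describe() const { return "I/O error"; }
};

class FileNotFoundException : public IOException
{
public:
    std::string describe() const override { return "file not found"; }
};

// Hypothetical function that always fails with the more specific type.
void openFile()
{
    throw FileNotFoundException{};
}

std::string catchByReference()
{
    try
    {
        openFile();
    }
    catch (IOException& e)
    {
        // e refers to the FileNotFoundException that was thrown,
        // so the virtual call behaves polymorphically.
        return e.describe();
    }

    return "no exception";
}

std::string catchByValue()
{
    try
    {
        openFile();
    }
    catch (IOException e)
    {
        // e is a copy of only the IOException part of the object;
        // the derived part has been sliced away.
        return e.describe();
    }

    return "no exception";
}
```

Here catchByReference yields "file not found", while catchByValue yields "I/O error", even though the same exception was thrown in both cases.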

Some people also advocate catching exceptions by const reference, like this:

catch (const MyException& e)

which makes clear that you won't be modifying the exception within the catch handler. I haven't picked up that style myself, but I can see at least some benefit to it.

Why destructors should never throw exceptions, and why that makes our life simpler

As we've seen, throwing an exception sets into motion a sequence of events that includes the destruction of a potentially large number of variables — every local variable in the entire chain of functions that have failed, up until we reach a function where the exception is caught. The destruction of these variables is fully automatic and doesn't otherwise affect the process of exception propagation; it's simply a consequence of the natural unwinding of the stack.

But this does bring up an interesting question. What happens if an exception is thrown, a local variable is destroyed automatically (which means that its destructor will be called), and its destructor throws an exception? We now have two exceptions that have occurred: the one we were propagating before, as well as the additional one thrown by some destructor in the process of propagating the first one. While the details here can be subtle, the usual outcome in this case is that the program terminates immediately.

What's more, even when destructors are called because a function has exited normally and its local variables are destroyed, an exception thrown from a destructor can still cause a program crash. Unless you take special care to say otherwise, destructors in C++ have the property that they are noexcept, which means they are not permitted to throw exceptions, and that throwing one will cause the program to terminate immediately. While it's possible to turn off this default, there's rarely a reason to do so; destructors that throw exceptions would still be problematic in the case when they throw exceptions while unwinding the stack due to the propagation of another exception.

All in all, the best advice is that we should never throw exceptions in destructors, because destructors can be called in circumstances (such as the stack unwinding that happens while exceptions propagate) that will cause a program to crash immediately if the destructor throws an exception. Designing our classes so that destruction cannot fail turns out to be paramount.
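When a destructor has to do something that might fail, one approach is to catch the failure inside the destructor so that it can never escape. The sketch below assumes a hypothetical Journal class whose destructor calls a hypothetical flushToDisk function; the failure flag exists only so the outcome can be observed. Whether silently swallowing a failure is acceptable depends on the class, so treat this as one option rather than a rule.

```cpp
#include <stdexcept>
#include <type_traits>

// Hypothetical clean-up step that can fail.
void flushToDisk(bool shouldFail)
{
    if (shouldFail)
    {
        throw std::runtime_error{"flush failed"};
    }
}

class Journal
{
public:
    Journal(bool shouldFail, bool& flag)
        : failOnFlush{shouldFail}, failureFlag(flag)
    {
    }

    // Implicitly noexcept, like any destructor unless we say otherwise.
    ~Journal()
    {
        try
        {
            flushToDisk(failOnFlush);
        }
        catch (...)
        {
            // Make a note of the failure, but never let it escape.
            failureFlag = true;
        }
    }

private:
    bool failOnFlush;
    bool& failureFlag;
};

// The compiler agrees that destroying a Journal cannot throw.
static_assert(std::is_nothrow_destructible<Journal>::value,
              "destructors are noexcept by default");

// Creates and destroys a Journal, reporting whether its flush failed.
bool destroyJournal(bool shouldFail)
{
    bool flushFailed = false;

    {
        Journal j{shouldFail, flushFailed};
    }

    return flushFailed;
}
```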

This fact makes our life simpler because we can make a basic assumption: If we use the delete or delete[] operators, it's safe to assume that they won't throw an exception. This can make it easier to design code that handles exceptions correctly and safely.

Exception safety

If you've programmed in languages like Python or Java before, a lot of this will look quite familiar; the keywords might be different (e.g., Python has a try..except statement), but the ideas are pretty similar. However, exceptions in C++ are a little thornier than they are in a lot of languages, because C++ requires manual management of resources (e.g., memory, open files, and so on). More care needs to be taken to ensure that, for example, an exception being thrown does not cause a memory leak or leave a dangling pointer in its wake.

Furthermore, once we start thinking in terms of contracts, we realize that even less complex languages like Python or Java still require us to be sure that we don't break contracts when we throw exceptions. Do the appropriate side effects happen (or are they avoided) when a function throws an exception midway through its execution? If a member function of a class fails before it ends normally, are all class invariants still preserved? We say that exception safety is the principle of ensuring that we have reasonable outcomes in cases when exceptions are thrown.

This can be an overwhelming thing to think about if you've never considered it before, but, particularly in the case of member functions of classes, the C++ Standard Library provides a good mental model of how to think about these kinds of issues. Member functions of classes like std::vector are documented to make one of the following four guarantees about what happens when an exception is thrown. (The guarantees get progressively stronger as you read further down the list.)

No guarantee, which means that if a member function throws an exception, all bets are off. Class invariants may no longer hold, memory may have leaked or been corrupted arbitrarily, and so on. It may be a good idea for the program to attempt the most graceful exit possible at this point, since there is no way to know the extent of the damage that might have been done.
The basic guarantee, which means that if a member function throws an exception, the object that the member function was called on will be left in a consistent state (i.e., all of its class invariants will be met), though the object may have been altered. Furthermore, if an exception is thrown, memory and other resources will not have leaked.
The strong guarantee, which means that if a member function throws an exception, the object's state will be identical to what it was before the function was called. (This is also sometimes called the rollback guarantee.) As with the basic guarantee, memory and other resources will not have leaked. In general, this is a better guarantee to make than the basic guarantee, but is sometimes impractical because of cost.
The nothrow guarantee, which means that the member function guarantees that it can never throw an exception. It is unavoidable for some functions to throw exceptions sometimes — e.g., any function that dynamically allocates memory might throw a std::bad_alloc if that allocation fails — but plenty of functions (like the ArrayList::size member function in the ArrayList example from the Well-Behaved Classes notes) can be written in a way that guarantees they won't.
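To make two of these guarantees concrete, here is a minimal sketch of a list class whose size member function offers the nothrow guarantee and whose add member function aims for the strong guarantee. All of the names are hypothetical: Item is an element type whose copying can be made to fail on demand (standing in for any copy that might throw), and ItemList is a cut-down list written for this example, not the ArrayList from the Well-Behaved Classes notes.

```cpp
#include <stdexcept>

// Hypothetical element type whose copying can be made to fail on demand.
struct Item
{
    int value = 0;
    static bool failCopies;

    Item() = default;
    explicit Item(int v) : value{v} {}

    Item& operator=(const Item& other)
    {
        if (failCopies)
        {
            throw std::runtime_error{"copy failed"};
        }

        value = other.value;
        return *this;
    }
};

bool Item::failCopies = false;

class ItemList
{
public:
    ItemList() : items{new Item[2]}, sz{0}, cap{2} {}
    ~ItemList() { delete[] items; }

    ItemList(const ItemList&) = delete;
    ItemList& operator=(const ItemList&) = delete;

    // Nothrow guarantee: this only reads a member variable.
    unsigned int size() const noexcept { return sz; }

    // Precondition: index < size()
    int valueAt(unsigned int index) const { return items[index].value; }

    // Strong guarantee: the work that might throw is done on the side,
    // and the object is modified only by steps that cannot throw.
    void add(const Item& item)
    {
        if (sz == cap)
        {
            unsigned int newCap = cap * 2;
            Item* newItems = new Item[newCap];

            try
            {
                for (unsigned int i = 0; i < sz; ++i)
                {
                    newItems[i] = items[i];
                }

                newItems[sz] = item;
            }
            catch (...)
            {
                // Nothing leaks, and this object has not been touched.
                delete[] newItems;
                throw;
            }

            delete[] items;
            items = newItems;
            cap = newCap;
        }
        else
        {
            // If this throws, sz is unchanged, so the list's visible
            // contents are the same as they were before the call.
            items[sz] = item;
        }

        ++sz;
    }

private:
    Item* items;
    unsigned int sz;
    unsigned int cap;
};
```

If a copy fails while the array is being enlarged, the caller sees an exception and a list that still holds exactly the elements it held before.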

Ideally, our goal should be to provide the strongest of these guarantees that we can, provided that it's possible and that the cost doesn't outweigh the benefit.

The nothrow guarantee is clearly not something we can provide all the time. A lot of functions, even perfectly-written ones, throw exceptions in some cases, because, for example, they may rely on external resources like heap memory, files, networks, and so on.
The strong guarantee can be an expensive guarantee to provide, sometimes resulting in an operation that has different complexity characteristics (e.g., O(n) instead of O(1) running time, or O(n) instead of O(1) memory). In a case like that, it's usually better not to provide the guarantee, but instead to design a way that the guarantee can be obtained when needed (e.g., by copying a data structure, modifying the copy, then swapping it back into place with the original one if everything succeeded).
We should always endeavor to provide the basic guarantee, though. There are few excuses for giving up on this one, because the alternative is unpredictable or undefined behavior or even program crashes in the event that an exception is thrown. Exceptions happen, whether we like it or not, so best not to have them mushroom into much bigger problems that are more difficult to diagnose and fix.
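The copy-modify-swap technique mentioned above can be sketched with std::vector. Both function names here are hypothetical. The first function offers only the basic guarantee, because a failure partway through leaves earlier elements already changed; the second obtains the strong guarantee on top of it, at the cost of O(n) extra memory for the copy.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Basic guarantee: if a negative value is found partway through, the
// vector is still valid, but earlier elements have been replaced.
void takeSquareRoots(std::vector<double>& values)
{
    for (double& v : values)
    {
        if (v < 0.0)
        {
            throw std::invalid_argument{"negative value"};
        }

        v = std::sqrt(v);
    }
}

// Strong guarantee: either every element is replaced or none are.
void takeSquareRootsOrLeaveUnchanged(std::vector<double>& values)
{
    // Making the copy may throw, but values is untouched if it does.
    std::vector<double> copy = values;

    // This may throw, too, but only the copy is affected.
    takeSquareRoots(copy);

    // Swapping two vectors cannot throw, so this step always succeeds.
    values.swap(copy);
}
```

A caller who doesn't need rollback can use the cheaper function, which is the point of offering the stronger guarantee separately rather than building its cost into every call.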