ICS 45C Spring 2022
Notes and Examples: Separate Compilation

Includes a code example with the moniker SeparateCompilation


Background

For the most part, programs in almost any programming language are written as text in individual files. In lecture, we've seen programs written entirely in a single file called main.cpp; we call these .cpp files source files. As in any language whose programs are written in files, once a C++ program grows beyond a certain size, it becomes important to be able to split it into separate source files of a practical size, rather than writing it as one giant source file. The mechanisms provided for this in C++ take more painstaking effort than what you might have seen in other programming languages: you need to arrange your code more carefully and understand more precisely how the compiler works.

Part of the reason C++ has a clunkier mechanism for tying together the code in separate source files is historical baggage. In particular, C++ compilers compile only a single source file at a time, with no visibility into other source files, a reality motivated largely by a distant past in which available memory and processing power were orders of magnitude smaller than they are today. However, C++ compilers are also required to do static type checking, meaning that they verify that every use of a variable, function, etc., involves the correct types (e.g., only a value compatible with an integer can ever be stored in an int variable). Where it gets tricky is when the compiler needs to verify the use of something in one source file that's defined in another one, a problem you'll encounter almost immediately once you write a program with multiple source files. To understand the mechanism for handling this problem, we need a slightly deeper understanding of how a C++ compiler works.

Declarations and definitions

Broadly, C++ programs are built out of two things: declarations and definitions. Recall the distinction:

Compilation and linking

Unlike in some programming languages, a C++ program typically needs to be "built" (i.e., an executable version of the program needs to be constructed) before it can be executed. The process of building a C++ program occurs, broadly, in two phases: compilation and linking.

Understanding these steps leads us to three rules that we'll need to follow:

As you might expect, a pattern evolved in C++ for dealing with these three rules. If you follow the pattern, none of these three rules will be violated.

Source and header files

To divide our C++ programs into what you might call "modules" — separate groupings of closely-related functions, classes, etc. — we write our code in two kinds of files: source files and header files.

Splitting our code up this way might seem like a bit of a burden, but the necessity arises from the rules we specified above.

Naming conventions for source and header files

In this course, when we write source files, their names will end in .cpp, while header files will have names that end in .hpp. Note that this is not the only convention in popular use in the world — unlike some programming languages, C++ compilers aren't especially finicky about file naming — but we'll use it, because it (a) establishes a clean distinction between header and source files, and (b) makes clear (with the "pp" on the end, a filesystem-compatible way of saying "++") that the code we're writing is C++ and not C.

If you prefer other naming conventions, that's fine for your own work, but you'll need to use .cpp and .hpp in this course, because the build and test tools for this course assume those extensions. (As with many details about tools and techniques, you don't always get to choose what you want when you work on someone else's project; that's worth getting used to.)


Deciding on the boundaries between source files

Now that we've talked about the mechanism for splitting up a program into multiple source files, there's one more thing we need to nail down: How do we decide on the boundaries between them? What are we trying to accomplish when we split up a program, and how fine-grained does that split need to be?

One of the hallmarks of well-written software is what is sometimes called separation of concerns, the principle that you should handle separate issues in separate parts of a program, rather than munging everything together into fewer, larger, more complex functions. (This is one of the reasons that global variables are often said to be so problematic; they munge things together by their nature, spreading a part of your program's knowledge throughout the entire program.)

There are a few ground rules you should follow when you're designing a C++ program — and, in truth, these aren't that different from what you ought to be doing in just about any programming language. Header and source files are the C++ mechanism for doing something you'd want to do in any programming language: Break a large program into its component parts.


The code

The official moniker for this code example is SeparateCompilation, so your best bet is to do this:

Alternatively, you can click the link to the tarball below: