ICS 46 Spring 2017
Notes and Examples: Amortized Analysis

Analyzing the behavior of std::vector

In previous coursework, you've likely used the std::vector class template in the C++ Standard Library. Vectors implement the abstract notion of a list — a sequence of elements — by storing their elements in an array. When there are n elements stored in a vector, they're required to be stored in the first n cells of the array, with any additional capacity in the array left available for elements added later. At any given time, you can ask a vector its size (i.e., how many elements it currently stores) and its capacity (i.e., how big the underlying array is).

Based on our previous understanding of arrays, we can do some quick asymptotic analysis on various std::vector operations, so we can understand their performance impact.

• Accessing the element at index i. Any particular element of an array can be accessed, given its index, in Θ(1) time. Since the elements of a vector are stored in an array, the same fact applies here. Generally, if you know where something is in an array or a vector, you can get to it in constant time.
• Getting the size or the capacity. Vectors store this in separate member variables, so getting their value is a simple matter of accessing them. These operations run in Θ(1) time, as well.
• The front member function, which returns the first element in the vector. The first element will always be in cell 0, so this also runs in Θ(1) time (i.e., if you know where something is, you can get there in constant time).
• The back member function, which returns the last element in the vector. The last element will always be in the cell with index size − 1, so this, too, runs in Θ(1) time.

As you consider these operations, it looks like vectors are an unvarnished win; everything looks like it runs in constant time! But let's consider one of the most important member functions: push_back (or its similar cousin, emplace_back). The analysis here is trickier, because there are two scenarios:

• size < capacity, in which case the array has at least one unused cell into which the new element can be stored. Furthermore, we'll know that the unused cell has index size, so we can get to it in Θ(1) time; we'll also need to increment size, which is also a constant amount of time. So far, so good!
• size = capacity, in which case the array is full. You may have noticed in previous usage that vectors automatically handle this case for you, but you may not know what it actually does, which is important if we want to understand how well this operation performs.
• First, a new array — its capacity a multiple of the original's — is created. (There is no explicit rule about what the new array's capacity will be, but you can think of it as being twice as big for the purposes of this discussion; in general, as long as the capacity increases multiplicatively, our analysis will ultimately hold true.)
• Next, the elements in the original array are copied, one by one, into the new array.
• Finally, the old array can be deleted, in which case the vector will now contain only the new array, which has available capacity; at this point, the new element can be added normally.
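The steps above can be sketched in code. This SimpleVector class is a hypothetical, stripped-down illustration (int elements only, no copy control, no exception safety), not the real std::vector implementation:

```cpp
#include <algorithm>
#include <cstddef>

// A simplified sketch of a vector of ints that doubles its capacity
// when push_back finds the array full.
class SimpleVector
{
public:
    SimpleVector() : items{new int[1]}, sz{0}, cap{1} { }
    ~SimpleVector() { delete[] items; }

    void push_back(int value)
    {
        if (sz == cap)
        {
            // 1. Allocate a new array whose capacity is a multiple
            //    (here, double) of the old one.
            int* newItems = new int[cap * 2];

            // 2. Copy the existing elements, one by one.
            std::copy(items, items + sz, newItems);

            // 3. Delete the old array and adopt the new one.
            delete[] items;
            items = newItems;
            cap *= 2;
        }

        // There's now guaranteed to be an unused cell at index sz.
        items[sz] = value;
        ++sz;
    }

    std::size_t size() const { return sz; }
    std::size_t capacity() const { return cap; }
    int operator[](std::size_t i) const { return items[i]; }

private:
    int* items;
    std::size_t sz;
    std::size_t cap;
};
```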
The important question, then, is how long this takes. Let's break it down into its component parts:
• The time it takes to allocate memory is not explicitly a function of the amount of memory you're allocating, since C++ generally doesn't initialize the contents of newly-allocated memory (e.g., with zeros). It's certainly true that larger allocations can be more difficult to do than smaller ones, but it's not an explicit function of the array's capacity, so we'll assume that this takes Θ(1) time.
• Copying the elements from the old array into the new array requires copying all of them. If the vector contains n elements, this takes Θ(n) time, since each has to be copied individually. We'll say that copying each element takes Θ(1) time, because the vector's size doesn't affect how long any single element takes to copy (though it's certainly true, in practice, that larger elements take longer to copy than smaller ones).
• Deletion, like allocation, is not a function of the amount of memory; in general, deletion can be thought of as a constant-time operation.
• Finally, when the new element is added to the newly-allocated capacity, that takes Θ(1) time, as we saw before.
Adding all of that together, we see that the total is Θ(n).

All of this leaves us with an interesting problem. What can we say, definitively, about the performance of push_back?

• Generally, the strongest statement we can make about push_back is that it's O(n), because it might require linear time (i.e., if the array is full), though it also might require substantially less (i.e., if there's available capacity).
• In the best case, we can say that push_back runs in Θ(1) time.
• In the worst case, we can say that push_back runs in Θ(n) time.

So, ultimately, our analysis is a bit of a mixed bag: optimistic in parts, while pessimistic in others. But, in reality, which is it? Missing from our analysis is a notion of how often the worst-case scenario happens, relative to the best-case scenario. Is it something we should be concerned about? If so, how much?

To answer these questions, we'll need a new kind of analysis we've not yet done, which is called amortized analysis.

Amortized analysis

In English, the word amortization is often used to describe spreading out large costs over time, so their impact in the short term is minimized. For example, many people buy homes without having the cash saved to pay for the full cost; they do this by borrowing the money and paying it back gradually, amortizing the cost of the home over a long period of time (sometimes as long as 30 years, or occasionally even longer). This isn't without its downsides, of course; as you pay the money back, you pay an additional interest fee to the lender, which compensates them for the effect of inflation — the money you pay them back with will be worth less than the money you borrowed — as well as the risk that you might not pay the loan back. The interest can be substantial; with prevailing interest rates as of this writing, a typical 30-year home loan would require you to pay a total of around 165% of the cost of the home, albeit spread over a 30-year period. Nonetheless, by spreading the cost over such a long period, it becomes possible for people of fairly modest means to purchase a home if they choose.

This notion of spreading large costs over a period of time can be applied to the analysis of algorithms and data structures, as well. It's certainly true, for example, that std::vector's push_back operation can sometimes take linear time to run; however, it's also true that it won't happen repeatedly, because the newly-allocated array will be twice the size of the original one. The larger the vector gets over time, the more it will cost to perform the occasional reallocation, but the less often the reallocation will be necessary going forward, because we obtain ever larger amounts of additional capacity each time. This fact leads to a surprising conclusion, though we need a new technique, amortized analysis, to see it.

Amortized running time

We say that the amortized running time of an operation is the worst-case running time of a sequence of operations divided by the number of operations in the sequence. At first blush, that sounds like an average, but there's a nuance here that's important to understand: We're only talking about sequences of operations that can actually happen. push_back won't run in Θ(n) time every time you call it, so we need not consider that circumstance; instead, we should focus our attention on sequences that actually arise in practice.

To perform an analysis of push_back, let's consider a sequence of calls to it. To start with, we'll assume that both the size and capacity are c, then we'll consider how long it will take for a sequence of subsequent push_back calls to run, with the goal of measuring the average cost of each call, so we can see how that average changes as the vector grows. As a proxy for the amount of time required, we'll count the number of times that we write a value into a cell of an array, which is mainly where the cost of push_back lies.

Calls  | Size   | Capacity | Total Writes | Avg. Writes
-------|--------|----------|--------------|------------
0      | c      | c        | 0            |
    We'll begin measuring with a vector whose array is full. No work has been done yet.
1      | c + 1  | 2c       | c + 1        |
    The first call to push_back requires a reallocation, since the array is full. That means we need to copy all c elements from the old array into the new one (c writes), then add the new element (one more write).
c      | 2c     | 2c       | 2c           | 2
    The next c − 1 calls to push_back will be cheap ones — each requiring a write into a single cell — since there is available capacity in the array. And note that, at this point, the average cost of the calls to push_back is 2 writes.
c + 1  | 2c + 1 | 4c       | 4c + 1       |
    The next call requires a reallocation. Note that this reallocation took about twice as long as the last one — copying 2c elements instead of c.
3c     | 4c     | 4c       | 6c           | 2
    Having reallocated, the next 2c − 1 calls are cheap ones that each require a single write. Note, importantly, that the average cost is back to 2 writes again.
3c + 1 | 4c + 1 | 8c       | 10c + 1      |
    Another expensive call requiring a reallocation, but that will buy us even more cheap calls afterward.
7c     | 8c     | 8c       | 14c          | 2
    What do you know? Once again, after using up all of the cheap calls we got after an expensive one, our average has dropped again to 2 writes.

This same pattern would continue going forward. In general, the average cost of each push_back operation in a long sequence of them is a constant!

So we would say that the amortized running time of push_back is Θ(1).

How amortized running time is different from worst-case

It's important to note that this analysis doesn't change the reality that push_back has a worst-case running time of Θ(n), because an expensive reallocation might be required. Knowing that it has an amortized running time of Θ(1) tells us that if we call it repeatedly, the cost will smooth out over time, so that a long sequence of calls will appear to have each been inexpensive. But there's a difference between the cost "smoothing out over time" and the cost being constant in the worst case, which makes a practical difference that we should be cognizant of.

For a lot of real-world uses, an amortized running time of Θ(1) is good enough, because all we care about is how long it takes to run a long task that involves lots of calls to push_back. On the other hand, there are realistic scenarios where keeping the worst-case running time low is vital. Here are a couple of examples:

• Robotics control software, with hard real-time requirements (i.e., we'd better be sure that a certain part of our code doesn't take longer than we expect, so we're sure that motors are turned on or off precisely when they should be).
• A video game with 3D graphics, where we might want to ensure that 60 frames are drawn every second, no matter what.

Understanding the actual requirements — do you need every operation to be fast, or only for the average of all of them to be fast? — can help you to determine whether a good amortized running time is enough for your uses.

Why it's important that the resizing is done multiplicatively

We saw above that a typical implementation of std::vector will resize the array to a multiple of its capacity whenever push_back is called when the array is full. This isn't a cavalier choice; it's actually essential, because it's what makes the analysis above work out the way it does. The spirit of it is that the resizings cost more as the array gets larger, but each one buys you proportionally more cheap calls to push_back afterward.

What if we, say, added 10 cells to the array's capacity every time instead? Would we get that same benefit? Let's do the same analysis and see what happens.

Calls | Size   | Capacity | Total Writes | Avg. Writes
------|--------|----------|--------------|------------
0     | c      | c        | 0            |
    We'll begin measuring with a vector whose array is full. No work has been done yet.
1     | c + 1  | c + 10   | c + 1        |
    The first call to push_back requires a reallocation, since the array is full. That means we need to copy all c elements from the old array into the new one (c writes), then add the new element (one more write).
10    | c + 10 | c + 10   | c + 10       | c/10 + 1
    The next nine calls to push_back will be cheap ones — each requiring a write into a single cell — since there is available capacity in the array. Note, too, that the average number of writes per call at this point is linear with respect to c; this is our first bit of bad news.
11    | c + 11 | c + 20   | 2c + 21      |
    The next call requires a reallocation. Note that this reallocation took longer than the last one, because we had to copy c + 10 elements (and then add the new one) this time.
20    | c + 20 | c + 20   | 2c + 30      | c/10 + 3/2
    Having reallocated, the next nine calls are cheap ones that each require a single write. Note, importantly, that the average cost is back to being linear again. This is not an accident.
21    | c + 21 | c + 30   | 3c + 51      |
    Another expensive call requiring reallocation, but we're only going to get nine cheap calls to push_back in return for the price we paid. Every time we do this, we're paying more and getting back the same return.
30    | c + 30 | c + 30   | 3c + 60      | c/10 + 2
    And so we see, on the average, that we're still spending linear time per call to push_back.

So we would say that the amortized running time of this version of push_back — one that adds a constant number of cells to the array's capacity every time it reallocates, rather than multiplying it — is Θ(n).

While the C++ Standard is actually silent on the precise implementation details of std::vector, it does call out the amortized constant running time as a requirement, which means that you can be fairly certain that push_back will multiply the capacity of the array in practice whenever it fills.