ICS 46 Spring 2022
Notes and Examples: Stacks, Queues, and Deques


Identifying the right data structure

In previous coursework, you've no doubt used at least one programming language — C++ or something else — enough that you've been exposed to the data structures that are provided by the language or its standard library. For example, C++ provides a number of data structures: arrays, std::vector, std::list, and std::map, to name a few. Python, meanwhile, provides choices like lists, sets, tuples, and dictionaries. None of these is definitively better than any of the others; the objective is to choose the data structure that best matches the "shape" of the problem you're trying to solve. If you want to store a sequence of values in Python, a list is a good choice. If you want to keep track of information about students, each indexed uniquely by an ID, in C++, you might choose a std::map, with the keys being the IDs and the values being objects of a Student class.

Generally, then, when we learn about a new data structure we've not seen before, there are a few things we want to understand about it:
- What, conceptually, is the data structure, and what is the "shape" of the problems it's meant to solve?
- What operations does it provide, and what do they mean?
- How can it be implemented, in terms of data structures and techniques we already know?
- How well will each of its operations perform, given each possible implementation?

If we can get a handle on all of those issues, we'll have a good understanding of when our new data structure might solve a real problem we have, as well as how we should build it (if our preferred programming language doesn't provide one), and how well it will perform in the ways that we plan to use it.

To get our minds around this kind of analysis, let's consider a few of the classic linear data structures, which let you store and manipulate sequences of elements, with the understanding that certain ones, at any given time, are more important than the others.


Stacks

A stack is a data structure that stores a sequence of elements, with the key property being that the most important one, at any given time, is the one that was stored the most recently. If we were to draw a diagram of a stack, which focused on the key concepts, it might look something like this:

Stack (Conceptual View)

The diagram shows the most important conceptual ideas underlying stacks:
- A stack is a last-in, first-out (LIFO) structure: the element stored most recently is the first one to be removed.
- The push operation adds an element to the top of the stack.
- The pop operation removes the element currently at the top of the stack.
- The top operation looks at the element currently at the top of the stack, without removing it.

In a practical implementation, you might find other operations, like one that tells you how many elements are in the stack, or one that tells you whether the stack is empty, but push, pop, and top are the most important ones, so we'll focus on those.
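In C++, for example, the standard library's std::stack provides exactly these three operations (along with size and empty); here's a small sketch of how they behave:

```cpp
#include <stack>

// Demonstrates the three core stack operations using std::stack.
int stackDemo()
{
    std::stack<int> s;
    s.push(1);           // 1 is now on top
    s.push(2);           // 2 is now on top
    int t = s.top();     // t == 2; top() peeks without removing
    s.pop();             // removes 2, so 1 is on top again
    return t + s.top();  // 2 + 1 == 3
}
```

Note the design choice here: top() peeks without removing, while pop() removes without returning the element, so getting-and-removing the top element takes two calls.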

If we have a stack-shaped problem and we need to implement one, we have a sequence of elements to store; our first order of business is choosing an underlying data structure in which to store them. There are two obvious choices: an array (or an array-based structure, like a std::vector) and a linked list.

Implementing a stack using an array or std::vector

Let's first consider how you might implement a stack using an array-based data structure, such as an array or a std::vector. It seems sensible that we'd want to store the elements in the order in which they appear in the stack, though an open question is whether they should be stored in top-to-bottom or bottom-to-top order. Let's think about each option.

If we store the elements in top-to-bottom order, that means the top of the stack would always be stored in cell 0 of the array, the element just beneath the top would be stored in cell 1, and so on. We'd presumably also need to keep track of how many elements were stored (necessary if we had an array, done automatically by std::vector if we use it). This approach has the upside that we always know where the top element is — it'll always be in cell 0, no matter what. To understand whether this is a good approach, though, we should do some analysis; we'll analyze the three common operations, assuming there are n elements stored in the stack already.
- push would first have to slide all n elements one cell toward the back, to make room in cell 0 for the new element, so it would take Θ(n) time.
- pop would have to slide the remaining n - 1 elements one cell toward the front after removing the element in cell 0, so it, too, would take Θ(n) time.
- top would simply look at cell 0, taking Θ(1) time.

The fundamental problem is the requirement that the topmost element is always in cell 0; the "business end" of the stack being in a fixed position requires us to slide elements back and forth. But if we flip the elements around, instead storing the bottommost element in cell 0, we could always easily infer the position of the top element by looking at the stack's size (which we'd be tracking separately). Is this better? Let's take a look, again assuming there are n elements stored in the stack.
- push stores the new element just beyond the current topmost one, in cell n, taking Θ(1) time (amortized, if we're using a std::vector that occasionally has to reallocate).
- pop removes the element in cell n - 1, with no sliding required, taking Θ(1) time.
- top looks at the element in cell n - 1, taking Θ(1) time.

So, the simple act of reversing the order of the elements made a huge difference! This is why it's so important for us to do a little bit of analysis, because we quickly discover that reasonable-sounding ideas ("If the top of the stack was in cell 0, we'd always be able to find it!") sometimes lead to sub-par solutions.
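To make the better arrangement concrete, here is a minimal sketch of a bottom-in-cell-0 stack built on a std::vector; the class and its names are illustrative, not a required design:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// A stack sketch that stores the bottom element in cell 0, so the
// top is always at index size() - 1 and no sliding is ever needed.
template <typename T>
class ArrayStack
{
public:
    void push(const T& value)
    {
        elements.push_back(value);   // amortized Θ(1): appended at the end
    }

    void pop()
    {
        if (elements.empty())
            throw std::logic_error{"pop on empty stack"};
        elements.pop_back();         // Θ(1): removed from the end
    }

    const T& top() const
    {
        if (elements.empty())
            throw std::logic_error{"top on empty stack"};
        return elements.back();      // Θ(1): the last cell is the top
    }

    std::size_t size() const
    {
        return elements.size();
    }

private:
    std::vector<T> elements;
};
```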

Implementing a stack using a linked list

If we instead opt to implement a stack using a linked list, our first order of business is selecting what kind of linked list we should use. We saw previously that there are variants on the idea of a linked list. The key question is this: What is the simplest kind of linked list that still yields the best possible performance for the common operations?

So, let's start with the simplest kind of linked list we know: a singly-linked list with head. What we've seen is that certain operations, notably the ones that work near the front of the list, are performant; others less so. So if we could make sure that we were only ever modifying the front of the list, we'd be in good shape.

That turns out to be an easy problem to solve, by arranging the elements so that the top of the stack is in the first node of the list, with the others stored sequentially afterward. If there are n elements in our stack, we'd see the following performance characteristics.
- push adds a new node at the front of the list, taking Θ(1) time.
- pop removes the first node, taking Θ(1) time.
- top looks at the value in the first node, taking Θ(1) time.

The moral of the story is that the simplest kind of linked list would be sufficient to solve this problem reasonably well. That isn't to say that we couldn't use a more complex one — if, for example, we already had one implemented and tested — but knowing the minimum requirement tells us what we would have to build, at minimum, if we had to build it ourselves, or if we could select from many different prebuilt implementations.
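As a sketch, here's what a stack built on a singly-linked list with head might look like, with the top of the stack stored in the first node. Names are illustrative, and copying is disabled to keep the sketch short:

```cpp
#include <stdexcept>

// A stack sketch built on a singly-linked list with a head pointer;
// the top of the stack always lives in the first node.
template <typename T>
class LinkedStack
{
public:
    LinkedStack() : head{nullptr} { }

    ~LinkedStack()
    {
        while (head != nullptr)
        {
            Node* next = head->next;
            delete head;
            head = next;
        }
    }

    // Copying is disabled to keep this sketch short.
    LinkedStack(const LinkedStack&) = delete;
    LinkedStack& operator=(const LinkedStack&) = delete;

    void push(const T& value)
    {
        head = new Node{value, head};   // Θ(1): link in a new first node
    }

    void pop()
    {
        if (head == nullptr)
            throw std::logic_error{"pop on empty stack"};
        Node* old = head;
        head = head->next;              // Θ(1): unlink the first node
        delete old;
    }

    const T& top() const
    {
        if (head == nullptr)
            throw std::logic_error{"top on empty stack"};
        return head->value;             // Θ(1): the first node is the top
    }

private:
    struct Node
    {
        T value;
        Node* next;
    };

    Node* head;
};
```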


Queues

A queue is a data structure that stores a sequence of elements, with the key property that the most important one, at any given time, is the one that was stored the least recently. Elements "wait in line" in the order they arrive. If we were to draw a diagram of a queue, which focused on the key concepts, it might look something like this:

Queue (Conceptual View)

The diagram shows the most important conceptual ideas underlying queues:
- A queue is a first-in, first-out (FIFO) structure: the element stored least recently is the first one to be removed.
- The enqueue operation adds an element to the back of the queue.
- The dequeue operation removes the element currently at the front of the queue.
- The front operation looks at the element currently at the front of the queue, without removing it.

As with stacks, a practical queue implementation might include other operations, such as the ability to determine how many elements are stored, but enqueue, dequeue, and front are the most important ones, so we'll focus our attention on those.

If we've got a queue-shaped problem and need to implement one ourselves, we'll need to map the concept to an implementation technique; importantly, we'll need to choose an underlying data structure in which to store the elements, and we'll need to arrange the elements in that data structure to provide performant operations.

Implementing a queue using a linked list

As we did when we were considering how to implement a stack, we should decide which is the simplest kind of linked list that will lead to a favorable outcome; ideally, we could find a way to implement all three of the common operations so they could run in Θ(1) time.

Your intuition should kick in quickly when you consider whether a singly-linked list with head would provide a good solution to this problem. Queues have two "business ends"; we enqueue at the back and dequeue from the front. So, without thinking very hard about it, we can eliminate a singly-linked list with head from consideration, because it doesn't provide constant-time access to both ends of the list, which means that either enqueues or dequeues will be more expensive than they ought to be.

A singly-linked list with head and tail, on the other hand, at least seems like it has a chance at providing Θ(1) enqueue, dequeue, and front operations, because it at least gives us constant-time access to both ends of the list. But we have to think through what we can and can't do in Θ(1) time, given a singly-linked list with head and tail:
- We can add to the front of the list in Θ(1) time.
- We can remove from the front of the list in Θ(1) time.
- We can add to the end of the list in Θ(1) time, because the tail pointer tells us where the last node is.

Notably absent is the ability to remove from the end of the list, which requires Θ(n) time in a list with n nodes. So we'd best avoid that operation. Nonetheless, if we choose our organization of elements in nodes carefully, we can still achieve the outcome we want. If we store the front element of the queue in the first node, the back element of the queue in the last node, and the remaining nodes (in the proper order) in between, then we're in business.
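A sketch of that arrangement, with the front of the queue in the first node, dequeues happening at the head, and enqueues happening at the tail (names are illustrative, and copying is disabled for brevity):

```cpp
#include <stdexcept>

// A queue sketch on a singly-linked list with head and tail pointers:
// the front of the queue is the first node and the back is the last,
// so we enqueue at the tail and dequeue at the head, both in Θ(1) time.
template <typename T>
class LinkedQueue
{
public:
    LinkedQueue() : head{nullptr}, tail{nullptr} { }

    ~LinkedQueue()
    {
        while (head != nullptr)
        {
            Node* next = head->next;
            delete head;
            head = next;
        }
    }

    LinkedQueue(const LinkedQueue&) = delete;
    LinkedQueue& operator=(const LinkedQueue&) = delete;

    void enqueue(const T& value)
    {
        Node* node = new Node{value, nullptr};
        if (tail == nullptr)
            head = tail = node;    // the queue was empty
        else
        {
            tail->next = node;     // Θ(1): the tail pointer finds the end
            tail = node;
        }
    }

    void dequeue()
    {
        if (head == nullptr)
            throw std::logic_error{"dequeue on empty queue"};
        Node* old = head;
        head = head->next;         // Θ(1): unlink the first node
        if (head == nullptr)
            tail = nullptr;        // the queue became empty
        delete old;
    }

    const T& front() const
    {
        if (head == nullptr)
            throw std::logic_error{"front on empty queue"};
        return head->value;        // Θ(1): the first node is the front
    }

private:
    struct Node
    {
        T value;
        Node* next;
    };

    Node* head;
    Node* tail;
};
```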

Implementing a queue using an array or std::vector

We should consider, too, how you might use an array-based data structure, like an array or a std::vector, to implement a queue. Perhaps surprisingly, this is more complicated than it sounds, so we'll need to exercise some caution and think through some of the details carefully.

First of all, we can eliminate std::vector from consideration almost immediately. Used the way it's intended to be used, a std::vector stores its s elements in an underlying array of capacity c, where c ≥ s. If you ask it for its size, you'll get back s; if you ask for its capacity, you'll get back c. And the s elements will always be stored in the first s cells of the underlying array, the first one in cell 0, the second in cell 1, and so on.

So why does this eliminate std::vector as a good choice for solving this problem? Consider how you might solve it.
- You could store the front element of the queue in cell 0, with the others following in order, so that you enqueue at the end and dequeue from the front.
- You could store the back element of the queue in cell 0, with the others following in order, so that you enqueue at the front and dequeue from the end.

Regardless of which of these you choose, you'll either have expensive enqueues or expensive dequeues:
- If the front of the queue is in cell 0, enqueues take Θ(1) time (amortized), but every dequeue must slide all of the remaining elements one cell toward the front, taking Θ(s) time.
- If the back of the queue is in cell 0, dequeues take Θ(1) time, but every enqueue must slide all s elements one cell toward the back, taking Θ(s) time.

While there are ways we could "abuse" std::vector to solve our problem, if we use it the way it's intended, the constraint that the s elements must be stored in the first s cells of the underlying array leads us to an impasse; there's simply no good solution to the problem using this tool.
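To make the impasse concrete, here's what the two options would look like as std::vector operations; in each case, the expensive operation shifts every remaining element (function names are illustrative):

```cpp
#include <vector>

// Option 1: front of the queue in cell 0.
// Enqueues (push_back) are cheap, but every dequeue shifts everything.
void dequeueFrontInCell0(std::vector<int>& q)
{
    q.erase(q.begin());          // Θ(s): cells 1 .. s-1 all slide left
}

// Option 2: back of the queue in cell 0.
// Dequeues (pop_back) are cheap, but every enqueue shifts everything.
void enqueueBackInCell0(std::vector<int>& q, int value)
{
    q.insert(q.begin(), value);  // Θ(s): all s elements slide right
}
```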

If, instead, we consider arrays more generally, a somewhat different solution emerges. What if, instead of forcing the front element to be stored in cell 0, we used an array, allowing us to relax that requirement? In that case, we could keep track of the index of the front element of the queue, as well as the index where an element would be added to the back of the queue, and let those "float" over time as elements are enqueued and dequeued. For example, the array-based queue below contains the three elements x, y, and z:

0  1  2  3  4  5  6
x  y  z
↑        ↑
f        b

In the diagram, f depicts the index of the front element and b depicts the index where a new element would be added to the back; you'd store these values as unsigned integers. Given this scenario, the portion of the array that contains elements actually in the queue begins with the element where f refers and ends with the element preceding where b refers.

Enqueuing an element is simple, then: Store the element in the cell where b refers, then add 1 to b. Dequeuing is similar: Clear the value in the cell where f refers, then add 1 to f. Since arrays provide constant-time access to a cell, given its index, these will be Θ(1) time operations. So far so good!

However, there are some wrinkles worth considering. Given this implementation, both f and b will move forward over time. So, eventually, you could end up in a situation like this one:

0  1  2  3  4  5  6
      w  v  t  s
      ↑           ↑
      f           b

Now suppose you want to enqueue another element. What do we do with b? If we add 1 to it, it'll be pointing beyond the boundaries of the array, so the next enqueue will fail. So, instead, we "wrap" it back around to the beginning of the array:

0  1  2  3  4  5  6
      w  v  t  s  r
↑     ↑
b     f

And we slightly redefine our idea of which elements are actually part of the queue, so that if b refers to a cell with a smaller index than f, everything from f through the end of the array and anything in a cell preceding b will be considered part of the queue. Just as b can wrap around when it reaches the end and we enqueue an element, f can wrap around when it reaches the end and we dequeue one. Over time, f and b chase each other around the array forever, and it becomes possible to enqueue and dequeue elements indefinitely, as long as we never dequeue from an empty queue or enqueue into a full one.

There's one last wrinkle. How do we detect that the array is full? Let's consider what it looks like just before it fills.

0  1  2  3  4  5  6
q     w  v  t  s  r
   ↑  ↑
   b  f

So what would happen if we enqueued one more element? We'd now be in the following situation.

0  1  2  3  4  5  6
q  p  w  v  t  s  r
      ↑
     f, b

This suggests that we can detect "fullness" when f = b. But what about emptiness? Just before the queue is empty, it would look like this:

0  1  2  3  4  5  6
q
↑  ↑
f  b

If we dequeued an element, we would end up here.

0  1  2  3  4  5  6

   ↑
  f, b

And, suddenly, it becomes evident that f = b might also mean that the queue is empty. So how do we tell the difference? The answer lies in one simple tweak: We also keep track of the size of the queue at all times (i.e., the number of elements we're currently storing), incrementing and decrementing it as we enqueue and dequeue elements. This not only gives us a simple way to find out how many elements we're storing, but also forms the basis of how we detect fullness and emptiness: we simply compare the size to the capacity of the array (which, in C++, we would also need to track separately) or to zero.
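Putting the pieces together, here is a minimal sketch of the whole scheme: f, b, and a size, with both indices wrapping via modular arithmetic. The names and the fixed-capacity design are illustrative:

```cpp
#include <cstddef>
#include <stdexcept>

// A circular-array queue sketch with a fixed capacity: f is the index of
// the front element, b is the index where the next element goes, and a
// separately tracked size distinguishes "full" from "empty" when f == b.
template <typename T>
class CircularQueue
{
public:
    explicit CircularQueue(std::size_t capacity)
        : elements{new T[capacity]}, cap{capacity}, f{0}, b{0}, sz{0}
    {
    }

    ~CircularQueue() { delete[] elements; }

    CircularQueue(const CircularQueue&) = delete;
    CircularQueue& operator=(const CircularQueue&) = delete;

    void enqueue(const T& value)
    {
        if (sz == cap)
            throw std::logic_error{"enqueue on full queue"};
        elements[b] = value;
        b = (b + 1) % cap;   // wrap b around when it reaches the end
        ++sz;
    }

    void dequeue()
    {
        if (sz == 0)
            throw std::logic_error{"dequeue on empty queue"};
        f = (f + 1) % cap;   // wrap f around, too
        --sz;
    }

    const T& front() const
    {
        if (sz == 0)
            throw std::logic_error{"front on empty queue"};
        return elements[f];
    }

    std::size_t size() const { return sz; }
    bool isEmpty() const { return sz == 0; }
    bool isFull() const { return sz == cap; }

private:
    T* elements;
    std::size_t cap;
    std::size_t f;   // index of the front element
    std::size_t b;   // index where the next enqueue goes
    std::size_t sz;  // number of elements currently stored
};
```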

This solution, generally, is called the circular array implementation of a queue, and is common in both software and hardware, particularly when there is a fixed amount of memory available.


Deques (Double-Ended Queues)

I expect there's a good chance you've seen stacks and queues before, but there is another, slightly lesser-known "classic" linear data structure that you may never have seen, which is called a deque, an oddly-spelled name that is intended to be shorthand for double-ended queue. The name "double-ended queue" is a pretty accurate description of what it is; it acts like a queue in which you can treat either end as the front and either end as the back, leading to the following common operations.
- An operation to add an element at the front, and one to add an element at the back.
- An operation to remove the element at the front, and one to remove the element at the back.
- An operation to look at the element at the front, and one to look at the element at the back, in both cases without removing it.

While deques don't themselves fit as many "real-world" problems as stacks and queues, one very useful application of deques is as an underlying data structure on top of which others (notably, stacks and queues) can be built. For example, the C++ Standard Library contains a std::deque class template for exactly this reason.

Implementing a deque using a linked list

Because a deque requires us to be able to add and remove elements at either end of the list, we'll need a doubly-linked list with head and tail to do so efficiently, because that's the only linked list variant we've seen that allows us to remove from either end in Θ(1) time.

Implementing a deque using an array or std::vector

Like queues, deques can't be effectively implemented using std::vector as it was intended to be used, because one end of the deque or the other would be required to be in cell 0 at all times.

However, we could do something very similar to the circular array implementation of a queue. The only difference is that both f and b would be able to move in either direction, because both adding and removing elements are possible at both ends of a deque. Otherwise, the implementation would be very similar to the one we saw previously.
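A sketch of the index arithmetic involved, using the same kind of fields as the circular-array queue (names are illustrative). The key detail is that moving an unsigned index backward is done by adding cap - 1 before taking the remainder, so the arithmetic never wraps below zero:

```cpp
#include <cstddef>
#include <stdexcept>

// A circular-array deque sketch: f and b can each move in either
// direction, wrapping around at both ends of the array.
template <typename T>
class CircularDeque
{
public:
    explicit CircularDeque(std::size_t capacity)
        : elements{new T[capacity]}, cap{capacity}, f{0}, b{0}, sz{0}
    {
    }

    ~CircularDeque() { delete[] elements; }

    CircularDeque(const CircularDeque&) = delete;
    CircularDeque& operator=(const CircularDeque&) = delete;

    void addToBack(const T& value)
    {
        requireNotFull();
        elements[b] = value;
        b = (b + 1) % cap;        // forward, exactly as in the queue
        ++sz;
    }

    void addToFront(const T& value)
    {
        requireNotFull();
        f = (f + cap - 1) % cap;  // move f backward, wrapping if needed
        elements[f] = value;
        ++sz;
    }

    void removeFromFront()
    {
        requireNotEmpty();
        f = (f + 1) % cap;
        --sz;
    }

    void removeFromBack()
    {
        requireNotEmpty();
        b = (b + cap - 1) % cap;  // move b backward, wrapping if needed
        --sz;
    }

    const T& front() const { requireNotEmpty(); return elements[f]; }
    const T& back() const { requireNotEmpty(); return elements[(b + cap - 1) % cap]; }
    std::size_t size() const { return sz; }

private:
    void requireNotFull() const
    {
        if (sz == cap) throw std::logic_error{"deque is full"};
    }

    void requireNotEmpty() const
    {
        if (sz == 0) throw std::logic_error{"deque is empty"};
    }

    T* elements;
    std::size_t cap, f, b, sz;
};
```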