ICS 46 Spring 2022
Exercise Set 7

Due date and time: Wednesday, May 25, 11:59pm


Getting started

First of all, be sure that you've read the Reinforcement Exercises page, which explains everything you'll need to know, generally, about how the reinforcement exercises will work this quarter. Make sure you read that page before you continue with this one.

Before you begin work on these reinforcement exercises, there's a chore that you'll need to complete on your ICS 46 VM to get it set up to proceed.

Refreshing your ICS 46 VM environment

Even if you previously downloaded your ICS 46 VM, you will probably need to refresh its environment before proceeding with these exercises. Log into your VM and issue the command ics46 version to see what version of the ICS 46 environment you currently have stored on your VM. Note, in particular, the timestamp; if you see a version with a timestamp older than the one listed below, you'll need to refresh your environment by running the command ics46 refresh to download the latest one before you proceed with these exercises.

2022-05-19 07:22:17
Exercise Set 7 template added

If you're unable to get outgoing network access to work on the ICS 46 VM — something that afflicts a handful of students each quarter — then the ics46 refresh command won't work, but an alternative approach is to download the latest environment from the link below, then to upload the file on to your ICS 46 VM using SCP. (See the Project #0 write-up for more details on using SCP.) Once the file is on your VM, you can run the command ics46 refresh_local NAME_OF_ENVIRONMENT_FILE, replacing NAME_OF_ENVIRONMENT_FILE with the name of the file you uploaded; note that you'd need to be in the same directory where the file is when you run the command.

Creating your project directory on your ICS 46 VM

A project template has been created specifically for this set of exercises, containing a similar structure to the basic template you saw in Project #0. Among other things, it contains a version of the gather script that's different from the ones in the projects, so you'll absolutely need to use the set7 template for these exercises, as opposed to the basic one.

Decide on a name for your project directory, then issue the command ics46 start YOUR_CHOSEN_PROJECT_NAME set7 to create your new project directory using the set7 template. (For example, if you wanted to call your project directory set7, you would issue the command ics46 start set7 set7.)


Problem 1 (2 points)

When we learned about the Treesort algorithm, we saw that there was incompatibility between binary search trees and the keys we might like to sort: The keys might not be unique, but binary search trees operate under the assumption that keys are always unique (or, stated differently, if the same key is used more than once, it's assumed to be the same key).

Propose how to solve this problem with Treesort; briefly explain what you would do to ensure that you obtained the correct result, even though binary search trees don't allow keys to be duplicated. Rememeber that we're not just sorting numeric keys, per se; we're sorting any one kind of object based on some characteristic of them, such as sorting strings by their length, employees by their years of education, or whatever, in which case our output would be the strings, the employees, and so on, not the characteristics by which we're sorting them.

What to submit

Add one PDF file to your core directory with this name: problem1.pdf, which contains your answer to this question.


Problem 2 (2 points)

We've seen that it's possible to use data structures to help us to design sorting algorithms. Let's consider a couple of possibilities that we didn't cover in class and decide whether they might be viable options.

  1. The algorithm SkipListSort is a sorting algorithm that begins by inserting a sequence of keys into a skip list.
    1. Once you had done that, what would the remaining steps in the algorithm be? (You don't need to implement them; just briefly describe in English how the algorithm would finish.)
    2. Give an asymptotic notation that best describes how long it would take to sort n keys this way (i.e., the entire SkipListSort algorithm) and briefly explain (in a sentence or two) why.
  2. The algorithm HashSort is a sorting algorithm that begins by inserting a sequence of keys into a hash table.
    1. Once you had done that, what would the remaining steps in the algorithm be? (You don't need to implement them; just briefly describe in English how the algorithm would finish.)
    2. Give an asymptotic notation that best describes how long it would take to sort n keys this way (i.e., the entire HashSort algorithm) and briefly explain (in a sentence or two) why.

What to submit

Add one PDF file to your core directory with this name: problem2.pdf, which contains your answers to the questions about these two algorithms.


Problem 3 (2 points)

When we learned about general trees, one important algorithm we learned about were tree traversals, of which we talked about two broad categories: depth-first and breadth-first. Meanwhile, this week, we talked about a general category of algorithm called a divide and conquer algorithm. Let's consider whether these two ideas fit together and, if so, how.

Suppose, then, that we had a general tree and that each node stores one integer value. Suppose, too, that we wanted to use tree traversal algorithms to find the maximum of all of the values stored in the nodes.

  1. If we used a depth-first traversal to find the maximum value, would this be considered a divide and conquer algorithm? Briefly explain why or why not.
  2. If we used a breadth-first traversal to find the maximum value, would this be considered a divide and conquer algorithm? Briefly explain why or why not.

What to submit

Add one PDF file to your core directory with this name: problem3.pdf, which contains your answer to these two questions.


Problem 4 (2 points)

One factor in algorithm performance that we've not had a chance to speak much about this quarter, but one that is certainly relevant in practice, is parallelism, which refers to the ability to speed up an algorithm by doing more than one thing at once. Nowadays, with most personal laptop or desktop computers having multicore processors, and with many workloads running on cloud infrastructure (i.e., potentially large numbers of separate machines connected via networks), the ability to run an algorithm in parallel is more important than ever.

Of course, not all problems can be parallelized to the same degree. Broadly, problems lie on a spectrum between two extremes. (In truth, most problems lie somewhere between these extremes, but the best way to understand what's possible is to know what the extremes are.)

Now consider all of the sorting algorithms we learned about in our conversation about Comparison-Based Sorting. For which of the algorithms would you expect a multicore processor to be able to solve the problem significantly faster than a single-core processor would be able to solve it? For each of the ones that would benefit, briefly explain, in a sentence or two, why multiple cores would be beneficial. (There's no need to do any heavy-duty analysis here; we just want to see if you've got a sense of which might benefit from parallelism and why.)

What to submit

Add one PDF file to your core directory with this name: problem4.pdf, which contains your answer to this question.


Problem 5 (2 points)

Background

The sorting algorithms we saw this week were all Comparison-Based Sorting algorithms, which means that they do their work by comparing pairs of elements and taking action on the basis of those comparisons. Many of these algorithms reposition the elements primarily by swapping them with each other.

Now suppose you had the following C++ implementation of insertion sort, which is similar to what we saw in the notes, except for two changes:

template <typename Element, typename Comparer>
void insertionSort(Element* begin, Element* end, Comparer shouldBeBefore)
{
    for (Element* i = begin + 1; i != end; ++i)
    {
        for (Element* j = i; j != begin && shouldBeBefore(*j, *(j - 1)); --j)
        {
            std::swap(*j, *(j - 1));
        }
    }
}

The changes have made insertionSort a much more broadly-useful implementation.

// Sorting integers in descending order
int a[8] = { 3, 7, 6, 1, 2, 8, 5, 4 };
insertionSort(a, a + 8, [](int i, int j) { return i > j; });

// Sorting strings in ascending order of their length
std::string s[5] = { "Boo", "is", "good", "and", "perfect" };
insertionSort(s, s + 5, [](const std::string& i, const std::string& j) { return i.size() < j.size(); });

The question

It turns out that this function template would have been legal both before and after C++11 was released. As we saw, though, C++11 added Move Semantics to the language. In a couple of sentences, briefly explain the impact, if any, that move semantics has had on the run-time performance of this insertionSort function template, relative to the performance that we would have seen before C++11.

What to submit

Add one PDF file to your core directory with this name: problem5.pdf, which contains your answer to this question.


Deliverables

In Canvas, you'll find a separate submission area for each problem. Submit your solution to each problem into the appropriate submission area. Be sure that you're submitting the correct file into the correct area (i.e., submitting your Problem 1 solution to the area for Problem 1, and so on). Under no circumstances will we offer credit for files submited in the incorrect area.

Submit each file as-is, without putting it into a .tar.gz file or arranging it in any other way. (There is a gather script in the set7 template, but there's no need to use it.) If we asked for a PDF, for example, all we want is a PDF; no more, no less. If you submit something other than what we asked for, we will not be offering you any credit on the submission. There are no exceptions to this rule.

Of course, you should also be aware that you're responsible for submitting precisely the version of your work that you want graded. We won't regrade an exercise simply because you submitted the wrong version accidentally, and we won't be able to offer any credit on exercises that you forgot to submit.

Can I submit after the deadline?

Unlike some of the projects in this course, the reinforcement exercises cannot be submitted after the deadline; there is no late policy for these. Each is worth only 3% of your grade, so it's not a disaster if you miss one of them along the way.

Can I work on this outside of the VM?

Yes, but be aware that you must submit the files in the appropriate format, and that any C++ code you submit must compile and run on the ICS 46 VM.